libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Error when installing ffcv using the guidance command. #272

Open HaoKang-Timmy opened 1 year ago

HaoKang-Timmy commented 1 year ago

Hi, I am trying to install ffcv on my server. Here are the server settings.

Platform: amd-linux
GPU: 1 x RTX 2080 Ti

When I create and set up the environment with the default command from the installation instructions, I cannot move data to the GPU. Here is the error:

Torch not compiled with CUDA enabled

I have installed the GPU driver on my server. Could you please tell me why this happens?

Morales97 commented 1 year ago

Same issue here: torch.cuda.is_available() returns False in the conda env where I installed ffcv, while I can use the GPU normally in any other environment.

My conda env was created following the instructions:

conda create -y -n ffcv python=3.9 cupy pkg-config compilers libjpeg-turbo opencv pytorch torchvision cudatoolkit=11.3 numba -c pytorch -c conda-forge

Then I installed ffcv with pip install ffcv.

andrewilyas commented 1 year ago

Hi! What versions of CUDA do you have installed? And what does torch.__version__ return?
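For anyone answering these questions, a minimal sketch of a check that could be run inside the ffcv conda env is shown here. The helper name describe_torch_install is hypothetical and not part of ffcv or PyTorch; it only reads torch.__version__, torch.version.cuda, and torch.cuda.is_available(). A value of None for torch.version.cuda indicates a CPU-only PyTorch build, which is consistent with the "Torch not compiled with CUDA enabled" error.

```python
import importlib


def describe_torch_install():
    """Collect the facts needed to debug 'Torch not compiled with CUDA enabled'.

    Hypothetical helper for illustration; not part of ffcv or PyTorch.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment at all
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None means a CPU-only build
        "built_with_cuda": torch.version.cuda,
        # True only if the build has CUDA support and a usable GPU is visible
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None, reinstalling drivers will not help; the environment needs a CUDA-enabled PyTorch build.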

arnaghosh commented 1 year ago

Hi @andrewilyas,

I was having the same problem. torch.__version__ returns 1.13.1. Would you recommend installing torch 1.10 instead (this was a previous configuration for which ffcv worked for me)?

Thanks in advance!

dngfra commented 1 year ago

I had the same issue when creating the conda env using the instructions. This env seems to work for me:

conda create -y -n ffcv python=3.9 cupy pkg-config compilers libjpeg-turbo opencv pytorch torchvision torchaudio pytorch-cuda=11.7 numba -c pytorch -c conda-forge -c nvidia

arnaghosh commented 1 year ago

Thanks for the update @dngfra, it seems that the issue was a mismatch between pytorch and cuda versions.

lucasresck commented 11 months ago

The complete command (as of August 2023) is

conda create -n ffcv python=3.9 cupy pkg-config libjpeg-turbo opencv pytorch torchvision cudatoolkit=11.6 numba -c conda-forge -c pytorch && conda activate ffcv && conda update ffmpeg && pip install ffcv

When running the part

conda update ffmpeg

I see:

The following packages will be DOWNGRADED:

  python_abi                                     3.9-3_cp39 --> 3.9-2_cp39 
  pytorch                     2.0.0-cuda112py39ha9981d0_200 --> 2.0.0-cpu_generic_py39h000fad7_1 
  torchvision                  0.15.2-cuda112py39h22a746e_1 --> 0.15.2-cpu_py39hcf778cf_1 

Notice that pytorch and torchvision are being replaced by their CPU-only builds.

In my case, not updating ffmpeg solved the issue.
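As a rough sketch of how to spot this kind of swap before confirming a conda transaction, the planned changes can be scanned for packages moving from a CUDA build to a CPU build. The function name find_cpu_swaps is hypothetical, and the substring check on "cuda" and "cpu" is an assumption based only on the build strings in the output above, not a documented conda convention.

```python
import re

# Matches lines of the form "package  old_build --> new_build"
CHANGE_LINE = re.compile(r"^\s*(\S+)\s+(\S+)\s+-->\s+(\S+)\s*$")


def find_cpu_swaps(conda_output):
    """Return names of packages whose build changes from CUDA to CPU.

    Hypothetical helper; the 'cuda'/'cpu' substring test is a heuristic
    inferred from the build strings shown in this thread.
    """
    swapped = []
    for line in conda_output.splitlines():
        match = CHANGE_LINE.match(line)
        if not match:
            continue  # skip headers and blank lines
        name, old, new = match.groups()
        if "cuda" in old.lower() and "cpu" in new.lower():
            swapped.append(name)
    return swapped


# The downgrade list from the conda update ffmpeg output above
sample = """
  python_abi                                     3.9-3_cp39 --> 3.9-2_cp39
  pytorch                     2.0.0-cuda112py39ha9981d0_200 --> 2.0.0-cpu_generic_py39h000fad7_1
  torchvision                  0.15.2-cuda112py39h22a746e_1 --> 0.15.2-cpu_py39hcf778cf_1
"""
print(find_cpu_swaps(sample))  # ['pytorch', 'torchvision']
```

If the list is non-empty, declining the transaction keeps the CUDA-enabled builds in place.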