chengzeyi / stable-fast

Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
MIT License
1.16k stars, 71 forks

'torch' module not found during 'stable-fast' installation #55

Closed. Prof-Cheese closed this issue 10 months ago.

Prof-Cheese commented 10 months ago

Environment Information:

Problem Description: When trying to install the stable-fast package, I encounter a ModuleNotFoundError: No module named 'torch'. I have confirmed that torch is installed in the virtual environment using pip list. Also, no package conflicts were detected with pip check.

Attempted Solutions:

Additional Information: The same error occurs when installing directly from the GitHub repository and from PyPI.

I would appreciate any suggestions or solutions to this problem.

chengzeyi commented 10 months ago

@Prof-Cheese That sounds impossible. Can you confirm that torch is available in your env? What's the output of python3 -m torch.utils.collect_env and pip3 list?

Prof-Cheese commented 10 months ago

Thank you for your response. Here are the requested details:

Output of python3 -m torch.utils.collect_env:

Output of pip list:

I confirm that torch is installed in my environment. I hope this information helps in diagnosing the issue with installing stable-fast.

chengzeyi commented 10 months ago

@Prof-Cheese Are you sure that you invoke pip3, not pip?
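
One way to check whether pip3 and python3 refer to the same environment is to ask the interpreter itself. The following is a minimal sketch, not part of stable-fast; the helper name describe_env is hypothetical. It reports which interpreter is running, whether it is inside a virtual environment, and whether torch is visible from it, without importing torch.

```python
import importlib.util
import sys


def describe_env():
    """Report which interpreter is running and whether it can see torch."""
    # find_spec locates the package without importing it.
    spec = importlib.util.find_spec("torch")
    return {
        "executable": sys.executable,
        # In a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        "torch_found": spec is not None,
        "torch_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

Running this with the same python3 that is used for the installation, and comparing the printed executable with the path shown by pip3 --version, can reveal a mismatch between the two.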

Prof-Cheese commented 10 months ago

For thoroughness, I have taken screenshots of both pip3 list and pip list outputs. Additionally, I want to mention that when I rebuilt the venv, I switched Python from version 3.11 to 3.10, but unfortunately, the issue persisted with both versions.

[Two screenshots: outputs of pip3 list and pip list]

chengzeyi commented 10 months ago

@Prof-Cheese Take a look at this

https://stackoverflow.com/questions/32004958/python-module-not-found-in-virtualenv

Prof-Cheese commented 10 months ago

Problem Summary

Initially, I encountered an issue where torch was not recognized in my virtual environment. Following your advice and this Qiita article, I resolved the torch detection issue. However, now I'm facing a new problem.

/work/image/venv/lib/python3.10/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-12.1'

Additional Observations

import torch
print(torch.cuda.is_available())
/work/image/venv/lib/python3.10/site-packages/torch/cuda/__init__.py:138: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
False
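
The check above can be extended into a slightly fuller report. This is a minimal sketch under stated assumptions: the helper name cuda_report is hypothetical, and it only gathers the environment variables named in the warning together with what torch itself reports. It does not diagnose the driver.

```python
import os


def cuda_report():
    """Collect the CUDA-related settings that torch and the environment expose."""
    report = {
        # Environment variables mentioned in the warning and the build log.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
    }
    try:
        import torch
    except ImportError:
        report["torch_installed"] = False
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version torch was built against; None for CPU-only builds.
    report["torch_built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If torch_built_with_cuda is set but cuda_available is False, the problem is more likely on the driver or runtime side than in the torch package itself.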

Request for Assistance

I am seeking advice or suggestions to resolve this CUDA error during the installation of stable-fast. Any help would be greatly appreciated.

chengzeyi commented 10 months ago

Do you have cuda toolkit installed on your system?

Prof-Cheese commented 10 months ago

Yes, I have the CUDA toolkit installed. Interestingly, the earlier error I mentioned stopped occurring after I updated the GPU drivers. However, now I'm encountering a new error related to GCC version incompatibility. Given these issues, I'm starting to feel that Arch Linux might not be the best fit for a server environment, so I'm considering trying a different Linux distribution. Thanks for your help and advice!

gameltb commented 10 months ago

If you are using Arch, you may need to run export NVCC_PREPEND_FLAGS='-ccbin /usr/bin/g++-12' before building stable-fast.
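
The same setting can be applied from Python when the build is launched from a script. This is a minimal sketch, assuming g++-12 is installed at /usr/bin/g++-12 as in the suggestion above; the helper name build_env_with_host_compiler is hypothetical, and the pip command in the trailing comment is only an illustration.

```python
import os


def build_env_with_host_compiler(ccbin="/usr/bin/g++-12", base_env=None):
    """Return a copy of the environment with NVCC_PREPEND_FLAGS selecting ccbin."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep any flags that were already set, placing -ccbin first.
    existing = env.get("NVCC_PREPEND_FLAGS", "")
    env["NVCC_PREPEND_FLAGS"] = f"-ccbin {ccbin} {existing}".strip()
    return env


if __name__ == "__main__":
    env = build_env_with_host_compiler()
    print(env["NVCC_PREPEND_FLAGS"])
    # The environment could then be passed to the build, for example:
    #   subprocess.run([sys.executable, "-m", "pip", "install", "stable-fast"],
    #                  env=env, check=True)
```

Passing a copy of the environment to the build process leaves the current shell unchanged, so the older host compiler is used only for this build.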