Open lightandshadow68 opened 3 months ago
Looks like this is due to manually installing the NVIDIA drivers and CUDA Toolkit on an EC2 instance created from a base AMI.
When I use an Ubuntu PyTorch AMI to create the EC2 instance, the nvcc version matches, but now I'm receiving an error when BasicSR imports torchvision.transforms.functional_tensor:
  File "/home/ubuntu/ml/GFPGAN/overlap_fb_retouch.py", line 6, in <module>
    from gfpgan import GFPGANer
  File "/home/ubuntu/ml/GFPGAN/gfpgan/__init__.py", line 2, in <module>
    from .archs import *
  File "/home/ubuntu/ml/GFPGAN/gfpgan/archs/__init__.py", line 2, in <module>
    from basicsr.utils import scandir
  File "/opt/conda/lib/python3.10/site-packages/basicsr/__init__.py", line 4, in <module>
    from .data import *
  File "/opt/conda/lib/python3.10/site-packages/basicsr/data/__init__.py", line 22, in <module>
    _dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
  File "/opt/conda/lib/python3.10/site-packages/basicsr/data/__init__.py", line 22, in <listcomp>
    _dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
  File "/opt/conda/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/conda/lib/python3.10/site-packages/basicsr/data/realesrgan_dataset.py", line 11, in <module>
    from basicsr.data.degradations import circular_lowpass_kernel, random_mixed_kernels
  File "/opt/conda/lib/python3.10/site-packages/basicsr/data/degradations.py", line 8, in <module>
    from torchvision.transforms.functional_tensor import rgb_to_grayscale
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'
Seems related to: https://github.com/TencentARC/GFPGAN/issues/539
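For anyone hitting the same traceback: as far as I can tell, newer torchvision releases no longer ship the private functional_tensor module, while rgb_to_grayscale is still exposed from torchvision.transforms.functional. A minimal sketch of a fallback import is below. The helper name import_first is hypothetical (it is not part of BasicSR or torchvision), and this is only one possible workaround, not a fix the maintainers have confirmed.

```python
import importlib


def import_first(attr, module_names):
    """Return `attr` from the first module in `module_names` that provides it.

    Hypothetical helper: tries each module in order and falls back to the
    next one when the module is missing or lacks the attribute.
    """
    errors = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # ModuleNotFoundError is a subclass of ImportError.
            errors.append(f"{name}: {exc}")
    raise ImportError(f"could not import {attr!r}; tried: " + "; ".join(errors))


# Intended use for the failing line in basicsr/data/degradations.py
# (left commented out so the sketch runs without torchvision installed):
#
# rgb_to_grayscale = import_first(
#     "rgb_to_grayscale",
#     [
#         "torchvision.transforms.functional_tensor",  # older torchvision
#         "torchvision.transforms.functional",         # newer torchvision
#     ],
# )
```

The other route would be to pin torchvision to a release that still contains functional_tensor, at the cost of pinning a matching torch version as well.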
I'm attempting to install BasicSR via pip and conda.
After setting the environment variable to enable extension compilation, when I build I receive the following error.
However, when I check the versions of the drivers and CUDA installed, I receive....
What's odd is that nvcc and nvidia-smi do not seem to agree on the version of CUDA installed. Or is one of them reporting the toolkit version, which is different from the actual CUDA API version?
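On the nvcc versus nvidia-smi question: my understanding is that nvcc reports the release of the installed CUDA Toolkit, while the "CUDA Version" in the nvidia-smi header is the highest CUDA version the installed driver supports, so the two numbers can legitimately differ. Here is a rough sketch for comparing them, assuming the usual output formats of both tools. The function names and the sample strings are made up for illustration and are not output from my instance.

```python
import re


def nvcc_toolkit_version(nvcc_output):
    """Parse (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def driver_cuda_version(smi_output):
    """Parse (major, minor) from the `nvidia-smi` header, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_supported(toolkit, driver):
    """Assumed rule of thumb: the toolkit should not be newer than the
    CUDA version the driver reports."""
    return toolkit <= driver


if __name__ == "__main__":
    # Illustrative sample text only; substitute the real command output.
    sample_nvcc = "Cuda compilation tools, release 11.8, V11.8.89"
    sample_smi = "| NVIDIA-SMI 535.104.05  Driver Version: 535.104.05  CUDA Version: 12.2 |"

    toolkit = nvcc_toolkit_version(sample_nvcc)
    driver = driver_cuda_version(sample_smi)
    print("toolkit:", toolkit, "driver supports up to:", driver)
    print("compatible:", toolkit_supported(toolkit, driver))
```

If the toolkit version comes out higher than what the driver reports, that would point to a driver that is too old for the toolkit rather than a broken install.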