Open amiltonwong opened 1 year ago
I think this problem happens when you use a different Dassl package. We made some changes to the original Dassl library.
To solve this, please run:
cd PointCLIP_V2/zeroshot_cls/Dassl3D
python setup.py develop
cd ..
sh zeroshot_cls.sh
I think this may help.
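To confirm that the steps above took effect, it can help to check which copy of dassl Python actually resolves. The following is a minimal sketch using only the standard library; `locate_package` is a hypothetical helper name, not part of Dassl or PointCLIP V2.

```python
import importlib.util


def locate_package(name):
    """Return the path Python would import `name` from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # package is not importable in this environment
    if spec.origin is not None:
        return spec.origin  # usually the package's __init__.py
    # Namespace packages have no __init__.py, so fall back to their search path.
    locations = list(spec.submodule_search_locations or [])
    return locations[0] if locations else None


if __name__ == "__main__":
    # After `python setup.py develop`, this should point inside
    # PointCLIP_V2/zeroshot_cls/Dassl3D rather than site-packages.
    print(locate_package("dassl"))
```

If the printed path lies under site-packages instead of the Dassl3D directory, a different Dassl installation is probably still shadowing the modified one.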
@yangyangyang127, thanks a lot for your reply. After installing Dassl3D as listed above, that issue is resolved.
However, another issue occurs: RuntimeError: No CUDA GPUs are available
(pointclip_new) milton@milton-ws3:/data/code13/PointCLIP_V2/zeroshot_cls$ sh zeroshot_cls.sh
/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/scipy/__init__.py:138: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.23.5)
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion} is required for this version of "
/data/code13/PointCLIP_V2/zeroshot_cls/clip/clip.py:23: UserWarning: PyTorch version 1.7.1 or higher is recommended
warnings.warn("PyTorch version 1.7.1 or higher is recommended")
Setting fixed seed: 2
Collecting env info ...
** System info **
PyTorch version: 1.10.1
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31
Python version: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.8.0-43-generic-x86_64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: 11.3.58
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080 Ti
Nvidia driver version: 470.57.02
cuDNN version: Probably one of the following:
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-11.3/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.24.2
[pip3] torch==1.10.1
[pip3] torch-cluster==1.6.0
[pip3] torch-scatter==2.0.9
[pip3] torch-sparse==0.6.13
[pip3] torchaudio==0.10.1
[pip3] torchvision==0.11.2
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.3.1 h2bc3f7f_2
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h8d4b97c_729 conda-forge
[conda] mkl-service 2.4.0 py38h95df7f1_0 conda-forge
[conda] mkl_fft 1.3.1 py38h8666266_1 conda-forge
[conda] mkl_random 1.2.2 py38h1abd341_0 conda-forge
[conda] numpy 1.24.2 pypi_0 pypi
[conda] numpy-base 1.23.5 py38h31eccc5_0
[conda] pytorch 1.10.1 py3.8_cuda11.3_cudnn8.2.0_0 pytorch
[conda] pytorch-cluster 1.6.0 py38_torch_1.10.0_cu113 pyg
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] pytorch-scatter 2.0.9 py38_torch_1.10.0_cu113 pyg
[conda] pytorch-sparse 0.6.13 py38_torch_1.10.0_cu113 pyg
[conda] torchaudio 0.10.1 py38_cu113 pytorch
[conda] torchvision 0.11.2 py38_cu113 pytorch
Pillow (9.4.0)
Loading trainer: PointCLIPV2_ZS
Loading dataset: ModelNet40
/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/utils/data/dataloader.py:478: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
***** Dataset statistics *****
Dataset: ModelNet40
# classes: 40
# train_x: 9,840
# val: 2,468
# test: 2,468
Loading CLIP (backbone: ViT-B/16)
100%|███████████████████████████████████████| 351M/351M [00:30<00:00, 11.6MiB/s]
Traceback (most recent call last):
File "main.py", line 131, in <module>
main(args)
File "main.py", line 97, in main
trainer = build_trainer(cfg)
File "/data/code13/PointCLIP_V2/zeroshot_cls/Dassl3D/dassl/engine/build.py", line 11, in build_trainer
return TRAINER_REGISTRY.get(cfg.TRAINER.NAME)(cfg)
File "/data/code13/PointCLIP_V2/zeroshot_cls/Dassl3D/dassl/engine/trainer.py", line 280, in __init__
self.build_model()
File "/data/code13/PointCLIP_V2/zeroshot_cls/trainers/zeroshot.py", line 49, in build_model
clip_model.cuda()
File "/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 680, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 570, in _apply
module._apply(fn)
File "/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 570, in _apply
module._apply(fn)
File "/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 593, in _apply
param_applied = fn(param)
File "/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 680, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/home/milton/anaconda3/envs/pointclip_new/lib/python3.8/site-packages/torch/cuda/__init__.py", line 214, in _lazy_init
torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
I've checked my GPU using torch.cuda.is_available(), and it returns True. My system has a single RTX 3080 Ti GPU with CUDA 11.3 installed. Is the cause that the package requires multi-GPU training?
Any hints on solving this issue?
Thanks~
@yangyangyang127, I found the cause. I've changed export CUDA_VISIBLE_DEVICES=0 (in zeroshot_cls.sh) to match my system environment.
Now it works.
Thanks~
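For anyone hitting the same error: a script that exports CUDA_VISIBLE_DEVICES with an index the machine does not have can hide every GPU from PyTorch, even though the same check passes in a plain shell. The sketch below approximates how numeric indices in that variable are interpreted (enumeration stops at the first invalid index). `visible_gpu_indices` is a hypothetical helper, it does not handle GPU UUIDs, and the physical GPU count is passed in by hand as an assumption rather than queried from the driver.

```python
import os


def visible_gpu_indices(env_value, physical_count):
    """Approximate which physical GPU indices remain visible.

    env_value: the CUDA_VISIBLE_DEVICES string, or None if the variable is unset.
    physical_count: number of GPUs installed in the machine (supplied by the caller).
    """
    if env_value is None:
        # Variable unset: every installed GPU is visible.
        return list(range(physical_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # An invalid entry hides itself and everything listed after it.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


if __name__ == "__main__":
    # Example for a single-GPU workstation like the one in this thread.
    current = os.environ.get("CUDA_VISIBLE_DEVICES")
    print(visible_gpu_indices(current, physical_count=1))
```

On a single-GPU machine, a value of "1" yields an empty list, which matches the "No CUDA GPUs are available" error, while "0" yields [0]. Running torch.cuda.device_count() inside the same shell session as the script, after the export line, is a direct way to confirm this.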
Hi @ZrrSkywalker @yangyangyang127,
Thanks a lot for releasing the V2 package. I've tried running zeroshot_cls. However, when I run sh zeroshot_cls.sh, I got the following KeyError. It seems there's some issue in the config file. Could you give some hints to fix this issue?
Thanks~