
May I ask this error? #5

Open jungyeup opened 2 months ago

jungyeup commented 2 months ago

Could you tell me how to solve this problem?

(talk3d) F:\Talk3D>sh demo.sh
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8'
@@@@@@@@@@@@@@@@@@@@@
@ Training Talk3D @
@@@@@@@@@@@@@@@@@@@@@
N_gpus: 1
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8'
Traceback (most recent call last):
  File "main.py", line 106, in <module>
    spawn_mp(_main, world_size)
  File "main.py", line 39, in spawn_mp
    mp.spawn(running_fn,args=(world_size,),nprocs=world_size,join=True)
  File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 198, in start_processes
    while not context.join():
  File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
    fn(i, *args)
  File "F:\Talk3D\main.py", line 29, in _main
    setup(rank, world_size,opts)
  File "F:\Talk3D\main.py", line 35, in setup
    distributed.init_process_group('nccl', rank=rank, world_size=world_size)
  File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\distributed\distributed_c10d.py", line 761, in init_process_group
    default_pg = _new_process_group_helper(
  File "C:\Users\User.conda\envs\talk3d\lib\site-packages\torch\distributed\distributed_c10d.py", line 886, in _new_process_group_helper
    raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in

mlnyang commented 2 months ago

Could you tell me how you installed the torch library?

It looks like the CUDA and torch versions in your environment don't match. This install command

pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

or other commands from this link that start with pip install torch==x.xx.x+cu11x..., such as

pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117

pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

might help, but I'm not sure.

Please let me know if the above script does not work.
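
To check whether a reinstall actually changed anything, a small diagnostic along these lines might help. It is only a sketch: describe_torch_build and suggest_backend are hypothetical helper names, not part of Talk3D or PyTorch, and only the torch calls inside them are real. It reports whether the installed torch wheel was built with CUDA, whether a GPU is visible, and whether NCCL is available. As far as I know, the official PyTorch wheels for Windows do not ship NCCL, so on Windows the last check will likely stay False even with a matching CUDA build. In that case 'gloo' is the backend PyTorch documents for Windows, but whether Talk3D trains correctly with 'gloo' is an assumption this thread does not confirm.

```python
import platform


def describe_torch_build():
    """Collect the facts relevant to the 'NCCL not built in' error."""
    try:
        import torch
        import torch.distributed as dist
    except ImportError:
        # torch is not installed in this environment at all
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,     # suffix such as +cu116 or +cpu
        "built_with_cuda": torch.version.cuda,  # None on a CPU-only build
        "cuda_available": torch.cuda.is_available(),
        # is_nccl_available() is only safe to call when distributed is available
        "nccl_available": dist.is_available() and dist.is_nccl_available(),
    }


def suggest_backend(info):
    """Hypothetical helper: use 'nccl' only when it can work, else 'gloo'."""
    if info.get("cuda_available") and info.get("nccl_available"):
        return "nccl"
    return "gloo"


if __name__ == "__main__":
    info = describe_torch_build()
    print("OS:", platform.system())
    for key, value in info.items():
        print(f"{key}: {value}")
    print("suggested backend:", suggest_backend(info))
```

If built_with_cuda prints None or cuda_available prints False, the "No CUDA runtime is found" message in the log is the first thing to fix, and one of the +cu11x install commands above is the place to start.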