ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0

CUDA_HOME environment variable is not set #13

Open SpicyMelonYT opened 1 year ago

SpicyMelonYT commented 1 year ago

Hi,

When I try running this from the readme usage section: python main.py --text "a hamburger" --workspace trial -O

I get this error: OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

What should I do about this?

Here is the full error:

C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\amp\autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
Traceback (most recent call last):
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\raymarching\raymarching.py", line 10, in <module>
    import _raymarching as _backend
ModuleNotFoundError: No module named '_raymarching'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 84, in <module>
    from nerf.network_grid import NeRFNetwork
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\nerf\network_grid.py", line 6, in <module>
    from .renderer import NeRFRenderer
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\nerf\renderer.py", line 12, in <module>
    import raymarching
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\raymarching\__init__.py", line 1, in <module>
    from .raymarching import *
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\raymarching\raymarching.py", line 12, in <module>
    from .backend import _backend
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\raymarching\backend.py", line 31, in <module>
    _backend = load(name='_raymarching',
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py", line 1514, in _write_ninja_file_and_build_library        
    extra_ldflags = _prepare_ldflags(
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py", line 1617, in _prepare_ldflags
    extra_ldflags.append(f'/LIBPATH:{_join_cuda_home("lib/x64")}')
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py", line 2125, in _join_cuda_home
    raise EnvironmentError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

DaLizardWizard commented 1 year ago

What happens when you type "nvcc" into the command line? Or "nvidia-smi"? This might help https://github.com/facebookresearch/pytorch3d/issues/517
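
For reference, a minimal sketch of the same checks from Python, assuming PyTorch is already installed:

import shutil
import torch

print(shutil.which("nvcc"))      # path to the CUDA compiler, or None if the toolkit is not on PATH
print(torch.cuda.is_available()) # True only if this PyTorch build has CUDA support and a working driver

If nvcc is not found, the CUDA toolkit (which the JIT extension build needs) is probably not installed or not on PATH.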

SpicyMelonYT commented 1 year ago

What happens when you type "nvcc" into the command line? Or "nvidia-smi"? This might help facebookresearch/pytorch3d#517

When I type nvcc, I get the error "The term 'nvcc' is not recognized". When I type nvidia-smi I do get something: a bunch of data pops up with a lot of N/A's. I don't want to post it yet because I don't know if it shows any personal information that I shouldn't send out on the internet!

SpicyMelonYT commented 1 year ago

I also tried to follow the link you shared, but it wants me to set the CUDA_HOME system variable to a path that I don't seem to have: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin. I don't have that path, and I don't have a folder called "NVIDIA GPU Computing Toolkit".
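
For reference, a minimal sketch of what the variable should contain; the v10.1 directory below is only an example, and the variable has to be in the environment before torch.utils.cpp_extension is loaded:

import os
# CUDA_HOME should point at the toolkit root, not its bin folder; the version here is only an example.
os.environ["CUDA_HOME"] = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1"

This only helps once the toolkit is actually installed at that location.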

asitanc commented 1 year ago

That may be the problem. You are missing the CUDA toolkit. This happens when you either don't have it installed or you don't have an NVIDIA GPU (e.g. Macs don't have one, and therefore you will get this error).

SpicyMelonYT commented 1 year ago

That may be the problem. You are missing the CUDA toolkit. This happens when you either don't have it installed or you don't have an NVIDIA GPU (e.g. Macs don't have one, and therefore you will get this error).

Fortunately, I have an NVIDIA GPU in my computer. I could try to install the toolkit, but I removed the underscores in the code and it got a lot farther this time. It reached the point where it was downloading the things it needed and even attempted to run the training process, but then this error came up:

[INFO] loaded stable diffusion!
C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\cuda\amp\grad_scaler.py:115: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn("torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.")
[INFO] Trainer: ngp | 2022-10-07_11-29-36 | cpu | fp16 | trial
[INFO] #parameters: 12248183
[INFO] Loading latest checkpoint ...
[WARN] No checkpoint found, model randomly initialized.
==> Start Training trial Epoch 1, lr=0.010000 ...
  0% 0/100 [00:00<?, ?it/s]C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\amp\autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
Traceback (most recent call last):
  File "main.py", line 145, in <module>
    trainer.train(train_loader, valid_loader, max_epoch)
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\nerf\utils.py", line 453, in train
    self.train_one_epoch(train_loader)
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\nerf\utils.py", line 665, in train_one_epoch
    self.model.update_extra_state()
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\nerf\renderer.py", line 574, in update_extra_state
    indices = raymarching.morton3D(coords).long() # [N]
  File "D:\CodeProjects\VisualStudioCode\Other\DreamFusion\stable-dreamfusion\raymarching\raymarching.py", line 94, in forward
    if not coords.is_cuda: coords = coords.cuda()
  File "C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
  0% 0/100 [00:00<?, ?it/s]

Would this error be because of the missing tool kit still?

asitanc commented 1 year ago

I think that removing the underscore won't fix your issue. The issue is still the same:

C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\cuda\amp\grad_scaler.py:115: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn("torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.")

If you try running:

torch.cuda.is_available()

It should return True; otherwise you don't have CUDA available (and maybe still not the toolkit either).
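
For reference, a slightly fuller check, assuming PyTorch is importable:

import torch

print(torch.version.cuda)        # CUDA version this PyTorch build was compiled against (None for CPU-only wheels)
print(torch.cuda.is_available()) # True only with a CUDA-enabled PyTorch build and a working driver
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first visible GPU

A None from torch.version.cuda would point to a CPU-only PyTorch install, which matches the "Torch not compiled with CUDA enabled" error above.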

SpicyMelonYT commented 1 year ago

I suspect that the change I made is still fine, since even if I had solved the CUDA issue, those errors would still pop up. But maybe not.

Also, where do I find the toolkit and the correct version? And are there any special steps I have to take to install it, or does it come with an installer?

SpicyMelonYT commented 1 year ago

I think that removing the underscore won't fix your issue. The issue is still the same:

C:\Users\Matthew\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\cuda\amp\grad_scaler.py:115: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn("torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.")

If you try running:

torch.cuda.is_available()

It should return True; otherwise you don't have CUDA available (and maybe still not the toolkit either).

Also, where do I find the toolkit and the correct version? And are there any special steps I have to take to install it, or does it come with an installer?

asitanc commented 1 year ago

It is available on the NVIDIA website (CUDA Toolkit downloads): https://developer.nvidia.com/cuda-downloads. You may want to look for CUDA 11.6.
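
For reference, a small sketch of sanity-checking the toolkit after installation, assuming the installer put nvcc on PATH:

import shutil
import subprocess

nvcc = shutil.which("nvcc")
if nvcc:
    # Print the compiler's version banner to confirm which toolkit release is installed.
    print(subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout)
else:
    print("nvcc not found on PATH; the toolkit's bin directory may still need to be added.")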

SpicyMelonYT commented 1 year ago

So no, that didn't fix it. I think I will just wait until someone makes a more accessible version or a website to run it on; even a Google Colab project would work!

BMaxV commented 1 year ago

Hello, I think I have the same issue.

OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

But running:

torch.cuda.is_available()

returns True.

Homework I did:

I can find this:

locate cuda | grep /cuda

/home/myusername/.local/lib/python3.8/site-packages/nvidia/cuda_runtime/
/home/myusername/.local/lib/python3.8/site-packages/torch/cuda/

And is this how it's set? export CUDA_HOME=/usr/local/cuda-X.X

But will this point to the correct thing?

Seems to be a strangely common thing with these kinds of topics.


Ok, so the error is caused by gridencoder not being there, and that's not being built because nvcc is missing?
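
For reference, one way to see what PyTorch itself resolved CUDA_HOME to (a sketch, assuming torch is importable):

from torch.utils.cpp_extension import CUDA_HOME

# None means neither the CUDA_HOME variable nor a default location such as /usr/local/cuda was found,
# which is exactly the situation where the raymarching and gridencoder extensions fail to build.
print(CUDA_HOME)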