nerfstudio-project / nerfacc

A General NeRF Acceleration Toolbox in PyTorch.
https://www.nerfacc.com/

Encounter cuda error when training #84

Closed Learningm closed 2 years ago

Learningm commented 2 years ago

Hi, I get a CUDA error when I train the example. I tried making batch_size smaller, but that did not help. Could you please help me figure out which part is causing it? Thanks.

python examples/train_mlp_nerf.py --train_split train --scene lego
Traceback (most recent call last):
  File "/project/nerfacc/examples/train_mlp_nerf.py", line 169, in <module>
    occupancy_grid.every_n_step(
  File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/project/nerfacc/nerfacc/grid.py", line 271, in every_n_step
    self._update(
  File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/project/nerfacc/nerfacc/grid.py", line 229, in _update
    occ = occ_eval_fn(x).squeeze(-1)
  File "/project/nerfacc/examples/train_mlp_nerf.py", line 171, in <lambda>
    occ_eval_fn=lambda x: radiance_field.query_opacity(
  File "/project/nerfacc/examples/radiance_fields/mlp.py", line 229, in query_opacity
    density = self.query_density(x)
  File "/project/nerfacc/examples/radiance_fields/mlp.py", line 237, in query_density
    sigma = self.mlp.query_density(x)
  File "/project/nerfacc/examples/radiance_fields/mlp.py", line 149, in query_density
    x = self.base(x)
  File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/project/nerfacc/examples/radiance_fields/mlp.py", line 90, in forward
    x = self.hidden_layers[i](x)
  File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)

Learningm commented 2 years ago

I reinstalled the environment using python=3.8 instead of 3.9. The error then turned out to be "CUDA out of memory". I tried another machine with more GPU memory, and that resolved it.
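
For anyone who hits the same thing without access to a larger GPU: the traceback fails inside the occupancy grid update (occ_eval_fn called from grid.py), not in the ray batch, which may be why a smaller batch_size made no difference. One possible workaround, sketched below, is to evaluate occ_eval_fn on the grid points in smaller slices so that fewer activations are alive at once. This is only a sketch under assumptions: chunked_eval, chunk_size, concat and original_occ_eval_fn are names made up for illustration and are not part of nerfacc, and whether it lowers peak memory enough on a given GPU is untested.

```python
def chunked_eval(fn, x, chunk_size, concat):
    """Apply fn to x in slices of at most chunk_size rows, then join the results.

    x only needs len() and slicing, so a torch.Tensor of shape (N, 3) works.
    concat joins the list of per-chunk outputs (for tensors, torch.cat).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    outputs = []
    for start in range(0, len(x), chunk_size):
        # Only one slice is passed through fn at a time.
        outputs.append(fn(x[start:start + chunk_size]))
    return concat(outputs)


# Hypothetical use in the training script (not run here), where
# original_occ_eval_fn stands for the lambda already passed as occ_eval_fn:
#   occ_eval_fn=lambda x: chunked_eval(
#       original_occ_eval_fn, x, chunk_size=65536, concat=torch.cat)
```

The chunk size of 65536 is an arbitrary starting point; halving it until the update fits in memory is a reasonable way to tune it.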