Hi, I get a CUDA error when I train the example. I tried reducing batch_size, but that didn't help. Could you help me figure out which part is causing it? Thanks.
python examples/train_mlp_nerf.py --train_split train --scene lego
Traceback (most recent call last):
File "/project/nerfacc/examples/train_mlp_nerf.py", line 169, in <module>
occupancy_grid.every_n_step(
File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/project/nerfacc/nerfacc/grid.py", line 271, in every_n_step
self._update(
File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/project/nerfacc/nerfacc/grid.py", line 229, in _update
occ = occ_eval_fn(x).squeeze(-1)
File "/project/nerfacc/examples/train_mlp_nerf.py", line 171, in <lambda>
occ_eval_fn=lambda x: radiance_field.query_opacity(
File "/project/nerfacc/examples/radiance_fields/mlp.py", line 229, in query_opacity
density = self.query_density(x)
File "/project/nerfacc/examples/radiance_fields/mlp.py", line 237, in query_density
sigma = self.mlp.query_density(x)
File "/project/nerfacc/examples/radiance_fields/mlp.py", line 149, in query_density
x = self.base(x)
File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/project/nerfacc/examples/radiance_fields/mlp.py", line 90, in forward
x = self.hidden_layers[i](x)
File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/home/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
I reinstalled the environment with python=3.8 instead of 3.9, and the error then showed up as "CUDA out of memory". I tried another machine with more GPU memory and that resolved it.
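For anyone who hits the same thing: the traceback fails inside occupancy_grid.every_n_step, not in the training step, which may be why a smaller batch_size made no difference. Below is a rough back-of-the-envelope sketch of how much memory one hidden activation could take if the whole grid is pushed through the MLP at once. The grid resolution (128) and hidden width (256) are assumed values for illustration, not read from the example, and activation_bytes is a hypothetical helper, so check your own configuration.

```python
def activation_bytes(n_points: int, width: int, bytes_per_value: int = 4) -> int:
    """Bytes held by one (n_points, width) activation tensor (float32 by default)."""
    return n_points * width * bytes_per_value


# Assumed values for illustration only; not taken from the example script.
GRID_RESOLUTION = 128  # hypothetical occupancy grid resolution per axis
HIDDEN_WIDTH = 256     # hypothetical MLP hidden layer width

n_cells = GRID_RESOLUTION ** 3
per_layer = activation_bytes(n_cells, HIDDEN_WIDTH)
print(f"{n_cells} cells -> {per_layer / 2**30:.1f} GiB per hidden activation")
# prints "2097152 cells -> 2.0 GiB per hidden activation"
```

Under these assumptions a single layer's output is already about 2 GiB, and the input and output of a layer are alive at the same time, so a small GPU could plausibly run out of memory here regardless of the ray batch size.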