Jumpat / SegmentAnythingin3D

Segment Anything in 3D with NeRFs (NeurIPS 2023)
Apache License 2.0

Runtime Error on MipNeRF-360 dataset #60

Open 2085924055 opened 9 months ago

2085924055 commented 9 months ago

When I run the following command:

python run.py --config=configs/llff/kitchen.py --stop_at=20000 --render_video --i_weights=10000

I get this error:

File "G:\SegmentAnythingin3D-master\lib\grid.py", line 171, in __init__
  self.xy_plane = nn.Parameter(torch.randn([1, Rxy, X, Y]) * 0.1)
RuntimeError: CUDA out of memory. Tried to allocate 2356.25 GiB (GPU 0; 23.99 GiB total capacity; 25.00 KiB already allocated; 22.04 GiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

The error occurs on line 171 of lib\grid.py, which creates self.xy_plane by randomly initializing a tensor of shape [1, Rxy, X, Y]. The requested allocation is unreasonably large.

Has anyone else run into this issue?

Jumpat commented 9 months ago

Hi! Did you change any code? The '2356.25 GiB' to be allocated seems a little weird. This seems to be caused by an unexpected broadcast operation.
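
For scale, the size of an allocation like this can be estimated directly from the tensor shape. The sketch below is illustrative only: the helper names and the example shape are hypothetical and are not taken from lib/grid.py. It assumes float32 (4 bytes per element), which is the default dtype of torch.randn.

```python
import math

BYTES_PER_FLOAT32 = 4
GIB = 2 ** 30


def tensor_gib(shape, bytes_per_elem=BYTES_PER_FLOAT32):
    """Memory needed for a dense tensor of the given shape, in GiB."""
    return math.prod(shape) * bytes_per_elem / GIB


def elements_for_gib(gib, bytes_per_elem=BYTES_PER_FLOAT32):
    """Number of elements that fit in the given number of GiB."""
    return int(gib * GIB / bytes_per_elem)


# Hypothetical plane shape [1, Rxy, X, Y]; the real values depend on the config.
example_shape = [1, 48, 256, 256]
print(f"{example_shape}: {tensor_gib(example_shape):.4f} GiB")

# The allocation reported in the error message, expressed as an element count.
print(f"2356.25 GiB of float32 = {elements_for_gib(2356.25):,} elements")
```

A request of 2356.25 GiB corresponds to roughly 6 x 10^11 float32 elements, so at least one of Rxy, X, or Y must have been far larger than any reasonable grid resolution.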

2085924055 commented 9 months ago

Thank you for your reply. I did not change the code. I think the problem was the command I used; I should have used this one instead:

python run.py --config=configs/nerf_unbounded/kitchen.py --stop_at=20000 --render_video --i_weights=10000

I don't know why the first command fails, but I think the second one gives the result I want.

2085924055 commented 9 months ago

Which command should be used to run the MipNeRF-360 dataset: llff/kitchen.py or nerf_unbounded/kitchen.py? I don't really understand.

2085924055 commented 9 months ago

When I run the following command:

python run.py --config=configs/nerf_unbounded/kitchen.py --stop_at=20000 --render_video --i_weights=10000

I get this error:

File "G:\SegmentAnythingin3D-master\run.py", line 690, in train(args, cfg, data_dict)
File "G:\SegmentAnythingin3D-master\run.py", line 621, in train
  scene_rep_reconstruction(
File "G:\SegmentAnythingin3D-master\run.py", line 520, in scene_rep_reconstruction
  loss_distortion = flatten_eff_distloss(w, s, 1/n_max, ray_id)
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch_efficient_distloss\eff_distloss.py", line 93, in forward
  segment_cumsum_cuda = load(
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch\utils\cpp_extension.py", line 1202, in load
  return _jit_compile(
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch\utils\cpp_extension.py", line 1450, in _jit_compile
  return _import_module_from_library(name, build_directory, is_python_module)
File "D:\Anaconda3\envs\SA3D\lib\site-packages\torch\utils\cpp_extension.py", line 1844, in _import_module_from_library
  module = importlib.util.module_from_spec(spec)
File "", line 571, in module_from_spec
File "", line 1176, in create_module
File "", line 241, in _call_with_frames_removed
ImportError: DLL load failed while importing segment_cumsum_cuda: The specified module could not be found.

This error occurs when I run the command above on the 360_v2 dataset. On the nerf_llff_data dataset, training runs properly and generates fly-through videos. Doesn't the fact that it works on nerf_llff_data mean that the environment is set up correctly? Why can't the CUDA extension module be found?

Jumpat commented 9 months ago

Which command should be used to run the MipNeRF-360 dataset: llff/kitchen.py or nerf_unbounded/kitchen.py? I don't really understand.

You may need to use the nerf_unbounded config, as the MIP360 dataset involves several unbounded in-the-wild scenes.

I found some similar issues about your missing-module problem; you can check whether they help. We have never run into this problem ourselves.

2085924055 commented 9 months ago

Thank you very much for your reply.

2085924055 commented 9 months ago

When I run the following command:

python run_seg_gui.py --config=configs/nerf_unbounded/seg_kitchen.py --segment --sp_name=_gui --num_prompts=20 --render_opt=train --save_ckpt

I get this error:

Traceback (most recent call last):
File "E:\pycode\SegmentAnythingin3D\run_seg_gui.py", line 106, in train_seg(args, cfg, data_dict)
File "E:\pycode\SegmentAnythingin3D\run_seg_gui.py", line 55, in train_seg
  gui.run()
File "E:\pycode\SegmentAnythingin3D\lib\gui.py", line 58, in run
  init_rgb = self.Seg3d.init_model()
File "E:\pycode\SegmentAnythingin3D\lib\sam3d.py", line 89, in init_model
  assert reload_ckpt_path is not None and 'segmentation must based on a pretrained NeRF'
AssertionError

The error is an AssertionError raised in the init_model function at line 89 of sam3d.py. The statement assert reload_ckpt_path is not None and 'segmentation must based on a pretrained NeRF' has two parts: (1) reload_ckpt_path is not None, which requires that reload_ckpt_path is set; (2) the string 'segmentation must based on a pretrained NeRF', which is evidently meant as the error message and says that segmentation must be based on a pretrained NeRF model. Because the string is joined with "and" rather than a comma, it is part of the condition and is always true, which is why the AssertionError is printed without a message. So the assertion fails because reload_ckpt_path is None.
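
The difference between joining a string to an assert condition with "and" and passing it as the assert message can be seen in a small sketch. The function names below are hypothetical and do not come from sam3d.py; only the shape of the assert statement is taken from the traceback above.

```python
def check_with_and(path):
    # Same shape as the assert in the traceback: the string is part of the
    # condition. A non-empty string is always truthy, so only the
    # "path is not None" half matters, and no message is attached.
    assert path is not None and 'segmentation must based on a pretrained NeRF'


def check_with_message(path):
    # The comma form attaches the string as the AssertionError message.
    assert path is not None, 'segmentation must based on a pretrained NeRF'


def assertion_message(check, path):
    """Run a check and return the AssertionError text, or None if it passes."""
    try:
        check(path)
    except AssertionError as err:
        return str(err)
    return None


for check in (check_with_and, check_with_message):
    print(check.__name__, "->", repr(assertion_message(check, None)))
```

With the "and" form the exception carries an empty message, which matches the bare AssertionError in the traceback; either way, the check fails only when the checkpoint path is None.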

I don't know what to do to solve this problem.

Jumpat commented 9 months ago

Have you run run.py successfully to get the pretrained NeRF model? You can check whether reload_ckpt_path contains the corresponding NeRF checkpoint (like fine_last.tar).

2085924055 commented 9 months ago

Yes. [screenshot]

2085924055 commented 9 months ago

[screenshot]

Jumpat commented 9 months ago

I guess this is caused by a missing 'c' in the config file seg_kitchen.py: expname = 'dvgo_kitchen_unbounded' should be 'dcvgo_kitchen_unbounded'.
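
A mismatch like this can be spotted by checking whether the checkpoint exists under the configured experiment name and, if not, looking for similarly named experiment folders that do hold one. The sketch below is a hypothetical helper, not part of SA3D: the function names, the basedir/expname/fine_last.tar folder layout, and the ./logs/nerf_unbounded path are all assumptions made for illustration.

```python
import difflib
from pathlib import Path


def find_checkpoint(basedir, expname, ckpt_name="fine_last.tar"):
    """Return the checkpoint path if it exists, else None (assumed layout)."""
    ckpt = Path(basedir) / expname / ckpt_name
    return ckpt if ckpt.is_file() else None


def suggest_expnames(basedir, expname, ckpt_name="fine_last.tar"):
    """Names of folders under basedir that contain a checkpoint and
    closely resemble the requested expname."""
    base = Path(basedir)
    if not base.is_dir():
        return []
    candidates = [d.name for d in base.iterdir() if (d / ckpt_name).is_file()]
    return difflib.get_close_matches(expname, candidates, n=3, cutoff=0.6)


# Hypothetical values; substitute the basedir and expname from your config.
basedir, expname = "./logs/nerf_unbounded", "dvgo_kitchen_unbounded"
if find_checkpoint(basedir, expname) is None:
    print(f"No checkpoint found for expname={expname!r}")
    print("Similar experiments with a checkpoint:", suggest_expnames(basedir, expname))
```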

2085924055 commented 9 months ago

Yes, thank you very much for your reply. It runs now.

2085924055 commented 8 months ago

Hello, I would like to ask where the overall model framework is defined and how to understand it. I can't find a clear NeRF framework in the code. What should I do if I want to modify the NeRF?

Jumpat commented 8 months ago

Hi! You can find the code about NeRF in lib/dvgo.py (dcvgo, seg_dvgo, ...)

2085924055 commented 8 months ago

Thanks, I get it. I will read through the code logic carefully. One more question: how can I get the code to run on my own computer when it runs out of video memory? I don't see where batch_size can be adjusted. I originally ran it on a server, but now I want to move the code to my own computer, which is more convenient.

Jumpat commented 8 months ago

There is no batch size in SA3D. To save memory, you can reduce the resolutions used by TensoRF (mask grid resolution, TensoRF grid resolution, rendering resolution, density grid resolution, ...).
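
As a rough guide to how much a lower resolution saves, the arithmetic can be sketched as follows. This is a generic estimate, not SA3D code: the function names, channel counts, and resolutions are hypothetical, and float32 storage (4 bytes per element) is assumed.

```python
def dense_grid_mib(resolution, channels=1, bytes_per_elem=4):
    """Memory of a dense [channels, N, N, N] grid in MiB."""
    return channels * resolution ** 3 * bytes_per_elem / 2 ** 20


def plane_mib(resolution, channels=1, bytes_per_elem=4):
    """Memory of a [channels, N, N] feature plane in MiB."""
    return channels * resolution ** 2 * bytes_per_elem / 2 ** 20


# Hypothetical per-axis resolutions; halving N shrinks a dense 3D grid
# by a factor of 8 and a 2D plane by a factor of 4.
for n in (512, 256, 128):
    print(f"N={n}: dense grid {dense_grid_mib(n):.1f} MiB, plane {plane_mib(n):.3f} MiB")
```

Dense 3D grids (such as mask or density grids) therefore respond most strongly to a resolution reduction, while plane-style factors shrink more slowly.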

2085924055 commented 7 months ago

Hello, I would like to ask whether the parameters in the code are already optimal. Is further hyperparameter optimization still needed?

Jumpat commented 7 months ago

For some scenes and targets they are. However, it depends on the specific scene and target you choose.

2085924055 commented 7 months ago

Ok, thank you very much for your reply.

2085924055 commented 7 months ago

Hello, I would like to ask which NeRF paper this is based on?

Jumpat commented 7 months ago

The main branch of SA3D is based on TensoRF.

The NerfStudio branch is based on Nerfacto.

The SA3D-GS branch is based on 3D-GS.

2085924055 commented 7 months ago

OK, thanks for your reply.