Anttwo / SuGaR

[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
https://anttwo.github.io/sugar/

Segmentation fault #221

Open asalan570 opened 1 week ago

asalan570 commented 1 week ago

Hi, I always encounter a Segmentation fault error when executing train_full_pipeline.py and extract_mesh.py.

WSL Ubuntu 22.04

i7-12700KF

32 GB RAM

RTX 4060 (8 GB VRAM)

Command:

python extract_mesh.py -s /mnt/d/Project/3DGS/tandt_db/tandt/truck/ -c ./output/vanilla_gs/truck/ -m /opt/project/SuGaR/output/coarse/truck/15000.pt -o /opt/project/SuGaR/output/coarse_mesh/truck/

info:

Using original 3DGS rasterizer from Inria.
-----Parameters-----
Source path: /mnt/d/Project/3DGS/tandt_db/tandt/truck/
Gaussian Splatting Checkpoint path: ./output/vanilla_gs/truck/
Coarse model Checkpoint path: /opt/project/SuGaR/output/coarse/truck/15000.pt
Mesh output path: /opt/project/SuGaR/output/coarse_mesh/truck/
Surface levels:
[0.1, 0.3, 0.5]
Decimation targets:
[200000, 1000000]
Project mesh on surface points: True
Use custom bbox: False
Use eval split: True
GPU: 0
Use centers to extract mesh: False
Use marching cubes: False
Use vanilla 3DGS: False
--------------------
Loading the initial 3DGS model from path ./output/vanilla_gs/truck/...
Found image extension .jpg
219 training images detected.
The model has been trained for 7000 steps.

Loading the coarse SuGaR model from path /opt/project/SuGaR/output/coarse/truck/15000.pt...
/opt/project/SuGaR/sugar_extractors/coarse_mesh.py:169: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(sugar_checkpoint_path, map_location=nerfmodel.device)
Use min to initialize scales.
Initialized radiuses for 3D Gauss Rasterizer
Coarse model loaded.
Coarse model parameters:
_points
torch.Size([428321, 3])
True
all_densities
torch.Size([428321, 1])
True
_scales
torch.Size([428321, 3])
True
_quaternions
torch.Size([428321, 4])
True
_sh_coordinates_dc
torch.Size([428321, 1, 3])
True
_sh_coordinates_rest
torch.Size([428321, 15, 3])
True
Number of gaussians: 428321
Opacities min/max/mean: tensor(8.8021e-05, device='cuda:0') tensor(1., device='cuda:0') tensor(0.6500, device='cuda:0')
Quantile 0.0: 8.802103548077866e-05
Quantile 0.1: 0.009140754118561745
Quantile 0.2: 0.05523267015814781
Quantile 0.3: 0.3236162066459656
Quantile 0.4: 0.6969544291496277
Quantile 0.5: 0.9207080602645874
Quantile 0.6: 0.9949575066566467
Quantile 0.7: 0.9994446635246277
Quantile 0.8: 0.999866247177124
Quantile 0.9: 0.9999637603759766

Starting pruning low opacity gaussians...
WARNING! During optimization, you should use a densifier to prune low opacity points.
This function does not preserve the state of an optimizer, and sets requires_grad=False to all parameters.
Number of gaussians left: 281421
Opacities min/max/mean: tensor(0.5000, device='cuda:0') tensor(1., device='cuda:0') tensor(0.9334, device='cuda:0')
Quantile 0.0: 0.5000006556510925
Quantile 0.1: 0.7226073741912842
Quantile 0.2: 0.8766904473304749
Quantile 0.3: 0.9686402678489685
Quantile 0.4: 0.9957362413406372
Quantile 0.5: 0.9990841150283813
Quantile 0.6: 0.9996886253356934
Quantile 0.7: 0.999871015548706
Quantile 0.8: 0.9999444484710693
Quantile 0.9: 0.9999788999557495
Processing frame 0/219...
Current point cloud for level 0.1 has 0 points.
Current point cloud for level 0.3 has 0 points.
Current point cloud for level 0.5 has 0 points.
Processing frame 30/219...
Current point cloud for level 0.1 has 1369890 points.
Current point cloud for level 0.3 has 1369890 points.
Current point cloud for level 0.5 has 1369890 points.
Processing frame 60/219...
Current point cloud for level 0.1 has 2739780 points.
Current point cloud for level 0.3 has 2739780 points.
Current point cloud for level 0.5 has 2739780 points.
Processing frame 90/219...
Current point cloud for level 0.1 has 4109670 points.
Current point cloud for level 0.3 has 4109670 points.
Current point cloud for level 0.5 has 4109670 points.
Processing frame 120/219...
Current point cloud for level 0.1 has 5479560 points.
Current point cloud for level 0.3 has 5479560 points.
Current point cloud for level 0.5 has 5479560 points.
Processing frame 150/219...
Current point cloud for level 0.1 has 6849450 points.
Current point cloud for level 0.3 has 6849450 points.
Current point cloud for level 0.5 has 6849450 points.
Processing frame 180/219...
Current point cloud for level 0.1 has 8219340 points.
Current point cloud for level 0.3 has 8219340 points.
Current point cloud for level 0.5 has 8219340 points.
Processing frame 210/219...
Current point cloud for level 0.1 has 9589230 points.
Current point cloud for level 0.3 has 9589230 points.
Current point cloud for level 0.5 has 9589230 points.

========== Processing surface level 0.1 ==========
Final point cloud for level 0.1 has 10000197 points.
Using default, camera based bounding box.
Centering bounding box.
Foreground points:
torch.Size([7418686, 3])
torch.Size([7418686, 3])
torch.Size([7418686, 3])
Background points:
torch.Size([1922012, 3])
torch.Size([1922012, 3])
torch.Size([1922012, 3])

-----Foreground mesh-----
Computing points, colors and normals...
Cleaning Point Cloud...
Finished computing points, colors and normals.
Now computing mesh...
[WARNING] /root/Open3D/build/poisson/src/ext_poisson/PoissonRecon/Src/FEMTree.Initialize.inl (Line 193)
          Initialize
          Found bad data: 6
[WARNING] /root/Open3D/build/poisson/src/ext_poisson/PoissonRecon/Src/FEMTree.IsoSurface.specialized.inl (Line 1858)
          Extract
          bad average roots: 2
Removing vertices with low densities...

-----Background mesh-----
Computing points, colors and normals...
Cleaning Point Cloud...
Finished computing points, colors and normals.
Now computing mesh...
[WARNING] /root/Open3D/build/poisson/src/ext_poisson/PoissonRecon/Src/FEMTree.Initialize.inl (Line 193)
          Initialize
          Found bad data: 280
Removing vertices with low densities...
Finished computing meshes.
Foreground mesh: TriangleMesh with 1738328 points and 3419226 triangles.
Background mesh: TriangleMesh with 2341679 points and 4606298 triangles.

-----Decimating and cleaning meshes-----

Processing decimation target: 200000
Decimating foreground mesh...
Finished decimating foreground mesh.
Decimating background mesh...
Finished decimating background mesh.
Cleaning mesh...
Merging foreground and background meshes.
Projecting mesh on surface points to recover better details...
Segmentation fault
asalan570 commented 1 week ago

Maybe I don't have enough GPU memory?

Anttwo commented 1 week ago

Hello @asalan570,

Indeed, it's possible that 8GB of VRAM is not enough. As this is research code, it's not fully optimized and probably requires a bit more memory than what you have (I think 12GB should be enough). I'm sorry for that.

However, I see that you get an error during the mesh projection part, which is not supposed to take that much memory, so I'm a bit surprised... Let's investigate that!

Could you try rerunning the extract_mesh.py script with the option --project_mesh_on_surface_points False? This will skip the mesh projection and probably remove your error. However, please note that the mesh projection greatly improves the quality of the mesh: it consists of reprojecting the vertices of the Poisson mesh onto the dense surface point cloud sampled from the Gaussians, which greatly reduces the number of artifacts in the Poisson mesh.

If the problem comes from the mesh projection, then I will try to update the code and propose a mesh projection done "chunk by chunk" so that it uses less memory.
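For reference, the "chunk by chunk" projection mentioned above could look roughly like the sketch below, which snaps each mesh vertex to its nearest surface point while only materializing one chunk of the distance matrix at a time. This is an illustrative NumPy sketch, not SuGaR's actual implementation; the function name and chunk size are assumptions.

```python
import numpy as np

def project_vertices_chunked(vertices, surface_points, chunk_size=16384):
    """Snap each vertex to its nearest surface point, one chunk at a time.

    Processing vertices in chunks keeps the (chunk_size x n_points)
    distance matrix small, instead of allocating the full
    (n_vertices x n_points) matrix at once.
    """
    projected = np.empty_like(vertices)
    for start in range(0, len(vertices), chunk_size):
        chunk = vertices[start:start + chunk_size]              # (c, 3)
        # Squared distances from every vertex in the chunk to every surface point.
        d2 = ((chunk[:, None, :] - surface_points[None, :, :]) ** 2).sum(-1)
        nearest = d2.argmin(axis=1)                             # (c,)
        projected[start:start + chunk_size] = surface_points[nearest]
    return projected
```

The same pattern works on the GPU with torch tensors; the key idea is simply bounding peak memory by the chunk size rather than by the total vertex count.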

asalan570 commented 1 week ago

Thanks for your reply!

Just yesterday, when I tried to extract the mesh on another computer (RTX 4070, 12 GB), I also got a Segmentation fault (core dumped).

I saw your reply this morning, tried the --project_mesh_on_surface_points False parameter, and successfully extracted the mesh!

Thank you again!

asalan570 commented 3 days ago

@Anttwo Hello, author!

I always encounter a Segmentation fault when I execute train.py or train_full_pipeline.py.

Following your suggestion, I successfully ran extract_mesh.py with --project_mesh_on_surface_points False. However, it only generates a .ply file; I don't get a .obj file.
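As a side note, a .ply mesh can always be converted to .obj after the fact, e.g. with Open3D (which SuGaR already depends on) via read_triangle_mesh/write_triangle_mesh. For simple ASCII files, a minimal standalone converter is sketched below; it is an assumption-laden sketch that handles only plain vertex/face ASCII PLY (no colors, normals, or binary encoding).

```python
def ply_to_obj(ply_path, obj_path):
    """Convert a minimal ASCII PLY mesh (vertices + faces) to OBJ.

    Only handles plain 'element vertex'/'element face' ASCII files;
    use Open3D for binary PLY or meshes with colors/normals.
    """
    with open(ply_path) as f:
        lines = [ln.strip() for ln in f]
    n_verts = n_faces = body_start = 0
    for i, ln in enumerate(lines):
        if ln.startswith("element vertex"):
            n_verts = int(ln.split()[-1])
        elif ln.startswith("element face"):
            n_faces = int(ln.split()[-1])
        elif ln == "end_header":
            body_start = i + 1
            break
    with open(obj_path, "w") as out:
        for ln in lines[body_start:body_start + n_verts]:
            x, y, z = ln.split()[:3]
            out.write(f"v {x} {y} {z}\n")
        for ln in lines[body_start + n_verts:body_start + n_verts + n_faces]:
            parts = ln.split()
            # PLY face rows start with a vertex count; OBJ indices are 1-based.
            idx = [str(int(p) + 1) for p in parts[1:1 + int(parts[0])]]
            out.write("f " + " ".join(idx) + "\n")
```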

When I try to run python extract_refined_mesh_with_texture.py -s /mnt/d/Project/res-gaussian/data/penhu/ -c /opt/project/SuGaR/output/vanilla_gs/penhu/ -m /opt/project/SuGaR/output/coarse/penhu/sugarcoarse_3Dgs7000_densityestim02_sdfnorm02/15000.pt, I get this error:

Traceback (most recent call last):
  File "/opt/project/SuGaR/extract_refined_mesh_with_texture.py", line 45, in <module>
    extract_mesh_and_texture_from_refined_sugar(args)
  File "/opt/project/SuGaR/sugar_extractors/refined_mesh.py", line 32, in extract_mesh_and_texture_from_refined_sugar
    n_gaussians_per_surface_triangle = int(refined_model_path.split('/')[-2].split('_gaussperface')[-1])
ValueError: invalid literal for int() with base 10: 'sugarcoarse_3Dgs7000_densityestim02_sdfnorm02'
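For context on that ValueError: judging from the traceback, extract_refined_mesh_with_texture.py expects -m to point to a refined checkpoint whose parent directory name ends in a _gaussperface&lt;N&gt; suffix, and passing a coarse checkpoint (directory name sugarcoarse_...) makes that parse fail. A simplified reproduction of the parsing line from the traceback (the refined directory name below is illustrative, not a path from this issue):

```python
def gaussians_per_triangle(refined_model_path):
    """Mimic how refined_mesh.py recovers n_gaussians_per_surface_triangle
    from the checkpoint's parent directory name."""
    dir_name = refined_model_path.split('/')[-2]
    # For a coarse checkpoint, dir_name contains no '_gaussperface<N>' suffix,
    # so split() returns the whole name and int() raises ValueError.
    return int(dir_name.split('_gaussperface')[-1])
```

So the fix is to point -m at a refined model checkpoint (produced by the refinement stage) rather than at the coarse one.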

How do I use SuGaR to correctly generate the .obj model?