Closed: TXSevenXT closed this issue 2 months ago
Hi,
It seems like it could be a memory issue when exporting the ply file: https://github.com/graphdeco-inria/gaussian-splatting/issues/235. A quick fix they suggest is saving the ply only after 30k iterations, not after 7k. To do that, add the --GS_save_test_iterations 30000 argument and see if it helps.
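For example, starting from the command in your first message, that would be:
python run_single.py --colmap_name cascade --video_extension mp4 --GS_save_test_iterations 30000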
Thank you very much for your help ^^
After enabling the system memory fallback in the NVIDIA settings, no more trouble with the previous error, thank you :)
However, something goes wrong when extracting the meshes:

[ITER 30000] Evaluating train: L1 0.00707147466018796 PSNR 39.55408630371094 [04/09 17:35:16]
[ITER 30000] Saving Gaussians [04/09 17:35:16]
Training complete. [04/09 17:35:23]
num views: 3
baseline: 0.27544912783794784
Loading trained model at iteration 30000
Reading camera 3/3
Loading Training Cameras
Loading Test Cameras
0%| | 0/3 [00:00<?, ?it/s]
UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905971873/work/aten/src/ATen/native/TensorShape.cpp:3587.)
UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
100%|█████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:16<00:00, 5.43s/it]
Automask must be enabled for masking in script mode. Skipping.
100%|█████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 4.72it/s]
[Open3D WARNING] Write PLY failed: mesh has 0 vertices.
SAVED MESH
[Open3D DEBUG] [ClusterConnectedTriangles] Compute triangle adjacency
[Open3D DEBUG] [ClusterConnectedTriangles] Done computing triangle adjacency
[Open3D DEBUG] [ClusterConnectedTriangles] Done clustering, #clusters=0
[Open3D WARNING] Write PLY failed: mesh has 0 vertices.
SAVED CLEANED MESH
With or without the argument you suggested earlier, I end up at the same point. It's not a 360° video, more like 120-140°.
The problem you have now is that the depth is being cropped too soon, so increasing the horizontal baseline and the truncation limit should solve it. It's better to first increase the baseline and then the truncation limit, though.
To increase the baseline, add the --no-renderer_scene_360 argument, which should calculate a better horizontal baseline given that your scene is not a 360 scene (the resulting baseline should be larger than the current baseline, which is 0.275). If that still doesn't work, increase the truncation limit with the argument --TSDF_max_depth_baselines, setting it to something larger than the default of 20.
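To make the numbers concrete, here is a back-of-the-envelope sketch. It assumes the flag means what its name suggests, i.e. that depth is truncated at TSDF_max_depth_baselines times the horizontal baseline; that reading is inferred from the flag name, not confirmed here.

```python
# Back-of-the-envelope sketch: assumes --TSDF_max_depth_baselines truncates
# depth at (value * horizontal baseline). Inferred from the flag name.
baseline = 0.275          # baseline reported in the log above
max_depth_baselines = 20  # default value mentioned above
print(baseline * max_depth_baselines)  # 5.5 -> depth beyond ~5.5 scene units is cropped
```

A larger baseline (via --no-renderer_scene_360) or a larger --TSDF_max_depth_baselines both push that cutoff further out.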
Also, note that you only have 3 views in your current model, whereas in the first message that you sent you had 86 views. 3 views for COLMAP and 3DGS is usually insufficient.
Hi again,
This seems to be related to the '--GS_save_test_iterations 30000' argument. Without it, it finds 86 views.
I'll try with the last arguments and keep you posted.
Thank you very much for your help in any case 😊
Hello !
I have a problem: the program always ends up "Killed"... It seems to be a memory issue, but I don't understand why: it uses a lot of RAM but little or no VRAM, even with only 29 frames at 720p... I'm a bit stumped.
I have no problem with 3DGS and 200 images in 1600p :(...
Here is the log for the following command: python run_single.py --colmap_name cascade --skip_video_extraction --no-renderer_scene_360 --TSDF_max_depth_baselines 30 --GS_save_test_iterations 30000

Training progress: 100%|██████████████████████████████████████████| 30000/30000 [13:54<00:00, 35.95it/s, Loss=0.0260883]
[ITER 30000] Evaluating train: L1 0.018139631859958174 PSNR 29.62316818237305 [06/09 13:13:30]
[ITER 30000] Saving Gaussians [06/09 13:13:30]
Training complete. [06/09 13:13:47]
num views: 29
baseline: 0.6796290173109062
Loading trained model at iteration 30000
Reading camera 29/29
Loading Training Cameras
Loading Test Cameras
0%| | 0/29 [00:00<?, ?it/s]
UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905971873/work/aten/src/ATen/native/TensorShape.cpp:3587.)
UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
100%|███████████████████████████████████████████████████████████████████████████████████| 29/29 [00:21<00:00, 1.32it/s]
Automask must be enabled for masking in script mode. Skipping.
10%|████████▋ | 3/29 [01:52<18:30, 42.71s/it]
Killed
See https://github.com/yanivw12/gs2mesh/issues/1
Your code gets killed in the TSDF step, and there seems to be a bug with the Open3D TSDF. If you're using CUDA 11.8 and Python 3.8, the problem could be the Ubuntu version you're using (I tested on 20.04, and you're using 22.04). More details are in the "Common Issues and Tips" section. From what I've tested, the TSDF runs without issue on Ubuntu 20.04 with Python 3.7 and 3.8.
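For context, the TSDF step is Open3D's TSDF fusion. Below is a minimal standalone sketch of that API, not gs2mesh's actual code (the frame list, intrinsics, and parameter values are placeholders), which can help check whether Open3D itself misbehaves on your setup:

```python
import open3d as o3d

# Minimal Open3D TSDF fusion sketch (placeholders throughout, NOT gs2mesh's code).
volume = o3d.pipelines.integration.ScalableTSDFVolume(
    voxel_length=0.01,  # voxel edge length; a coarser grid needs less RAM
    sdf_trunc=0.04,     # truncation distance of the signed distance field
    color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)

# Placeholder intrinsics; a real run would use the COLMAP camera parameters.
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)

frames = []  # placeholder: fill with (color, depth, extrinsic) per rendered view
for color, depth, extrinsic in frames:
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        color, depth, depth_trunc=4.0, convert_rgb_to_intensity=False)
    volume.integrate(rgbd, intrinsic, extrinsic)  # fuse one depth map into the grid

mesh = volume.extract_triangle_mesh()  # with no valid depth this mesh is empty,
o3d.io.write_triangle_mesh("mesh.ply", mesh)  # which triggers the 0-vertices warning above
```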
Thank you for your help, I'll test with Ubuntu 20.04 :)
One last question: is 32GB of RAM enough to run your code? I have a 4090 alongside it, but I think the RAM may be a bit borderline for gs2mesh?
Thank you again for your time and advice 😊
The GS and Stereo models should run fine on a 4090 (with 24GB of VRAM). The only step that can take up RAM is the TSDF, since it's not GPU-accelerated(*). I haven't really tested how much RAM it takes, but I assume 32GB is enough.
(*) As a side note, I also tested a GPU-accelerated TSDF (https://github.com/andyzeng/tsdf-fusion-python), but from my experience it's much heavier on GPU memory, and I often encountered "CUDA out of memory" (on an L40 with 48GB), especially for larger scenes or higher resolutions. The results also looked visually better with the Open3D version.
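If you want to see how close the TSDF step gets to your 32GB, a small stdlib helper (not part of gs2mesh) can print the process's peak resident memory on Linux:

```python
# Print this process's peak resident set size so far (Linux reports it in KB).
import resource

peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"peak RSS: {peak_kb / 1024:.0f} MB")
```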
Hello Yaniv,
Thanks again for your help.
I have installed Ubuntu 20.04 with everything it needs, but I'm stuck because diff-gaussian-rasterization needs glibc 2.32 and Ubuntu 20.04 stops at 2.31. I've tried everything to move to version 2.32, but I'm at a dead end.
How did you circumvent the problem please?
(gs2mesh) xxxxxxx@Jeremie:~/gs2mesh$ python run_single.py --colmap_name cascade --skip_video_extraction --no-renderer_scene_360 --TSDF_max_depth_baselines 30
Traceback (most recent call last):
File "run_single.py", line 14, in <module>
I checked on the server I'm using, and I'm also running glibc 2.31... Maybe clean the previous installation (the entire diff_gaussian_rasterization folder) and re-install?
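As a quick sanity check, you can print the glibc version your environment actually links against (a small stdlib helper, not part of gs2mesh):

```python
# Print the glibc version the current Python process links against.
import os
print(os.confstr("CS_GNU_LIBC_VERSION"))  # e.g. "glibc 2.31"
```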
Same problem with WSL Ubuntu 20.04 :
Optimizing /home/xsevenx/gs2mesh/splatting_output/custom_nw_iterations30000/cascade
Output folder: /home/xsevenx/gs2mesh/splatting_output/custom_nw_iterations30000/cascade [11/09 14:16:53]
Reading camera 29/29 [11/09 14:16:53]
Converting point3d.bin to .ply, will happen only the first time you open the scene. [11/09 14:16:53]
Loading Training Cameras [11/09 14:16:53]
Loading Test Cameras [11/09 14:16:54]
Number of points at initialisation : 17352 [11/09 14:16:54]
Training progress: 100%|██████████████████████████████████████████| 30000/30000 [14:48<00:00, 33.75it/s, Loss=0.0265618]
[ITER 30000] Evaluating train: L1 0.018692961521446706 PSNR 29.656937789916995 [11/09 14:31:48]
[ITER 30000] Saving Gaussians [11/09 14:31:49]
Training complete. [11/09 14:32:06]
num views: 29
baseline: 0.6802225502353557
Loading trained model at iteration 30000
Reading camera 29/29
Loading Training Cameras
Loading Test Cameras
0%| | 0/29 [00:00<?, ?it/s]
UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1716905971873/work/aten/src/ATen/native/TensorShape.cpp:3587.)
UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
100%|███████████████████████████████████████████████████████████████████████████████████| 29/29 [00:25<00:00, 1.13it/s]
Automask must be enabled for masking in script mode. Skipping.
14%|███████████▌ | 4/29 [01:57<14:14, 34.17s/it]
It's using something like 140GB of swap... Same 29 files at 720p (average image size about 500KB), with this command: python run_single.py --colmap_name cascade --skip_video_extraction --no-renderer_scene_360 --TSDF_max_depth_baselines 30 --GS_save_test_iterations 30000
Try using a lower resolution for the grid (a higher TSDF_voxel). 34 seconds per TSDF iteration is way too much (it should take a couple of seconds at most). I suggest visualizing the depth maps that the TSDF is using with the custom_data.ipynb notebook; it might help you understand why.
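For intuition on why this helps (a rough scaling sketch, assuming --TSDF_voxel scales the voxel edge length, which is inferred from the flag name rather than confirmed here):

```python
# Rough scaling argument: voxel count grows as (1 / voxel_size)**3, and RAM
# and per-view integration time grow roughly with the voxel count.
for v in (2, 4):
    print(v, (1 / v) ** 3)
# Going from --TSDF_voxel 2 to 4 cuts the voxel count by a factor of ~8.
```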
Really nice, I tried with --TSDF_voxel 4 instead of 2 and the result is great :) The cleaning needs better settings, but the ply is clean.
Thank you very much for your help :)
I'll try with Ubuntu 24.04 to see if everything is OK =)
Looks great! It's nice seeing people using the method on their own data. That was one of the things I was anticipating the most when releasing the code.
I'm closing the issue for now, feel free to re-open if necessary.
Hello !
First of all, thank you for your work and for keeping up with the updates :)
I've just run a test on a 30sec mp4 video with the following code: python run_single.py --colmap_name cascade --video_extension mp4
The training finishes fine, but a problem occurs when exporting the 30k ply (the 7k one comes out fine and is usable):

Optimizing /home/xxxxxxx/gs2mesh/splatting_output/custom_nw_iterations30000/cascade
Output folder: /home/xxxxxxx/gs2mesh/splatting_output/custom_nw_iterations30000/cascade [04/09 08:44:34]
Reading camera 86/86 [04/09 08:44:35]
Converting point3d.bin to .ply, will happen only the first time you open the scene. [04/09 08:44:35]
Loading Training Cameras [04/09 08:44:35]
[ INFO ] Encountered quite large input images (>1.6K pixels width), rescaling to 1.6K. If this is not desired, please explicitly specify '--resolution/-r' as 1 [04/09 08:44:35]
Loading Test Cameras [04/09 08:45:11]
Number of points at initialisation : 43182 [04/09 08:45:11]
Training progress: 23%|█████████▊ | 7000/30000 [09:00<35:06, 10.92it/s, Loss=0.1461526]
[ITER 7000] Evaluating train: L1 0.07914816588163376 PSNR 18.642070770263672 [04/09 08:54:13]
[ITER 7000] Saving Gaussians [04/09 08:54:14]
Training progress: 100%|█████████████████████████████████████████| 30000/30000 [54:02<00:00, 9.25it/s, Loss=0.1150575]
[ITER 30000] Evaluating train: L1 0.06295853853225708 PSNR 20.098556518554688 [04/09 09:39:23]
[ITER 30000] Saving Gaussians [04/09 09:39:24]
Killed
num views: 86
baseline: 0.2705555039768162
RPly: Unable to open file
[Open3D WARNING] Read PLY failed: unable to open file: /home/xxxxxxx/gs2mesh/splatting_output/custom_nw_iterations30000/cascade/point_cloud/iteration_30000/point_cloud.ply
Loading trained model at iteration 30000
Reading camera 86/86
Loading Training Cameras
Loading Test Cameras
Traceback (most recent call last):
File "run_single.py", line 190, in <module>
run_single(args)
File "run_single.py", line 90, in run_single
renderer.prepare_renderer()
File "/home/xxxxxxx/gs2mesh/gs2mesh_utils/renderer_utils.py", line 356, in prepare_renderer
scene = Scene(dataset, self.gaussians, load_iteration=self.splatting_iteration, shuffle=False)
File "/home/xxxxxxx/gs2mesh/third_party/gaussian-splatting/scene/__init__.py", line 78, in __init__
self.gaussians.load_ply(os.path.join(self.model_path,
File "/home/xxxxxxx/gs2mesh/third_party/gaussian-splatting/scene/gaussian_model.py", line 216, in load_ply
plydata = PlyData.read(path)
File "/home/xxxxxxx/anaconda3/envs/gs2mesh/lib/python3.8/site-packages/plyfile.py", line 158, in read
(must_close, stream) = _open_stream(stream, 'read')
File "/home/xxxxxxx/anaconda3/envs/gs2mesh/lib/python3.8/site-packages/plyfile.py", line 1345, in _open_stream
return (True, open(stream, read_or_write[0] + 'b'))
FileNotFoundError: [Errno 2] No such file or directory: '/home/xxxxxxx/gs2mesh/splatting_output/custom_nw_iterations30000/cascade/point_cloud/iteration_30000/point_cloud.ply'
Have you already encountered this problem? If so, do you have any advice on how to solve it? For information, I'm running Windows 11, CUDA 11.8, WSL2/Ubuntu 22.04
Thanks for your help :)
Have a nice day ^_^