kangpeilun / VastGaussian

This is an unofficial implementation
Apache License 2.0

I like your work very much. I have oblique-photography images, and if I train at the original resolution, training does not proceed. My setup is an RTX 4090, and I have also tried loading the data on the CPU. Do you have any thoughts on this, please? #13

Closed chenqi13814529300 closed 3 months ago

kangpeilun commented 3 months ago

First, you need to check whether your dataset follows the format required by 3DGS. Second, are you using the latest master-branch code? Finally, how much VRAM do you have? I have modified the code so that an image is read into memory only when it is actually used for training. Generally speaking, using the original resolution by itself should not be the reason the code cannot run; rather, the number of point clouds generated by GS during training grows dramatically, and 24 GB of VRAM may not be enough to complete all iterations.
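For reference, the COLMAP-style layout that the original 3DGS data loader expects looks roughly like this (an images folder plus a sparse/0 reconstruction; file names are only examples):

&lt;scene&gt;/
├── images/
│   ├── 00001.jpg
│   └── ...
└── sparse/
    └── 0/
        ├── cameras.bin
        ├── images.bin
        └── points3D.bin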

I tested with the rubble scene, and my configuration was 32 GB of RAM and 12 GB of VRAM. When training rubble, each partition is trained for 10_000 iterations with 4x downsampling.

chenqi13814529300 commented 3 months ago

My earlier phrasing was not accurate. I am using the latest code from your master branch; the parallel code works, but without downsampling the iterations are very slow. I agree that the number of point clouds generated by GS during training increases dramatically, and that 24 GB of VRAM may not be sufficient to complete all iterations. I have 24 GB of VRAM, and training a high-resolution Gaussian model within 24 GB is a difficult task. Having read your code, I think you are a proficient Gaussian splatting engineer, and I admire your work.

kangpeilun commented 3 months ago

The master branch does not yet include the feature for training different partitions in parallel on multiple GPUs; the develop branch has a preliminary implementation, which may still have bugs. For various reasons I cannot test the multi-GPU parallel training code at the moment. Slower training is unavoidable, because I changed the code to load only the camera parameters once at the start of training and to read each image from disk and downsample it on the fly. This differs from the original GS, which reads all images into RAM at once; for large scenes, reading all image data at once will overflow RAM.
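As a rough illustration (not the repo's actual code), loading an image from disk and downsampling it only when its camera is sampled for the current iteration looks something like this; the function name and the fixed 4x scale are placeholders:

from PIL import Image
import torchvision.transforms.functional as TF

def load_train_image(image_path, resolution_scale=4):
    # Read the image from disk only at the moment its camera is used for
    # training, instead of caching every full-resolution image in RAM up front.
    img = Image.open(image_path)
    w, h = img.size
    img = img.resize((w // resolution_scale, h // resolution_scale), Image.LANCZOS)
    # [3, H, W] float tensor in [0, 1]; moved to the GPU only for this iteration.
    return TF.to_tensor(img).cuda()

The trade-off is exactly the slowdown described above: RAM usage stays low, but every iteration pays the cost of disk I/O and resizing.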

If you only have 24 GB of VRAM and want to train VastGaussian, you might consider adjusting vastgs' hyperparameters in arguments/parameters.py:

parser.add_argument("--densify_from_iter", type=int, default=500) 
parser.add_argument("--densification_interval", type=float, default=100)

densify_from_iter indicates the iteration at which densification starts, and densification_interval indicates how many iterations elapse between densification steps. As we all know, densification leads to a sharp increase in the number of point clouds, so you can consider starting densification later and increasing the densification interval, for example:

densify_from_iter=10000 densification_interval=500
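To see the effect, here is a small self-contained sketch that counts how many densify-and-prune calls a given setting produces, using the gating condition from the original 3DGS training loop (iteration > densify_from_iter and iteration % densification_interval == 0; the original code also stops densifying after densify_until_iter, which is ignored here for simplicity). Fewer calls means slower point-cloud growth and lower VRAM pressure:

def count_densify_calls(total_iters, densify_from_iter, densification_interval):
    # Each densify-and-prune call clones/splits Gaussians, so the point count
    # (and therefore VRAM use) grows with every densification step.
    return sum(1 for it in range(1, total_iters + 1)
               if it > densify_from_iter and it % densification_interval == 0)

print(count_densify_calls(30_000, 500, 100))      # default values: many densification steps
print(count_densify_calls(30_000, 10_000, 500))   # suggested values: far fewer steps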

This is just a reference; I haven't tested it.

chenqi13814529300 commented 3 months ago

Thank you very much for your kind reply!