sherwinbahmani / cc3d

CC3D: Layout-Conditioned Generation of Compositional 3D Scenes
https://sherwinbahmani.github.io/cc3d

Problems on training device requirement and training convergence #10

Closed Mu-Yanchen closed 5 months ago

Mu-Yanchen commented 5 months ago

Hello, thanks for your great work and this open source repository!

Your paper mentions that "Our model takes around 2 days to converge using 4 NVIDIA V100 with 32 GB memory". Do both datasets, 3d_front and kitti, converge under these conditions?

In addition, how do you judge training convergence? Is it when the done flag in the code becomes True, or do you use some other convergence criterion?

Looking forward to your reply!

sherwinbahmani commented 5 months ago

Hi,

kitti converged a bit faster than 3dfront, but 2 days was a good trade-off in training time for both. I mainly judged convergence based on the visual quality of samples generated during training. You could also use FID if needed and you are optimizing to beat the state of the art.

Mu-Yanchen commented 5 months ago

Thanks for the instructive advice! BTW, I noticed that done = (cur_nimg >= total_kimg * 1000), and total_kimg is set to 25000 by default, which seems quite large to me. In your experiments, which cur_tick/cur_nimg checkpoint did you use for the reported results (i.e., which network-snapshot-xxxxxx.pkl)? If you remember the specific numbers, I think I can reproduce the paper's results more accurately!🫡

sherwinbahmani commented 5 months ago

Yeah, we don't train for the full 25000 kimg as other 3D GANs do, but rather for 3000-5000. After 5000 the quality converges and the gains get pretty small.
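For readers following along, the stopping rule discussed in this thread can be sketched as below. The `done` expression mirrors the one quoted from the training loop; the default of `total_kimg = 25000` comes from this thread, while lowering it to 5000 (the point where the authors say gains get small) is an assumption about how one would apply their advice, not a value taken from the repo's configs.

```python
# Sketch of the StyleGAN-style stopping condition discussed above.
# `cur_nimg` counts real images shown to the discriminator; `total_kimg`
# is the training budget in thousands of images.

def training_done(cur_nimg: int, total_kimg: int = 5000) -> bool:
    """Mirror of `done = (cur_nimg >= total_kimg * 1000)` from the training loop.

    total_kimg defaults to 5000 here as an assumption based on this thread;
    the repo's own default is 25000.
    """
    return cur_nimg >= total_kimg * 1000

if __name__ == "__main__":
    # Illustrative progress check at a few milestones (kimg values).
    for cur_kimg in (1000, 3000, 5000, 6000):
        print(cur_kimg, training_done(cur_kimg * 1000))
```

With this reduced budget, training stops once 5 million images have been shown, rather than the 25 million implied by the default.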

Mu-Yanchen commented 5 months ago

These experiences are very valuable to me! Thank you again for your prompt reply!