nv-tlabs / LION

Latent Point Diffusion Models for 3D Shape Generation

cfg.yml for train diffusion prior #21

Closed yjcaimeow closed 1 year ago

yjcaimeow commented 1 year ago

Hi @ZENGXH ,

I successfully completed the first-stage HVAE training, but there is no cfg file for training the second-stage diffusion prior. 1) Could you please provide the diffusion prior cfg file? 2) Is it okay to use the first-stage VAE cfg.yml for the second-stage diffusion prior training? I tried this, but the beta shape seems to be wrong. [image]

Best regards, Yingjie

ZENGXH commented 1 year ago

Hi Yingjie, are you using this file to launch prior training? The cfg.yml for the prior is different from the first-stage one, but we load the first-stage cfg.yml through --config (as in train_prior.sh).

Alternatively, I have now uploaded the prior config under the config folder. Note that sde.vae_checkpoint needs to be changed to your stage-1 VAE path.
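As an illustration of the sde.vae_checkpoint change mentioned above, here is a minimal Python sketch. It assumes the prior config has already been loaded into a nested dict; the helper name `set_dotted`, the placeholder dict, and the checkpoint path are all hypothetical and not part of the LION code base.

```python
import copy


def set_dotted(cfg, dotted_key, value):
    """Return a copy of the nested dict `cfg` with `dotted_key` set to `value`."""
    out = copy.deepcopy(cfg)  # leave the caller's config untouched
    node = out
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Walk down the nesting, creating intermediate dicts if needed.
        node = node.setdefault(name, {})
    node[leaf] = value
    return out


# Hypothetical stand-in for the uploaded prior config after loading it into a dict.
prior_cfg = {"sde": {"vae_checkpoint": "PLACEHOLDER"}}

# Hypothetical path; replace it with your own stage-1 VAE checkpoint.
updated = set_dotted(prior_cfg, "sde.vae_checkpoint", "/path/to/stage1/vae.pt")
print(updated["sde"]["vae_checkpoint"])
```

Editing the value by hand in the YAML file works just as well; the sketch only shows which entry has to change.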

yjcaimeow commented 1 year ago

Hi ZENGXH,

Thank you for your quick and patient reply.

Yes, I am training the second stage using "script/train_prior.sh", and I set --config to the cfg.yml from the first-stage VAE folder. Do I understand correctly that this experiment is okay? If so, I will keep training. By the way, would you be able to share a loss curve for the second stage of training? The loss on my side stays at around 1.9.

Thanks for the uploaded prior config; I will use it in the following experiments.

Best regards, Yingjie

ZENGXH commented 1 year ago

Re 'experiment is okay': How did you fix the beta shape error? Was it caused by the wrong config?

Re loss curve: this is my curve for stage-2 training (on car): [image]. Also the visualization of the reconstruction: [image], and the visualization of samples at different steps (top is output points, bottom is latent points): [image] [image]

yjcaimeow commented 1 year ago

Hi ZENGXH,

Thank you for your quick and patient reply. The beta from the first-stage cfg is wrong, so I retrained the second stage with the cfg you uploaded recently. The curve and final results are right.

Thanks again.

Best, Yingjie

JohanYe commented 1 year ago

@ZENGXH I would like to second that the cfg generated by the train_vae.sh script does not work. I am getting the same beta issue as described at the top by @yjcaimeow.
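For anyone hitting the same beta issue, one way to see where a first-stage cfg and the uploaded prior cfg disagree is to diff them entry by entry. The following is a minimal sketch, assuming both configs are already loaded into nested dicts; the helper names and the example keys and values are made up for illustration and are not taken from LION.

```python
def flatten(cfg, prefix=""):
    """Flatten a nested dict into {'a.b.c': value} form."""
    flat = {}
    for key, value in cfg.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def diff_configs(a, b):
    """Return {dotted_key: (value_in_a, value_in_b)} for every differing entry."""
    missing = object()  # sentinel for keys present in only one config
    fa, fb = flatten(a), flatten(b)
    diffs = {}
    for key in sorted(set(fa) | set(fb)):
        va, vb = fa.get(key, missing), fb.get(key, missing)
        if va != vb:
            diffs[key] = (
                "<missing>" if va is missing else va,
                "<missing>" if vb is missing else vb,
            )
    return diffs


# Made-up keys and values, for illustration only.
stage1_cfg = {"sde": {"beta_start": 0.1, "beta_end": 20.0}, "seed": 1}
prior_cfg = {"sde": {"beta_start": 1e-5, "beta_end": 20.0, "vae_checkpoint": "x"}, "seed": 1}

for key, (old, new) in diff_configs(stage1_cfg, prior_cfg).items():
    print(f"{key}: {old!r} -> {new!r}")
```

Entries that differ only because of experiment-specific paths can be ignored; the ones worth checking are those that affect the diffusion schedule.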