About the training epoch of VAE model and uni-model for text to face

ziqihuangg / Collaborative-Diffusion

[CVPR 2023] Collaborative Diffusion

https://ziqihuangg.github.io/projects/collaborative-diffusion.html


About the training epoch of VAE model and uni-model for text to face #23

Closed ourpubliccodes closed 4 months ago

ourpubliccodes commented 1 year ago

Hello! Following the instructions you provided, I am trying to retrain the VAE model and the uni-modal text-to-face model on an RTX 3090. How many epochs did you train each of these two models for? Or do you decide when to stop training based on the visualization results in reconstructions_gs-xxxxxx_e-xxxxxx_b-xxxxxxx.png and samples_gs-xxxxxx_e-xxxxxx_b-xxxxxxx.png? Looking forward to your answer.
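For anyone who wants to line these logged images up with training progress, here is a minimal sketch that reads the counters out of the file names. It assumes the names follow the pattern quoted above, which is read here as `<kind>_gs-<global step>_e-<epoch>_b-<batch index>.png`. That reading of `gs`, `e` and `b` is an assumption, and `parse_log_name` and `latest_per_kind` are hypothetical helpers, not part of the repository.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional

# Assumed pattern: <kind>_gs-<global step>_e-<epoch>_b-<batch index>.png
LOG_NAME = re.compile(
    r"^(?P<kind>\w+?)_gs-(?P<gs>\d+)_e-(?P<epoch>\d+)_b-(?P<batch>\d+)\.png$"
)


class LogImage(NamedTuple):
    kind: str          # e.g. "reconstructions" or "samples"
    global_step: int
    epoch: int
    batch: int


def parse_log_name(name: str) -> Optional[LogImage]:
    """Return the counters encoded in a log image name, or None if it does not match."""
    m = LOG_NAME.match(name)
    if m is None:
        return None
    return LogImage(m["kind"], int(m["gs"]), int(m["epoch"]), int(m["batch"]))


def latest_per_kind(names: Iterable[str]) -> Dict[str, LogImage]:
    """Keep, for each image kind, the entry with the highest global step."""
    latest: Dict[str, LogImage] = {}
    for name in names:
        info = parse_log_name(name)
        if info is None:
            continue  # skip files that do not follow the assumed pattern
        current = latest.get(info.kind)
        if current is None or info.global_step > current.global_step:
            latest[info.kind] = info
    return latest
```

Calling `latest_per_kind(os.listdir(image_dir))` on the image log directory (wherever your run writes it) would then tell you which epoch the newest reconstruction and sample images belong to.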

ziqihuangg commented 11 months ago

Hi, for the VAE, training for 50-150 epochs usually gives satisfactory checkpoints. You can monitor the reconstruction results and the reconstruction loss to decide when to stop. Uni-modal diffusion models usually take 100-200 epochs.
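As a rough illustration of the advice above (not code from this repository), the check below combines the suggested epoch range with a simple plateau test on the reconstruction loss. The 50 and 150 defaults come from the VAE range quoted above. The `patience` and `min_delta` values, and the function `should_stop` itself, are assumptions chosen for illustration only.

```python
from typing import Sequence


def should_stop(
    epoch_losses: Sequence[float],
    min_epochs: int = 50,     # lower end of the suggested VAE range
    max_epochs: int = 150,    # upper end of the suggested VAE range
    patience: int = 10,       # assumption: epochs to wait for an improvement
    min_delta: float = 1e-4,  # assumption: smallest change counted as improvement
) -> bool:
    """Decide whether to stop, given the mean reconstruction loss of each finished epoch."""
    n = len(epoch_losses)
    if n < min_epochs:
        return False  # too early to judge
    if n >= max_epochs:
        return True   # past the suggested range
    if n <= patience:
        return False  # not enough history to compare against
    best_before = min(epoch_losses[:-patience])
    best_recent = min(epoch_losses[-patience:])
    # Stop when the last `patience` epochs did not beat the earlier best by min_delta.
    return best_recent > best_before - min_delta
```

For a uni-modal diffusion model, the same check could be called with `min_epochs=100, max_epochs=200`. A numeric plateau is only a hint: the logged reconstruction and sample images should still be inspected before picking a checkpoint.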

jupytera commented 4 months ago

Hello, may I ask whether a mask file is also required for training a text-to-face single diffusion model? I trained the text-to-face single diffusion model on a new dataset without providing a mask file, and found that the training output only improved the image within the default square area.

ziqihuangg commented 4 months ago

Hi, if you are referring to the text-to-image model, then no mask is needed.