joh-fischer / PlantLDM

A latent diffusion model for visual synthesis of plant images.
MIT License

VQGAN reconstruction quality #47

Open RoboticsZhang opened 10 months ago

RoboticsZhang commented 10 months ago

Hello, and thank you for your open-source work; I have learned a lot from it. When I train VQGAN from scratch to reconstruct the plant300K data, the reconstructed images are quite blurry, as shown below:

[image: 2023-12-19_00-38]

I did not change the default hyperparameters (the default model is VQGANLight, which I also left unchanged), and the loss curves look a bit odd, as shown below:

[image]

With this pretrained VQGAN, I failed to train a good DDPM model. Were you able to reach a better VQGAN result, and if so, what were the corresponding hyperparameters?

RoboticsZhang commented 10 months ago

These are the DDPM training results with the above pretrained VQGAN:

[image]

I also trained it with the default hyperparameters.
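
To put a number on the blurry VQGAN reconstructions described above, one option is to track PSNR between input images and their reconstructions on a held-out batch. The sketch below is only an illustration: the `psnr` helper is a hypothetical name, not part of PlantLDM, and it assumes pixel values scaled to [0, 1] and passed in as flat sequences (for example via `.flatten().tolist()` on a tensor).

```python
import math
from typing import Sequence


def psnr(original: Sequence[float],
         reconstruction: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Hypothetical helper for illustration; assumes both inputs hold the
    same number of pixel values in the range [0, max_value].
    """
    if len(original) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixel values")
    if len(original) == 0:
        raise ValueError("inputs must not be empty")

    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0.0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the reconstruction is closer to the input. PSNR only partly reflects perceived sharpness, so it is best read alongside the sample images rather than in place of them.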