The TAESD encoder is just trained with MSE (against the SD encoder results), and the TAESD decoder is trained as a conditional GAN (conditioned on the SD encoder results).
Regarding how to train conditional GANs, you could look at the GAN training code used in SD https://github.com/CompVis/latent-diffusion#training-autoencoder-models
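Here's a minimal sketch of what that setup might look like in PyTorch. The modules are stub `Conv2d` layers standing in for the real TAESD/SD networks, and the loss weights are guesses; only the overall structure (MSE for the encoder, a conditional hinge-GAN plus a small MSE term for the decoder) reflects the description above.

```python
import torch
import torch.nn.functional as F

# Stub modules standing in for the real networks (hypothetical shapes; the
# real SD encoder downsamples 8x and outputs 4-channel latents).
taesd_encoder = torch.nn.Conv2d(3, 4, 1)
taesd_decoder = torch.nn.Conv2d(4, 3, 1)
sd_encoder    = torch.nn.Conv2d(3, 4, 1).eval()   # frozen reference encoder
discriminator = torch.nn.Conv2d(4 + 3, 1, 1)      # scores (latent, image) pairs

images = torch.randn(2, 3, 64, 64)                # dummy batch

with torch.no_grad():
    ref_latents = sd_encoder(images)              # targets from the SD encoder

# Encoder: plain MSE against the SD encoder's latents.
enc_loss = F.mse_loss(taesd_encoder(images), ref_latents)

# Decoder: conditional GAN. The discriminator sees the conditioning latent
# (resized to image resolution) concatenated with either the real image or
# the TAESD reconstruction; a small MSE term is added on top.
recon = taesd_decoder(ref_latents)
cond = F.interpolate(ref_latents, size=images.shape[-2:])
fake_logits = discriminator(torch.cat([cond, recon], dim=1))
dec_loss = -fake_logits.mean() + 0.1 * F.mse_loss(recon, images)  # 0.1 is a guess

# Discriminator: hinge loss, as in the taming-transformers code linked above.
real_logits = discriminator(torch.cat([cond, images], dim=1))
fake_logits_d = discriminator(torch.cat([cond, recon.detach()], dim=1))
d_loss = 0.5 * (F.relu(1.0 - real_logits).mean() + F.relu(1.0 + fake_logits_d).mean())
```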
As far as I know, the taming-transformers training code uses LPIPS loss + MSE loss + discriminator loss. Do you also use all three?
And thx for this information!
I only used the discriminator and a tiny bit of MSE (didn't try LPIPS)
Ok! Thx!
As part of TAESD 1.1, I added LPIPS (though still at a lower weight than the adversarial loss). So it's overall not very different from the original VQGAN approach.
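A rough sketch of how such a combined decoder loss might look, using the `lpips` package; the weights below are placeholders, not the actual TAESD 1.1 values, and the only stated constraint is that LPIPS is weighted below the adversarial term.

```python
import torch.nn.functional as F
import lpips  # pip install lpips

lpips_fn = lpips.LPIPS(net='vgg')  # expects images roughly in [-1, 1]

def decoder_loss(recon, target, fake_logits,
                 w_mse=0.1, w_lpips=0.5, w_adv=1.0):
    """Combine MSE, LPIPS, and adversarial terms. Weights are illustrative
    guesses -- LPIPS sits below the adversarial weight, as described above."""
    l_mse = F.mse_loss(recon, target)
    l_lpips = lpips_fn(recon, target).mean()
    l_adv = -fake_logits.mean()  # generator side of a hinge/WGAN-style loss
    return w_adv * l_adv + w_lpips * l_lpips + w_mse * l_mse
```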
Is it possible to open-source the training code or the loss equation? I'm considering training TAESD at different scales (like 1M, 2.5M (this one), and 10M parameters) to check whether we can get a series of models that fit different use cases.