Open Warvito opened 1 year ago
In the original code from taming-transformers and latent diffusion models, the weight for the discriminator adversarial loss is defined adaptively as the ratio between the norms of the nll_loss and g_loss gradients (https://github.com/CompVis/taming-transformers/blob/3ba01b241669f5ade541ce990f7650a3b8f65318/taming/modules/losses/vqperceptual.py#L63).
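For reference, the adaptive weight in the linked code is computed roughly as below. This is a simplified sketch, not the exact upstream implementation: the function signature and the `disc_weight` parameter are illustrative, and `last_layer` stands for the weights of the decoder's final layer, with respect to which both losses are differentiated.

```python
import torch

def calculate_adaptive_weight(nll_loss, g_loss, last_layer, disc_weight=1.0):
    # Gradients of the reconstruction (NLL) loss and the generator
    # adversarial loss with respect to the decoder's last layer.
    nll_grads = torch.autograd.grad(nll_loss, last_layer, retain_graph=True)[0]
    g_grads = torch.autograd.grad(g_loss, last_layer, retain_graph=True)[0]
    # Ratio of gradient norms: scales the adversarial term so that its
    # gradient magnitude roughly matches that of the reconstruction term.
    d_weight = torch.norm(nll_grads) / (torch.norm(g_grads) + 1e-4)
    d_weight = torch.clamp(d_weight, 0.0, 1e4).detach()
    return d_weight * disc_weight
```

The resulting `d_weight` then multiplies the adversarial loss in the total generator objective, so neither term dominates the gradients flowing into the decoder.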
In our experiments and in the paper on 3D brain images (https://arxiv.org/pdf/2209.07162.pdf), we were not able to make it work well on either 2D or 3D data. However, it might be useful for others.
We have the code for it in KCL's VQ-VAE codebase.