I'm training on CC3M. Is there anything wrong with my training? The loss seems to go down far too quickly, and despite the low training loss, the samples don't suggest the model is learning. I sample during training by calling decoder.sample() with the CLIP image embeddings of the training images in the current minibatch. Since I'm training a decoder with two Unets and only training the first Unet for now, I break out after sampling from the first Unet.
These are the samples at 0k, 5k, 13k, 16k, and 17k training steps.