lucidrains / imagen-pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
MIT License

Loss flattens out after 50k training steps #324

Open alif-munim opened 1 year ago

alif-munim commented 1 year ago

I'm currently training Imagen on two similar medical video datasets (one with over 10,000 videos and the other with around 500) to generate 32x32 videos. I've noticed that both models start out with a very high average loss of 1000 or more, which then begins to flatten out around 50.

Below are two of my ongoing experiments:
https://wandb.ai/alif-munim/imagen-echonet
https://wandb.ai/alif-munim/imagen-uhn

I previously ran similar experiments for over 100k steps and saw very similar results. I would love to hear from anyone who has successfully trained text-to-video. Is this the expected behavior? How many steps does it typically take until Imagen can generate reasonable videos?

alif-munim commented 1 year ago

Just updating the thread with my progress so far. My longest run has now gone a little over 150k training steps, but both the average loss and the validation loss continue to hover around 20.

Link to WandB: https://wandb.ai/alif-munim/imagen-echonet
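To put a number on "hovering around 20", one option is to compare the mean loss over the most recent window of steps with the mean over the window before it. The sketch below assumes the per-step losses have already been exported from WandB into a plain Python list; the function names, the window size, and the 1% tolerance are all hypothetical choices for illustration, not part of imagen-pytorch or WandB. It only quantifies the flattening and says nothing about sample quality.

```python
from statistics import mean


def relative_improvement(losses, window):
    """Fractional drop in mean loss from the previous window to the latest one.

    `losses` is a list of per-step loss values (assumed exported from the
    experiment tracker); `window` is the number of steps per window.
    """
    if window <= 0 or len(losses) < 2 * window:
        raise ValueError("need window > 0 and at least 2 * window loss values")
    previous = mean(losses[-2 * window:-window])
    latest = mean(losses[-window:])
    if previous == 0:
        return 0.0  # nothing left to improve on; avoids division by zero
    return (previous - latest) / abs(previous)


def has_plateaued(losses, window=1000, tolerance=0.01):
    """True if the latest window improved by less than `tolerance` (default 1%).

    Both defaults are illustrative assumptions and should be tuned per run.
    """
    return relative_improvement(losses, window) < tolerance
```

For example, calling `has_plateaued(losses, window=5000)` on the exported training losses would report whether the last 5,000 steps improved the mean loss by at least 1% over the 5,000 steps before them.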
