weijiawu / ParaDiffusion

Official code for 'Paragraph-to-Image Generation with Information-Enriched Diffusion Model'

'nan' loss #5

Open tanshuai0219 opened 6 months ago

tanshuai0219 commented 6 months ago

When I train stage 1, where Llama 2 is frozen and the linear layer and UNet are trainable, the loss becomes "nan" after a few hundred steps. Could you give me some advice?

weijiawu commented 6 months ago

Sorry, we haven't encountered this issue during our training. You could try swapping in CLIP or another text encoder for a few training iterations to rule out the text encoder as the cause.
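
A complementary check is to find which stage of the forward pass first produces NaN or inf values. The following is only a minimal sketch and is not part of the ParaDiffusion code. The helper names (`flatten`, `finite_report`, `first_bad_stage`) and the stage labels are hypothetical, and it assumes the intermediate values are nested lists or tensor-like objects that expose `.detach().float().cpu().tolist()`, as PyTorch tensors do.

```python
import math


def flatten(values):
    """Yield floats from nested lists/tuples or a tensor-like object."""
    # Assumption: tensor-like inputs (e.g. torch.Tensor) expose these methods.
    if hasattr(values, "detach"):
        values = values.detach().float().cpu().tolist()
    if isinstance(values, (list, tuple)):
        for v in values:
            yield from flatten(v)
    else:
        yield float(values)


def finite_report(name, values):
    """Count non-finite entries and report the largest finite magnitude."""
    flat = list(flatten(values))
    bad = sum(1 for v in flat if not math.isfinite(v))
    finite = [abs(v) for v in flat if math.isfinite(v)]
    return {"name": name, "non_finite": bad, "max_abs": max(finite, default=0.0)}


def first_bad_stage(stages):
    """Return the name of the first stage containing NaN/inf, else None."""
    for name, values in stages:
        if finite_report(name, values)["non_finite"] > 0:
            return name
    return None


if __name__ == "__main__":
    nan = float("nan")
    # Hypothetical stage labels; replace with the real intermediate tensors.
    stages = [
        ("text_embeddings", [[0.1, 0.2], [0.3, 0.4]]),
        ("linear_projection", [[1.0, nan], [0.5, 0.6]]),
        ("loss", nan),
    ]
    print(first_bad_stage(stages))
```

Calling `first_bad_stage` once per training step with the text-encoder output, the linear-layer output, the UNet prediction, and the loss would show whether the non-finite values already appear in the frozen encoder's embeddings or only later. A steadily growing `max_abs` before the failure would point to overflow rather than bad input data.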

tanshuai0219 commented 6 months ago

I'll give it a try, thanks for your reply~