lucidrains / DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
MIT License

LR schedule #228

Closed by rom1504 2 years ago

rom1504 commented 2 years ago

We only have a fixed LR, which is not great.

Stable Diffusion uses a cosine schedule: https://github.com/CompVis/stable-diffusion/blob/ce05de28194041e030ccfc70c635fe3707cdfc30/ldm/lr_scheduler.py#L4

OpenCLIP does too: https://github.com/mlfoundations/open_clip/blob/c933765dc557d88e15be968e78d7580d95f86af8/src/training/scheduler.py
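
For reference, here is a minimal sketch of what a cosine schedule with linear warmup typically looks like. It is not the exact code from either linked repo or from this one; the function name `cosine_lr_multiplier` and the parameters `warmup_steps` and `min_ratio` are hypothetical, chosen only for illustration.

```python
import math


def cosine_lr_multiplier(step, total_steps, warmup_steps=0, min_ratio=0.0):
    """Return a factor in [min_ratio, 1] by which to scale the base LR.

    Hypothetical helper: linear warmup for `warmup_steps` steps, then
    cosine decay from 1 down to `min_ratio` at `total_steps`.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear warmup: ramps from 1/warmup_steps up to 1.
        return (step + 1) / warmup_steps

    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)

    # Half-cosine: 1 at progress = 0, 0 at progress = 1.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_ratio + (1.0 - min_ratio) * cosine


# Example: per-step learning rates for an assumed base LR of 3e-4.
base_lr = 3e-4
schedule = [
    base_lr * cosine_lr_multiplier(s, total_steps=1000, warmup_steps=100)
    for s in range(1000)
]
```

A function of this shape (step in, multiplicative factor out) is what `torch.optim.lr_scheduler.LambdaLR` expects for its `lr_lambda` argument, so it could be wired in with `functools.partial` to bind the extra arguments.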

Varying the LR has a large impact on training.

What did OpenAI use for DALL-E 2?

lucidrains commented 2 years ago

@rom1504 i think the use of a cosine schedule is rooted more in tradition than in evidence, but if you can show me a paper showing it makes a big difference, i would definitely update my beliefs

i've added it here, diffusion prior as well