rom1504 closed this issue 2 years ago
We only have a fixed LR, which is not great.
Stable Diffusion uses a cosine schedule: https://github.com/CompVis/stable-diffusion/blob/ce05de28194041e030ccfc70c635fe3707cdfc30/ldm/lr_scheduler.py#L4
OpenCLIP does too: https://github.com/mlfoundations/open_clip/blob/c933765dc557d88e15be968e78d7580d95f86af8/src/training/scheduler.py
Varying the LR has a large impact on training.
What did OpenAI use for DALL-E 2?
@rom1504 i think the use of a cosine schedule is rooted more in tradition than in evidence, but if you can show me a paper showing that it makes a big difference, i would definitely update my beliefs
i've added it here, diffusion prior as well
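For readers unfamiliar with the schedule under discussion, here is a minimal sketch of linear warmup followed by cosine decay, written in plain Python. It is not the code that was added to this repo, nor a copy of the Stable Diffusion or OpenCLIP schedulers linked above; the function name `cosine_lr` and its parameters (`base_lr`, `warmup_steps`, `total_steps`, `min_lr`) are hypothetical and chosen only for illustration.

```python
import math


def cosine_lr(step, base_lr, warmup_steps, total_steps, min_lr=0.0):
    """Return the learning rate at `step` (0-indexed).

    Assumed shape: linear warmup to `base_lr` over `warmup_steps`,
    then cosine decay from `base_lr` down to `min_lr` at `total_steps`.
    """
    if step < warmup_steps:
        # Linear warmup: reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Half-cosine from 1 down to 0, scaled into [min_lr, base_lr].
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print a few points of a 100-step schedule with 10 warmup steps.
    for s in (0, 9, 10, 55, 100):
        print(s, cosine_lr(s, base_lr=1e-3, warmup_steps=10, total_steps=100))
```

In a training loop, the returned value would typically be written into each of the optimizer's parameter groups before every step. If you are using PyTorch, `torch.optim.lr_scheduler.CosineAnnealingLR` covers the decay part without hand-rolled code.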