AnswerDotAI / bert24

Apache License 2.0
Add warmup stable decay lr schedule #31

Closed by ohallstrom 1 month ago

ohallstrom commented 1 month ago

Adds an option to use the warmup-stable-decay (WSD) learning rate scheduler from https://arxiv.org/abs/2404.06395
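
For readers unfamiliar with the schedule, here is a minimal sketch of the general WSD shape: a warmup phase, a constant ("stable") phase, then a decay phase. It is not the code from this PR. The function name `wsd_multiplier`, its parameters, and the choice of linear warmup and linear decay are all assumptions made for illustration; the decay shape used in the paper or in this repository may differ.

```python
def wsd_multiplier(step, warmup_steps, stable_steps, decay_steps, min_ratio=0.0):
    """Return a learning-rate multiplier for `step` (hypothetical helper).

    The multiplier is meant to scale a base learning rate:
      - warmup: rises linearly from 0 to 1 over `warmup_steps`
      - stable: stays at 1 for `stable_steps`
      - decay:  falls linearly from 1 to `min_ratio` over `decay_steps`
                (linear decay is an assumption of this sketch)
    """
    if step < warmup_steps:
        # Warmup phase: linear ramp from 0 up to 1.
        return step / max(1, warmup_steps)
    if step < warmup_steps + stable_steps:
        # Stable phase: hold the peak learning rate.
        return 1.0
    # Decay phase: fraction of the decay window completed, clamped to 1.
    progress = (step - warmup_steps - stable_steps) / max(1, decay_steps)
    progress = min(1.0, progress)
    return 1.0 - (1.0 - min_ratio) * progress


if __name__ == "__main__":
    # Print the multiplier at a few steps of a 10/20/10 schedule.
    for s in (0, 5, 10, 29, 35, 40):
        print(s, wsd_multiplier(s, warmup_steps=10, stable_steps=20, decay_steps=10))
```

Because the function maps a step to a multiplier, it could be wrapped with `functools.partial` and passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, though how the PR wires the scheduler into training is not shown here.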

NohTow commented 1 month ago

LGTM. I am not sure we'll use WSD for fine-tuning (GLUE/sequence classification), but I am being nitpicky.