Closed ohallstrom closed 1 month ago
Adds an option to use the WSD (Warmup-Stable-Decay) learning-rate scheduler from https://arxiv.org/abs/2404.06395
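For readers unfamiliar with the schedule, here is a rough sketch of what a WSD learning-rate multiplier can look like. It is not the code from this PR: the function name `wsd_lr_factor`, its parameters, and the linear decay shape are all assumptions made for illustration (the paper and other implementations may use a different decay curve).

```python
def wsd_lr_factor(step, warmup_steps, stable_steps, decay_steps, min_ratio=0.0):
    """Hypothetical WSD multiplier applied to the base learning rate.

    Phases: linear warmup from 0 to 1, constant at 1, then a linear
    decay from 1 down to min_ratio (the decay shape is an assumption).
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the base learning rate.
        return step / max(1, warmup_steps)
    if step < warmup_steps + stable_steps:
        # Stable: hold the base learning rate.
        return 1.0
    # Decay: move linearly toward min_ratio, then stay there.
    progress = (step - warmup_steps - stable_steps) / max(1, decay_steps)
    progress = min(progress, 1.0)
    return 1.0 - (1.0 - min_ratio) * progress


if __name__ == "__main__":
    # Print the multiplier at a few steps of a 100-step run.
    for s in (0, 5, 10, 50, 95, 100):
        print(s, wsd_lr_factor(s, warmup_steps=10, stable_steps=80, decay_steps=10))
```

A function of this shape could be wrapped with `functools.partial` and passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, assuming the training loop steps the scheduler once per optimizer step.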
LGTM. I am not sure we'll use WSD for fine-tuning (GLUE/sequence classification), but I am being nitpicky.