microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0

Add support of L-BFGS optimizer #2935

Open wingrime opened 1 year ago

wingrime commented 1 year ago

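L-BFGS is a quasi-Newton (approximate second-order) optimizer with good convergence, but it is memory intensive because it keeps a history of past parameter and gradient updates. Offloading the optimizer state with DeepSpeed would make the method more accessible to users with limited GPU memory.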
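
To give a rough sense of the memory cost, here is a back-of-the-envelope sketch. It assumes the dominant state is the curvature history, i.e. up to `history_size` pairs of vectors, each as large as the parameter vector, stored in fp32. The function names and the 1B-parameter example are hypothetical and only for illustration. Real implementations keep a few extra work vectors, so treat the result as a lower bound.

```python
def lbfgs_history_bytes(num_params: int, history_size: int, bytes_per_value: int = 4) -> int:
    """Approximate bytes for the L-BFGS curvature history.

    Assumption: the optimizer stores `history_size` pairs (s_k, y_k),
    where each vector has `num_params` entries.
    """
    return 2 * history_size * num_params * bytes_per_value


def adam_state_bytes(num_params: int, bytes_per_value: int = 4) -> int:
    """Approximate bytes for Adam's two moment buffers, for comparison."""
    return 2 * num_params * bytes_per_value


if __name__ == "__main__":
    n = 1_000_000_000  # hypothetical 1B-parameter model
    m = 10             # hypothetical history size
    print(f"L-BFGS history: {lbfgs_history_bytes(n, m) / 1e9:.1f} GB")
    print(f"Adam state:     {adam_state_bytes(n) / 1e9:.1f} GB")
```

Under these assumptions the history grows linearly with `history_size`, so it is roughly `history_size` times larger than Adam's state, which is why offloading it to CPU memory looks attractive.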

tjruwase commented 1 year ago

@wingrime, thanks for this suggestion. Are you able to provide a PR?