foundation-model-stack / fms-hf-tuning

🚀 Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP.
Apache License 2.0

build(deps): set transformers below 4.46, waiting on fixes #384

Closed anhuong closed 3 weeks ago

anhuong commented 3 weeks ago

Description of the change

The triton fast_kernels are broken in transformers v4.46 because of upstream changes and the way patching is done in fms-acceleration, as described here. In addition, gradient_accumulation behavior is degraded in v4.46; the fix is in a transformers PR that has been merged but not yet released.

Upgrading transformers also broke unit tests; this is being discussed and resolved in #383.

Thus, we will set transformers below v4.46 while we wait for these fixes to land.
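As a quick local sanity check, the upper bound can be verified against an installed version. The snippet below is only a sketch under stated assumptions: the helper names (`release_tuple`, `is_below_upper_bound`, `installed_transformers_ok`) are hypothetical and not part of this repo, and the bound `(4, 46)` is taken from the PR title. The actual constraint lives in the project's dependency metadata.

```python
from __future__ import annotations

import re
from importlib import metadata

# Assumed upper bound, taken from the PR title ("below 4.46").
UPPER_BOUND = (4, 46)


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '4.45.2' -> (4, 45, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_below_upper_bound(version: str, upper: tuple[int, ...] = UPPER_BOUND) -> bool:
    """True if the release segment of `version` sorts strictly below `upper`.

    Pre-release and dev suffixes are ignored, so '4.46.0.dev0' is rejected
    together with '4.46.0'.
    """
    return release_tuple(version)[: len(upper)] < upper


def installed_transformers_ok() -> bool | None:
    """Check the installed transformers version; None if it is not installed."""
    try:
        return is_below_upper_bound(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("transformers below 4.46:", installed_transformers_ok())
```

This hand-rolled comparison only looks at the numeric release segment; for full PEP 440 semantics, a dedicated version-parsing library is the safer choice.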

Related issue number

How to verify the PR

Was the PR tested

github-actions[bot] commented 3 weeks ago

Thanks for making a pull request! 😃 One of the maintainers will review and advise on the next steps.