unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Mini-Sequence Transformer integration #1081

Open · Trapper4888 opened this issue 1 month ago

Hello,

A new paper presents a method, the Mini-Sequence Transformer (MST), for handling long sequences more memory-efficiently during fine-tuning: https://wdlctc.github.io/mst.html. The authors have also integrated their code into an unsloth fork: https://github.com/wdlctc/unsloth.
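
For anyone skimming, my understanding of the core idea: MST splits the sequence dimension of the memory-heavy MLP (and LM-head) blocks into mini-sequences, so the large intermediate activations are only materialized one chunk at a time. Below is a minimal, illustrative sketch of that chunking for a Llama-style MLP; it is not the fork's implementation, the names (`ChunkedMLP`, `num_chunks`) are made up, and I pair the chunking with PyTorch's activation checkpointing so the memory savings also hold during the backward pass:

```python
# Minimal sketch of the mini-sequence idea, assuming a Llama-style MLP.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class ChunkedMLP(nn.Module):
    def __init__(self, hidden_size: int, intermediate_size: int, num_chunks: int = 4):
        super().__init__()
        self.up = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act = nn.SiLU()
        self.num_chunks = num_chunks  # number of mini-sequences per forward pass

    def _mlp(self, part: torch.Tensor) -> torch.Tensor:
        # The large (batch, chunk_len, intermediate_size) tensor exists only here.
        return self.down(self.act(self.up(part)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_size); split along the sequence dimension.
        outputs = [
            # Recompute each chunk's intermediates during backward instead of
            # storing them, so they are never held for all chunks at once.
            checkpoint(self._mlp, part, use_reentrant=False)
            for part in x.chunk(self.num_chunks, dim=1)
        ]
        return torch.cat(outputs, dim=1)

mlp = ChunkedMLP(hidden_size=4096, intermediate_size=14336, num_chunks=8)
x = torch.randn(2, 8192, 4096, requires_grad=True)
print(mlp(x).shape)  # torch.Size([2, 8192, 4096])
```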

I believe this would be of interest to you, @danielhanchen. If it works as intended, it could be a valuable addition to unsloth.

Otherwise, I will close this issue.

danielhanchen commented 1 month ago

Oh yes, I saw the PR - thanks for your contribution!