bratao opened this issue 11 months ago
If this is a major request by the OSS community - I'm more than happy to include some of the changes from Unsloth!
I would like to second this request. As I understand it, this is simply free increased efficiency for training with no degradation on accuracy, right? I think this would be a major boost to the Axolotl project.
Yes 0% loss in accuracy - we do actual FLOP reductions via our manual autograd engine. I'm actually working with @casper-hansen and some other Axolotl people to put some methods inside Axolotl!
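For readers wondering what a "manual autograd engine" buys you in practice: instead of letting PyTorch record and keep every intermediate of the forward pass, you derive the backward by hand and recompute or skip whatever you don't need. Below is a minimal sketch of that pattern with `torch.autograd.Function`; it is the general technique only, not Unsloth's actual code, and the SiLU example is purely illustrative.

```python
import torch


class ManualSiLU(torch.autograd.Function):
    """Hand-written forward/backward for SiLU, i.e. x * sigmoid(x).

    Only the input tensor is saved; the backward recomputes sigmoid(x)
    instead of storing it, trading a little compute for less memory.
    This is the core idea behind a hand-rolled autograd path: derive the
    gradient yourself so autograd does not have to retain every
    intermediate of the forward pass.
    """

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)
        # d/dx [x * sigmoid(x)] = sigmoid(x) * (1 + x * (1 - sigmoid(x)))
        return grad_output * s * (1 + x * (1 - s))


x = torch.randn(4, 8, requires_grad=True, dtype=torch.double)
# gradcheck confirms the hand-derived gradient matches autograd's numerics
assert torch.autograd.gradcheck(ManualSiLU.apply, (x,))
```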
Legend. Superman has posters of you on his wall. Thanks so much for all of your work!
:)
I tried a few of the optimizations for FFT on Mistral, but I cannot seem to reproduce the improvements described in the posts. @danielhanchen it would be great if you could pitch in with a PR if you have time.
https://github.com/OpenAccess-AI-Collective/axolotl/tree/unsloth_modules
@casper-hansen Oh cool - I'll have a look! Ye I'll try to make a PR to axolotl!!
Hi, Is there any status on these updates? If I use Axolotl right now, will I benefit from the Unsloth improvements? Thank you!
@fakerybakery Sorry not yet - I'll take a look at the PR Casper made, but it might take some time
Ok, thank you!
Unsloth is particularly interesting if your GPU is not supported by flash attention (e.g., V100). Unfortunately, as of now, unsloth does not seem to have multi-GPU support in the OSS version yet: https://github.com/unslothai/unsloth/issues/107
FYI: gradient checkpointing has been merged: https://github.com/OpenAccess-AI-Collective/axolotl/pull/1528
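For anyone skimming this thread: gradient (activation) checkpointing drops intermediate activations during the forward pass and recomputes them during backward, trading extra compute for lower peak memory; the Unsloth flavour additionally offloads saved activations to CPU. A bare-bones PyTorch sketch of the basic idea (illustrative only, not the implementation in that PR):

```python
import torch
from torch.utils.checkpoint import checkpoint

# A small stack of MLP "blocks" standing in for transformer layers.
blocks = torch.nn.ModuleList(
    torch.nn.Sequential(
        torch.nn.Linear(512, 2048), torch.nn.GELU(), torch.nn.Linear(2048, 512)
    )
    for _ in range(4)
)

x = torch.randn(8, 512, requires_grad=True)
h = x
for block in blocks:
    # Each block's intermediate activations are freed after the forward
    # pass and recomputed on the fly during backward, cutting peak memory.
    h = checkpoint(block, h, use_reentrant=False)
h.sum().backward()
```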
⚠️ Please check that this feature request hasn't been suggested before.
Feature description
The project https://github.com/unslothai/unsloth looks very interesting. The author claims great speedups for finetuning and details the improvements here:
Solution
It would be nice to use their kernels to speed up axolotl.
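Mechanically, "using their kernels" mostly means monkey-patching the Hugging Face transformers model classes so selected ops (RMSNorm, RoPE, cross-entropy, the MLP) run hand-written Triton kernels instead of the stock forward. Below is a toy sketch of that patching pattern, with a plain PyTorch stand-in where the Triton kernel would go; the names and structure are illustrative, not Axolotl's or Unsloth's actual API.

```python
import torch
import transformers.models.llama.modeling_llama as modeling_llama


def patched_rmsnorm_forward(self, hidden_states):
    # Stand-in for a fused Triton kernel. Numerically it matches the
    # reference LlamaRMSNorm: normalize in fp32, then cast back.
    input_dtype = hidden_states.dtype
    hidden_states = hidden_states.to(torch.float32)
    variance = hidden_states.pow(2).mean(-1, keepdim=True)
    hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
    return self.weight * hidden_states.to(input_dtype)


# Swap the forward on the class so every LlamaRMSNorm instance picks it up.
modeling_llama.LlamaRMSNorm.forward = patched_rmsnorm_forward
```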
Alternatives
No response
Additional Context
No response
Acknowledgements