unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Can you add support for apple/ml-cross-entropy? #1298

Open zfflxx opened 5 days ago

zfflxx commented 5 days ago

This new method saves a lot more memory; can you port it to Unsloth? Cut Your Losses in Large-Vocabulary Language Models: https://github.com/apple/ml-cross-entropy
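For context on where the saving comes from: the linked method (Cut Cross-Entropy, CCE) fuses the final hidden-to-vocab projection into the loss kernel and computes logits block-wise on the fly, so the full `(tokens, vocab)` logits matrix is never materialized in GPU memory. A minimal sketch of how the library is called, based on the apple/ml-cross-entropy README; the exact signature and the toy shapes below are assumptions, not Unsloth's integration:

```python
import torch
from cut_cross_entropy import linear_cross_entropy  # from apple/ml-cross-entropy

# Toy shapes for illustration (Llama-3-sized vocabulary).
batch, seqlen, hidden, vocab = 4, 2048, 4096, 128_256

embeddings = torch.randn(batch * seqlen, hidden, device="cuda", dtype=torch.bfloat16)
classifier = torch.randn(vocab, hidden, device="cuda", dtype=torch.bfloat16)
labels = torch.randint(0, vocab, (batch * seqlen,), device="cuda")

# Baseline: materializes a (batch*seqlen, vocab) logits matrix,
# roughly 2 GB in bf16 at these shapes, before the loss is even computed.
# logits = embeddings @ classifier.T
# loss = torch.nn.functional.cross_entropy(logits.float(), labels)

# CCE: the projection is fused into the loss kernel, so the full logits
# matrix never lives in global memory; only block-wise tiles do.
loss = linear_cross_entropy(embeddings, classifier, labels)
```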

iiLaurens commented 5 days ago

I was coming here to ask the same thing. This seems like a perfect fit for Unsloth.

danielhanchen commented 4 days ago

Yes! Just tested it and it seems to work really well :)

danielhanchen commented 4 days ago

Will probably add it in the next release :)

dame-cell commented 3 days ago

Hey there @danielhanchen, I've been trying to implement this in Triton, if that's OK. Can I open a draft PR by tomorrow so we can discuss it and check that the code is all right?

danielhanchen commented 3 days ago

@dame-cell Oh, I already managed to add Apple's implementation :)

shimmyshimmer commented 3 days ago

We're still testing it though, so it's not final or anywhere near finished.