Open zfflxx opened 5 days ago
I was coming here to ask the same. This seems like a perfect fit for Unsloth
Yes! Just tested it and it seems to work really well :)
Will probably add it in the next release :)
Hey there @danielhanchen, I've been trying to implement this in Triton. If it's OK, can I open a draft by tomorrow so we can discuss it and check whether the code is all right?
@dame-cell Oh I already managed to add Apple's one in :)
We're still testing though, so it's not final or anywhere near finished.
This new method saves a lot more memory. Can you port it to Unsloth? Cut Your Losses in Large-Vocabulary Language Models: https://github.com/apple/ml-cross-entropy