unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Support for FP8 quantization #776

Open rwl4 opened 4 months ago

rwl4 commented 4 months ago

With the release of the new Mistral NeMo 12B model, we now have weights that were pre-trained with FP8. It would be great if Unsloth could support 8-bit training in addition to the existing 4-bit training, so we could fine-tune without any quantization-related loss.
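For context, a minimal sketch of what 8-bit loading looks like today with plain `transformers` and `bitsandbytes` (not an Unsloth API, and note that bitsandbytes' 8-bit path is LLM.int8 rather than true FP8); the model id is the upstream Mistral checkpoint and is only illustrative:

```python
# Sketch: 8-bit loading via transformers + bitsandbytes (LLM.int8, not FP8).
# Assumes bitsandbytes and a recent transformers release are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "mistralai/Mistral-Nemo-Base-2407"  # illustrative model id

bnb_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit quantization

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,  # compute dtype for non-quantized layers
    device_map="auto",
)
```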

RaccoonOnion commented 4 months ago

Seconded. It would also be great if Unsloth could release a 4-bit quantized NeMo model.

danielhanchen commented 4 months ago

Apologies for the delay, just did! https://huggingface.co/unsloth/Mistral-Nemo-Base-2407-bnb-4bit
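For anyone landing here later, a minimal sketch of loading that pre-quantized 4-bit checkpoint with Unsloth's `FastLanguageModel`; the `max_seq_length` and LoRA hyperparameters are illustrative choices, not recommendations from this thread:

```python
# Sketch: loading the 4-bit Mistral NeMo checkpoint with Unsloth.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Mistral-Nemo-Base-2407-bnb-4bit",
    max_seq_length=2048,  # illustrative; the model supports longer contexts
    load_in_4bit=True,    # use the pre-quantized bnb 4-bit weights
)

# Attach LoRA adapters for QLoRA-style fine-tuning.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank (illustrative)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
```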