unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

what was the quantisation algorithm used in unsloth/Llama-3.2-1B-bnb-4bit? #1310

Open jayakommuru opened 1 day ago

jayakommuru commented 1 day ago

What quantisation algorithm was used in the unsloth/Llama-3.2-1B-bnb-4bit model? Looking at the methods listed in https://huggingface.co/docs/transformers/main/en/quantization/overview, is it int4_awq or int4_weightonly?

danielhanchen commented 7 hours ago

We use bitsandbytes!
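
For readers wondering what that answer means in practice: bitsandbytes quantizes weights blockwise, storing a per-block absmax scale alongside the low-bit values. Below is a minimal pure-Python sketch of that general idea only; the real library uses a 4-bit NF4 codebook, optional double quantization, and fused CUDA kernels, so the symmetric integer levels and block size here are illustrative simplifications, not the actual bitsandbytes implementation.

```python
def quantize_blockwise_4bit(weights, block_size=64):
    """Quantize a flat list of floats to 4-bit-range ints, one scale per block.

    Illustrative sketch of blockwise absmax quantization; bitsandbytes'
    real 4-bit scheme maps to an NF4 codebook rather than uniform levels.
    """
    quantized, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block) or 1.0  # per-block scale
        scales.append(absmax)
        # Map each weight to one of the symmetric levels -7..7.
        quantized.append([round(w / absmax * 7) for w in block])
    return quantized, scales


def dequantize_blockwise_4bit(quantized, scales):
    """Invert the sketch above: rescale each block by its stored absmax."""
    out = []
    for block, absmax in zip(quantized, scales):
        out.extend(q / 7 * absmax for q in block)
    return out
```

The key point is that the quantization error is bounded per block rather than per tensor, which is why outlier weights in one block do not degrade the precision of the rest of the model.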