ollama / ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
https://ollama.com
MIT License

Does having the default quant type being Q4_0 (a legacy format) on the model hub still make sense? #5425

Open sammcj opened 5 months ago

sammcj commented 5 months ago

The Ollama model hub still defaults to the Q4_0 quant type, a legacy format that under-performs compared to K-quants (Qn_K, e.g. Q4_K_M, Q6_K, Q5_K_L, etc.).

Reference

[4 attached images]

(Sorry if an issue already exists for this; if it did, my search-fu let me down.)

DuckyBlender commented 4 months ago

I 100% agree with this. This decision should have been made a long time ago; the default on all of my models on Ollama is q4_K_M for this reason.

mahenning commented 2 months ago

Any updates on this? It would be great if the K-quants were handled as the defaults, as I personally see no reason for the Q_0 quants to remain the default. Right now it takes more typing to get the K-quants, and users with less experience in quantization miss out on an arguably better model if they just use the default model names. If the decision went against K-quants as the default, I'd be interested in the reasoning.
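To illustrate the "more typing" point, here is a rough sketch of the difference between a default pull and an explicit K-quant pull. The model name and tag below are assumptions chosen for illustration; the actual tags available for any given model should be checked on https://ollama.com/library.

```shell
# Hypothetical model name and quant tag, for illustration only.
model="llama3.2"
quant_tag="3b-instruct-q4_K_M"

# Default pull: a short name, which currently resolves to the legacy Q4_0 build.
default_cmd="ollama pull ${model}"

# K-quant pull: the full tag, including the quant suffix, must be typed out.
kquant_cmd="ollama pull ${model}:${quant_tag}"

echo "$default_cmd"
echo "$kquant_cmd"
```

If K-quants became the default, the short name alone would resolve to (for example) a Q4_K_M build, and less experienced users would get the better-performing quant without knowing the tag syntax.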