ml-explore / mlx-examples

Examples in the MLX framework

ValueError: [dequantize] The matrix should be given as a uint32 #1111

Open chaihahaha opened 5 hours ago

chaihahaha commented 5 hours ago

After quantizing mlx-community/miqumaid-v3-70b with the command `mlx_lm.convert --hf-path miqumaid-v3-70b --mlx-path miqumaid-v3-70b-4bit -q --q-bits 4`:

The resulting model miqumaid-v3-70b-4bit cannot be served with `mlx_lm.server`; loading it fails with `ValueError: [dequantize] The matrix should be given as a uint32`.
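(For context: the error comes from MLX's `dequantize`, which expects the packed weight matrix to have dtype `uint32`. A minimal sketch using the public `mlx.core` quantization API, not the `mlx_lm.server` code path itself, reproduces the message; the shapes and group size here are arbitrary.)

```python
import mlx.core as mx

# Quantizing packs a float16 weight into 4-bit groups stored as uint32.
w = mx.random.normal((64, 64)).astype(mx.float16)
wq, scales, biases = mx.quantize(w, group_size=32, bits=4)
print(wq.dtype)  # uint32

# Round-tripping works because wq has the dtype dequantize expects.
w_hat = mx.dequantize(wq, scales, biases, group_size=32, bits=4)

# Handing dequantize anything that is not the packed uint32 matrix, e.g. the
# original float16 weight, raises:
#   ValueError: [dequantize] The matrix should be given as a uint32
mx.dequantize(w, scales, biases, group_size=32, bits=4)
```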

angeloskath commented 4 hours ago

I just downloaded NeverSleep/MiquMaid-v3-70B and quantized it and it seems to run fine. Perhaps there is something wrong with the unquantized download?

chaihahaha commented 1 hour ago

> I just downloaded NeverSleep/MiquMaid-v3-70B and quantized it and it seems to run fine. Perhaps there is something wrong with the unquantized download?

Thanks for testing, but NeverSleep/MiquMaid-v3-70B is not the model I downloaded. I downloaded the MLX-converted (unquantized) mlx-community/MiquMaid-v3-70B and tried to quantize that, without success.

angeloskath commented 23 minutes ago

I see. Did you dequantize it first? Even though it is not in the name, that model is actually quantized at 8 bits.
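(For anyone hitting the same error: a sketch of the dequantize-then-requantize round trip, assuming a recent mlx-lm where `mlx_lm.convert` exposes a `dequantize` option; the local directory names are hypothetical.)

```python
from mlx_lm import convert

# Step 1: unpack the 8-bit mlx-community checkpoint back to float16.
convert(
    hf_path="mlx-community/MiquMaid-v3-70B",
    mlx_path="miqumaid-v3-70b-fp16",  # hypothetical local output directory
    dequantize=True,
)

# Step 2: quantize the now full-precision weights down to 4 bits.
convert(
    hf_path="miqumaid-v3-70b-fp16",
    mlx_path="miqumaid-v3-70b-4bit",
    quantize=True,
    q_bits=4,
)
```

Quantizing directly from the original full-precision NeverSleep/MiquMaid-v3-70B weights, as tested above, sidesteps the round trip entirely.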