Closed — schmorp closed this issue 6 months ago.
A similar model fails with another assertion, this time in IQ3_XXS, after successfully quantizing to IQ2_M:
[ 1/1925] blk.0.ffn_up.2.weight - [ 5120, 13824, 1, 1], type = f16, converting to iq3_xxs .. Oops: found point 104 not on grid: 104 0 0 0 GGML_ASSERT: ggml-quants.c:11118: false
This might or might not be the same underlying issue.
Model: https://huggingface.co/Undi95/BigPlap-8x20B
imatrix file: https://huggingface.co/mradermacher/BigPlap-8x20B-i1-GGUF/blob/main/imatrix.dat
This issue was closed because it has been inactive for 14 days since being marked as stale.
When quantizing https://huggingface.co/Undi95/Plap-8x13B with this imatrix http://data.plan9.de/Plap-8x13B.imatrix, quantize crashes, printing many copies of the message in the title (probably one per thread).
quantize produces all the Q quants and IQ2_M successfully, then crashes with this on IQ1_S.
This is with b2409. It differs from the NaN issue I reported earlier in that no NaNs were output during imatrix generation, and it appears to be IQ1_S specific.
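For reference, the invocation was of this general shape (a sketch only: the file names below are placeholders, not the exact paths used):

```shell
# llama.cpp b2409; input/output names are placeholders (assumptions)
./quantize --imatrix Plap-8x13B.imatrix \
    Plap-8x13B-f16.gguf Plap-8x13B-IQ1_S.gguf IQ1_S
```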