Closed NiriProject closed 1 month ago
This happens when a matrix in the model sees no calibration data at all during the entire reference forward pass, i.e. every one of the 32k tokens in the measurement dataset was routed around one or more specific experts. That is incredibly unlikely if the routing layers are working as they should.
Sadly it's a bit hard to say more than that without knowing how the model was put together.
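For anyone hitting this and wanting to check whether routing is the culprit: the condition above can be verified directly by counting, for each expert, how many calibration tokens the router's top-k gating actually sends to it. This is a hypothetical sketch (the `expert_hit_counts` helper is not part of any quantization tool's API), using synthetic router logits in place of a real model's gate outputs:

```python
import numpy as np

def expert_hit_counts(router_logits: np.ndarray, top_k: int = 2) -> np.ndarray:
    """Count how many tokens each expert receives under top-k gating.

    router_logits: (n_tokens, n_experts) gate scores, one row per token.
    Hypothetical helper for illustration, not a real library function.
    """
    n_experts = router_logits.shape[-1]
    # Each token is routed to its top_k experts by gate score.
    top = np.argsort(router_logits, axis=-1)[:, -top_k:]
    return np.bincount(top.ravel(), minlength=n_experts)

# Synthetic example: 32k tokens, 8 experts, and expert 3's gate score
# forced so low that no token is ever routed to it.
rng = np.random.default_rng(0)
logits = rng.normal(size=(32768, 8))
logits[:, 3] = -1e9

counts = expert_hit_counts(logits)
starved = np.flatnonzero(counts == 0)
print(starved)  # expert 3 received zero calibration tokens
```

An expert showing up in `starved` would never accumulate calibration statistics, which is exactly the failure mode described above.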
https://huggingface.co/Undi95/Plap-8x13B
I tried to quant the model above, and no matter what settings I use it always fails with the following error:
While I know supporting every odd model out there isn't feasible, I think the MoE future we're barreling towards needs more flexible support. It seems this will be the current thing for quite a while.