chimezie opened this issue 3 weeks ago (Open)
If the demo.py generation works for your quantized model, I expect the next release of LM Studio will resolve this issue. Expect a new release soon here: https://lmstudio.ai/beta-releases
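In the meantime, a quick sanity check is to run your model through mlx-engine directly. A rough sketch is below; the model path is a placeholder, and the exact flags may differ (verify with `python demo.py --help`):

```
# Clone mlx-engine and install its dependencies
git clone https://github.com/lmstudio-ai/mlx-engine.git
cd mlx-engine
pip install -r requirements.txt

# Generate from a local quantized model; --model takes the path to the
# model directory (the --prompt flag is an assumption, check --help)
python demo.py --model /path/to/your-quantized-model --prompt "Hello"
```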
Excellent. Looking forward to it.
I'm running LM Studio version 0.3.5 on an Apple M1 Ultra with 32 GB of memory.
As mentioned on Discord, I have a model of my own, fine-tuned from Nemo, that I quantized with mlx-lm after fusing the LoRA adapter, making sure to use transformers v4.45.2 on the command line. I was able to generate from it on the command line using mlx-engine's demo.py (after a few changes to sync it with the caching updates in mlx: https://github.com/lmstudio-ai/mlx-engine/pull/20).
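Concretely, the fuse-and-quantize workflow was along these lines, using mlx-lm's standard `mlx_lm.fuse` and `mlx_lm.convert` entry points (the base checkpoint name and all paths are placeholders, not my exact invocation):

```
# Pin the transformers version used during quantization
pip install "transformers==4.45.2"

# Fuse the trained LoRA adapter into the base model
python -m mlx_lm.fuse \
  --model <base-nemo-model> \
  --adapter-path ./adapters \
  --save-path ./fused-model

# Quantize the fused model (4-bit by default)
python -m mlx_lm.convert \
  --hf-path ./fused-model \
  -q \
  --mlx-path ./quantized-model
```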
However, when I try to load the quantized model, I get the following error: