Closed: davidgxue closed this issue 1 month ago
@davidgxue This appears to be caused by the threadpoolctl package trying to open/import OpenBLAS.
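If it helps to narrow this down, a minimal check of which BLAS/OpenMP runtimes threadpoolctl is detecting (using its public `threadpool_info()` API) might look like this; the exact output depends on the installed libraries:

```python
# Print the native threadpool libraries threadpoolctl can find in this process.
# Each entry reports the API (e.g. "blas"/"openmp"), the backing library
# (e.g. "openblas"), and the shared-object path it was loaded from.
from threadpoolctl import threadpool_info

for lib in threadpool_info():
    print(lib["user_api"], lib["internal_api"], lib.get("filepath"))
```

If OpenBLAS shows up here but fails to load during quantization, that would point at an environment/linking problem rather than the quantizer itself.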
Please show the following:
Will close this issue. If still a bug or reproducible, please re-open.
Yeah, sorry about the late response. I can't seem to reproduce it again. But even with the error above, the model seems to have quantized fine, so I'm not sure it's worth investigating. I'll make a note if I run into this issue again. Thanks!
Describe the bug
Quantization seems to have finished, so maybe this error can be ignored, but I'm posting it just to be safe so folks here can verify.
GPU Info
1x A100 40GB
Software Info
Operation System/Version + Python Version
If you are reporting an inference bug of a post-quantized model, please post the contents of
config.json
and
quantize_config.json
.

To Reproduce
Not sure if it's possible to consistently reproduce... I quantized a model merged with LoRA adapters. Specifically, it's a Phi-3 mini 4k instruct finetuned with unsloth. I think you may want to treat this as a Mistral model instead, since unsloth's Phi-3 is mistralfied (in other words, they changed the architecture of the Phi-3 weights to be Mistral-based by splitting up the fused QKV layers).
Maybe attempt to quantize it with
unsloth/Phi-3-mini-4k-instruct
from Hugging Face; it should give the same result, I imagine. When quantizing, I got the following logs near the end of quantization:
Expected behavior
No Errors
Model/Datasets
Make sure your model/dataset is downloadable (on HF for example) so we can reproduce your issue.