Closed: yzlnew closed this issue 2 months ago.
Hi @yzlnew,
I ran your script on main and it ran fine, including the finetuning_stage.
A couple of questions: is "Qwen__Qwen1.5-0.5B-Chat" the same model as Qwen/Qwen1.5-0.5B-Chat?
Thanks
@horheynm 1. Yes, I downloaded the model locally. 2. I installed llm-compressor from source, but not the latest from master. Maybe I should try the latest code?
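For context, "Qwen__Qwen1.5-0.5B-Chat" is just the local directory holding the downloaded snapshot. A minimal sketch of how it gets loaded (assuming a standard transformers snapshot; the path is local to my machine):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Local snapshot of Qwen/Qwen1.5-0.5B-Chat, downloaded ahead of time;
# the double-underscore directory name stands in for the "/" in the Hub id.
MODEL_PATH = "./Qwen__Qwen1.5-0.5B-Chat"

model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
```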
Can you try again with the latest release? `pip install llmcompressor`
I'm running behind a proxy, so I had to modify the dataset-loading logic to get it to run. But I can confirm that the latest master is able to perform all three stages, with the first two stored in dense format.
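Roughly, the change I made to get past the proxy looks like this (a minimal sketch; the local dataset path is hypothetical, and I'm assuming the example pulls its calibration data via HF datasets):

```python
from datasets import load_dataset

# Behind the proxy the Hub download fails, so swap the dataset name for a
# local copy. "./open_platypus.json" is a hypothetical local export of the
# example's calibration set.
dataset = load_dataset(
    "json",
    data_files="./open_platypus.json",
    split="train",
)
```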
Describe the bug
Running the example at https://github.com/vllm-project/llm-compressor/tree/main/examples/quantization_24_sparse_w4a16 with the provided recipe fails with an error on the final quantization stage. If I remove the finetuning stage, I can generate a model in the marlin24 format.
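For reference, the example is driven through a single multi-stage call, roughly like this (a minimal sketch based on the example's `apply` entry point; paths and hyperparameters here are illustrative placeholders, not the exact example values):

```python
from llmcompressor.transformers import apply

# Multi-stage run: sparsity (oneshot) -> finetuning (train) ->
# quantization (oneshot). All argument values below are placeholders.
apply(
    model="Qwen/Qwen1.5-0.5B-Chat",
    dataset="open_platypus",
    recipe="2of4_w4a16_recipe.yaml",  # recipe defining the three stages
    output_dir="./output",
    max_seq_length=512,
    num_calibration_samples=512,
    num_train_epochs=0.5,
)
```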