azhuvath opened 11 months ago
Describe the issue
Getting the error below while trying to run the following command.

Step 2: Generate quantized model with INT4 weights
Provide the checkpoint file name via --low-precision-checkpoint:
```bash
python single_instance/run_llama_quantization.py --ipex-weight-only-quantization --output-dir "saved_results" --int8-bf16-mixed -m meta-llama/Llama-2-7b-chat-hf --low-precision-checkpoint "saved_results/gptq_checkpoint.pt"
```
```text
Traceback (most recent call last):
  File "/home/sdp/llama2/intel-extension-for-pytorch/examples/cpu/inference/python/llm/single_instance/run_llama_quantization.py", line 308, in <module>
```

Do I need to install IPEX from source as opposed to installing it via pip?

---

Please use the latest code. We will release the next minor version with WOQ soon.
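For reference, a minimal sketch of the two install routes, assuming the public GitHub repo and a standard source build; the branch to check out and the exact build command below are assumptions, so consult the IPEX installation docs for the steps matching your PyTorch version:

```bash
# Released wheels from PyPI (may lag behind fixes in the repo)
python -m pip install intel-extension-for-pytorch

# Source install, to pick up changes not yet in a release
# (branch name and build step below are assumptions)
git clone --recursive https://github.com/intel/intel-extension-for-pytorch.git
cd intel-extension-for-pytorch
git checkout main
python setup.py install

# Confirm which versions are actually active
python -c "import torch, intel_extension_for_pytorch as ipex; print(torch.__version__, ipex.__version__)"
```

Given the reply above, unreleased changes such as the WOQ work would only be available through the source route until the next minor release ships.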