azhuvath opened 11 months ago
Describe the issue
Getting the error below while trying to run the following command.

Step 2: Generate quantized model with INT4 weights
Provide the checkpoint file name via --low-precision-checkpoint:
```bash
python single_instance/run_llama_quantization.py --ipex-weight-only-quantization --output-dir "saved_results" --int8-bf16-mixed -m meta-llama/Llama-2-7b-chat-hf --low-precision-checkpoint "saved_results/gptq_checkpoint.pt"
```
```text
Traceback (most recent call last):
  File "/home/sdp/llama2/intel-extension-for-pytorch/examples/cpu/inference/python/llm/single_instance/run_llama_quantization.py", line 308, in <module>
```

Do I need to install IPEX from source as opposed to installing it via pip?

---

Please use the latest code. We will release the next minor version with WOQ soon.
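For reference, a minimal sketch of the two install routes, assuming the public GitHub repo and a standard source build; the branch to check out and the exact build command below are assumptions, so consult the IPEX installation docs for the steps matching your PyTorch version:

```bash
# Released wheels from PyPI (may lag behind fixes in the repo)
python -m pip install intel-extension-for-pytorch

# Source install, to pick up changes not yet in a release
# (branch name and build step below are assumptions)
git clone --recursive https://github.com/intel/intel-extension-for-pytorch.git
cd intel-extension-for-pytorch
git checkout main
python setup.py install

# Confirm which versions are actually active
python -c "import torch, intel_extension_for_pytorch as ipex; print(torch.__version__, ipex.__version__)"
```

Given the reply above, unreleased changes such as the WOQ work would only be available through the source route until the next minor release ships.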