Open Lihengwannafly opened 1 year ago
desc_act needs to be true, then it works as expected
flozi00, can you be more specific? How do I set desc_act to true? Thanks.
OK, I found the answer myself after grepping the source code. It is a parameter in the example quantization programs, e.g. quant_with_alpaca.py, and it ultimately ends up as a parameter of the quantization config: BaseQuantizeConfig(desc_act=True)
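For anyone else landing here, a minimal sketch of what that looks like in code. The import path and the bits/group_size/desc_act keyword names follow AutoGPTQ's BaseQuantizeConfig as I understand it; group_size=128 is an assumption (this thread only mentions --bits 4), and the two helper functions are hypothetical names used for illustration, not part of the library.

```python
def build_quantize_kwargs(bits=4, group_size=128, desc_act=True):
    """Collect the keyword arguments intended for BaseQuantizeConfig.

    bits=4 matches the command in this issue; group_size=128 is an
    assumed value; desc_act=True is the setting suggested in this thread.
    """
    return {"bits": bits, "group_size": group_size, "desc_act": desc_act}


def make_quantize_config(**overrides):
    """Build the config object (hypothetical helper, requires auto_gptq)."""
    # Imported lazily so build_quantize_kwargs works without auto_gptq installed.
    from auto_gptq import BaseQuantizeConfig

    return BaseQuantizeConfig(**build_quantize_kwargs(**overrides))
```

The resulting object would then be passed as the quantize config when loading the pretrained model for quantization, in place of a config with desc_act left off.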
Yes, it works as expected. The only side effect is that loading big models is a bit slow, but that is tolerable. Thanks for the hint, it saved me a huge amount of effort.
Describe the bug
Hardware details: single A100
Software version: OS: Ubuntu 20.04, Python: 3.8.10, CUDA: 11.8, PyTorch: 1.14.0a0+410ce96, transformers: 4.29.1, accelerate: 0.19.0
To Reproduce: Run the command: python quant_with_alpaca.py --pretrained_model_dir /myapp/HF-bloom175B/ --quantized_model_dir /myapp/HF_quantized --bits 4 --num_samples 128 (or 256). Quantizing the bloom 7B model with the same script succeeds, however.