mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
https://arxiv.org/abs/2211.10438
MIT License

Which versions of the transformers and datasets packages do we need for this repo? #81

Open ghost opened 3 months ago

ghost commented 3 months ago

I got this error

AttributeError: 'Int8OPTDecoder' object has no attribute '_use_flash_attention_2'

when I run smoothquant_opt_real_int8 and evaluate model_smoothquant = Int8OPTForCausalLM.from_pretrained('mit-han-lab/opt-30b-smoothquant', torch_dtype=torch.float16, device_map='auto').

I first got the error "AttributeError: type object 'OPTDecoder' has no attribute '_prepare_decoder_attention_mask'", so I commented out 'OPTDecoder._prepare_decoder_attention_mask' in the opt.py file. Then this error occurred.

If I simply add self._use_flash_attention_2 = False to 'Int8OPTDecoder' class, what would happen?

What's the right way of fixing it?
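As a rough intuition for the stopgap you describe: newer transformers releases read that flag in the decoder's forward pass to decide between flash attention and the standard attention path, so defaulting it to False should route execution down the eager path. A minimal, self-contained sketch (the Int8OPTDecoder below is just a stand-in for smoothquant's real class in opt.py, and make_flash_attention_safe is a hypothetical helper name, not part of the repo):

```python
class Int8OPTDecoder:
    """Stand-in for smoothquant's Int8OPTDecoder (the real class is in opt.py)."""

def make_flash_attention_safe(decoder):
    # Newer transformers releases read self._use_flash_attention_2 in
    # OPTDecoder.forward; default the flag to False so the standard
    # (eager) attention path is taken instead of flash attention 2.
    if not hasattr(decoder, "_use_flash_attention_2"):
        decoder._use_flash_attention_2 = False
    return decoder

decoder = make_flash_attention_safe(Int8OPTDecoder())
print(decoder._use_flash_attention_2)  # False
```

This only papers over the missing attribute; pinning a compatible transformers version (see below in the thread) is the more reliable fix, since other internals may have changed between releases as well.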

ghost commented 3 months ago

Installing transformers==4.34.0 and datasets==2.14.5 worked for me.
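For anyone hitting the same errors, that environment pin amounts to (version numbers taken from the comment above, not independently verified against other CUDA/torch setups):

```shell
# Pin the transformers/datasets versions reported to work with this repo:
pip install transformers==4.34.0 datasets==2.14.5
```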

msz12345 commented 3 months ago

Congratulations! You could also try transformers==4.33.0. For the '_use_flash_attention_2' error we tried many approaches, and that version was what finally fixed it for us.