Open ghost opened 8 months ago
I installed transformers==4.34.0 and datasets==2.14.5 and it worked
Congratulations! I just want to add that you can also try transformers==4.33.0. For the _use_flash_attention_2 error we tried many approaches, and pinning that version is what finally fixed it for us.
I got this error
AttributeError: 'Int8OPTDecoder' object has no attribute '_use_flash_attention_2'
when I run smoothquant_opt_real_int8 and then evaluate model_smoothquant = Int8OPTForCausalLM.from_pretrained('mit-han-lab/opt-30b-smoothquant', torch_dtype=torch.float16, device_map='auto').
I first got the error "AttributeError: type object 'OPTDecoder' has no attribute '_prepare_decoder_attention_mask'", so I commented out the OPTDecoder._prepare_decoder_attention_mask reference in the opt.py file. Then this error occurred.
If I simply add self._use_flash_attention_2 = False to the Int8OPTDecoder class, what would happen?
What is the right way to fix this?
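For what it's worth, setting the flag to False should just route the model down the non-flash-attention code path, since newer transformers releases only check that attribute to decide which attention implementation to use. A minimal sketch of that kind of defensive patch (ensure_flash_attn_flag and DummyDecoder are illustrative names, not smoothquant or transformers API; the cleaner fix is still pinning the transformers version as the comments above describe, because other version-mismatch errors like the _prepare_decoder_attention_mask one can surface elsewhere):

```python
def ensure_flash_attn_flag(decoder):
    # If the decoder class was written against an older transformers
    # release and never sets _use_flash_attention_2, default it to False,
    # i.e. use the standard (non-flash) attention path.
    if not hasattr(decoder, "_use_flash_attention_2"):
        decoder._use_flash_attention_2 = False
    return decoder

class DummyDecoder:  # stand-in for Int8OPTDecoder in this sketch
    pass

patched = ensure_flash_attn_flag(DummyDecoder())
print(patched._use_flash_attention_2)  # False
```

You would call something like this on the decoder instance after from_pretrained, or add the assignment directly in Int8OPTDecoder.__init__ as you suggested.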