NetEase-FuXi / EETQ

Easy and Efficient Quantization for Transformers
Apache License 2.0

Does it support Whisper model? #26

Closed kadirnar closed 3 months ago

SidaZh commented 4 months ago

In theory, EETQ supports every model that transformers supports. You can try this:

from transformers import AutoModelForCausalLM, EetqConfig

path = "/path_to_model"
quantization_config = EetqConfig("int8")
model = AutoModelForCausalLM.from_pretrained(path, device_map="auto", quantization_config=quantization_config)
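
For Whisper specifically, a minimal sketch might look like the following. It assumes the checkpoint is loaded through `AutoModelForSpeechSeq2Seq` (the encoder-decoder class transformers normally uses for Whisper) rather than `AutoModelForCausalLM`. The helper name `load_eetq_whisper` is hypothetical, and actually calling it assumes a CUDA GPU with the `eetq` package installed. This thread does not confirm which model class was tested.

```python
def load_eetq_whisper(path: str):
    """Load a Whisper checkpoint with EETQ int8 weight quantization (sketch).

    `path` is a local directory or Hub model id; the function name is
    hypothetical and not part of EETQ or transformers.
    """
    # Imported lazily so this module can be imported on a machine
    # without the GPU stack; the call itself needs CUDA and `eetq`.
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, EetqConfig

    # Same quantization config as in the snippet above.
    quantization_config = EetqConfig("int8")

    # Assumption: Whisper is loaded as a speech seq2seq model.
    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        path,
        device_map="auto",
        quantization_config=quantization_config,
    )

    # The processor handles audio feature extraction and token decoding.
    processor = AutoProcessor.from_pretrained(path)
    return model, processor
```

Usage would be along the lines of `model, processor = load_eetq_whisper("/path_to_model")`, after which the model can be used with the processor's input features as with an unquantized Whisper model.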

kadirnar commented 4 months ago

I tested it and it works. What can I do to optimize it further? Have you tested it with torch.compile?