quic / efficient-transformers

This library lets users port pretrained models and checkpoints from the HuggingFace (HF) hub (developed using the HF transformers library) into inference-ready formats that run efficiently on Qualcomm Cloud AI 100 accelerators.
https://quic.github.io/efficient-transformers/

Awq feature #100

Closed by ochougul 1 month ago

ochougul commented 1 month ago

Closed 91 for GPTQ PR to be up.