quic / efficient-transformers

This library lets users port pretrained models and checkpoints from the HuggingFace (HF) hub (developed using the HF transformers library) into inference-ready formats that run efficiently on Qualcomm Cloud AI 100 accelerators.
https://quic.github.io/efficient-transformers/

Hotfix/Restoring older release functionality for VLLM #110

Closed: ochougul closed 3 weeks ago