quic/efficient-transformers
This library lets users port pretrained models and checkpoints from the HuggingFace (HF) hub (developed using the HF transformers library) into inference-ready formats that run efficiently on Qualcomm Cloud AI 100 accelerators.
https://quic.github.io/efficient-transformers/
Gemma support #102
Closed
quic-meet closed 3 weeks ago
quic-meet commented 1 month ago
Added support for the first generation of Gemma models.
Known issue: batch generation with continuous batching (CB) is buggy; it generates only one token per example.