Gemma support - Githubissues

quic / efficient-transformers

This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transformers library) into inference-ready formats that run efficiently on Qualcomm Cloud AI 100 accelerators.

https://quic.github.io/efficient-transformers/



Gemma support #102

Status: Closed (quic-meet closed this 3 weeks ago)

quic-meet commented 1 month ago

Adds support for the first generation of Gemma models.
Known issue: batch generation with continuous batching (CB) is buggy; it generates only one token per example.