quic / ai-hub-models

The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
https://aihub.qualcomm.com
BSD 3-Clause "New" or "Revised" License

[MODEL REQUEST] Gemma 2 (9B and 27B) #60

Open EwoutH opened 1 week ago

EwoutH commented 1 week ago

Is your feature request related to a problem? Please describe.

Google just released Gemma 2, its new state-of-the-art LLM, and it looks like it's performing very well. It's available in two sizes, 9.24B and 27.2B parameters. There are also instruction-tuned variants available. The Gemma license is roughly comparable to Apache 2.0.

Details of model being requested

Since the original weights are FP32, it might be worth thinking about how to quantize them.
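
To give a rough sense of why quantization matters here, the arithmetic below estimates weight-only storage for both sizes at a few precisions. This is only a back-of-the-envelope sketch: it uses the parameter counts quoted above, assumes 1 GB = 1e9 bytes, and ignores activations, the KV cache, and any per-group scale or zero-point overhead a real quantization scheme would add. The names `weight_gb` and `footprint_table` are made up for illustration and are not part of any AI Hub API.

```python
# Approximate weight-only memory footprint at different precisions.
# Parameter counts are the ones quoted in this request; everything else
# (precision list, GB definition) is an assumption for illustration.
PARAMS = {"Gemma 2 9B": 9.24e9, "Gemma 2 27B": 27.2e9}
BITS = {"FP32": 32, "FP16": 16, "INT8": 8, "INT4": 4}


def weight_gb(num_params: float, bits: int) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


def footprint_table() -> dict:
    """Build {model: {precision: GB}} for every model and precision."""
    return {
        name: {prec: round(weight_gb(n, b), 2) for prec, b in BITS.items()}
        for name, n in PARAMS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        print(name, row)
```

Under these assumptions the 9B model needs about 37 GB for FP32 weights and under 5 GB at 4 bits, while the 27B model goes from roughly 109 GB to under 14 GB, which is why the choice of quantization scheme largely decides whether on-device deployment is feasible.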