google / gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.9k stars, 499 forks

Fix kv offset computation for MHA config. #172

szabadka closed 4 months ago