google / gemma.cpp

lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.76k stars 487 forks

Add Py bindings for weight compression #290

Closed by copybara-service[bot] 3 days ago

copybara-service[bot] commented 3 days ago

Add Py bindings for weight compression

TODO: this uses clif instead of pybind11, and depends on absl.