google / gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.94k stars, 502 forks

Add bf16 matmul support, update naming+test #205

Closed: copybara-service[bot] closed this 4 months ago

copybara-service[bot] commented 4 months ago


Avoid int32, which can easily overflow for large matrices. Also fix IDE warning in sfp-inl.
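
To illustrate the overflow concern, here is a minimal sketch, not the code from this change. It assumes a row-major layout and uses hypothetical names (`BF16`, `FloatToBF16`, `BF16ToFloat`, `Offset`, `MatMulBF16`) that are not taken from gemma.cpp. Element offsets are computed in `size_t`, so `row * num_cols` does not wrap the way a 32-bit signed product would once a matrix exceeds about 2^31 elements. The bf16 values are modeled as the upper 16 bits of a float32 and accumulated in float.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for bf16: the upper 16 bits of an IEEE-754 float32.
using BF16 = uint16_t;

// Truncating float -> bf16 conversion (no rounding); illustration only.
inline BF16 FloatToBF16(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return static_cast<BF16>(bits >> 16);
}

// bf16 -> float: place the 16 bits in the upper half, zero the lower half.
inline float BF16ToFloat(BF16 b) {
  const uint32_t bits = static_cast<uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Row-major element offset computed in size_t. With int32 operands,
// row * num_cols would overflow for large matrices.
inline size_t Offset(size_t row, size_t col, size_t num_cols) {
  return row * num_cols + col;
}

// C (rows_a x cols_b, float) = A (rows_a x cols_a) * B (cols_a x cols_b),
// with A and B stored as bf16 and sums accumulated in float.
inline void MatMulBF16(const std::vector<BF16>& a, const std::vector<BF16>& b,
                       size_t rows_a, size_t cols_a, size_t cols_b,
                       std::vector<float>& c) {
  assert(a.size() == rows_a * cols_a);
  assert(b.size() == cols_a * cols_b);
  c.assign(rows_a * cols_b, 0.0f);
  for (size_t i = 0; i < rows_a; ++i) {
    for (size_t k = 0; k < cols_a; ++k) {
      const float a_ik = BF16ToFloat(a[Offset(i, k, cols_a)]);
      for (size_t j = 0; j < cols_b; ++j) {
        c[Offset(i, j, cols_b)] += a_ik * BF16ToFloat(b[Offset(k, j, cols_b)]);
      }
    }
  }
}
```

For example, a 50000 x 50000 matrix has 2.5 billion elements, which exceeds the int32 maximum of 2147483647; `Offset(50000, 0, 50000)` still yields the correct value because the multiplication is unsigned and at least 32 bits wide (64 bits on typical targets).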