google/gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.8k stars
491 forks
issues
Small code cleanup suggestions while reading the code.
#218
copybara-service[bot]
closed
1 month ago
0
Add support for custom sampling function to runtime config.
#217
szabadka
closed
1 month ago
0
Fix fix for weight type define, refs #198
#216
copybara-service[bot]
closed
1 month ago
0
Fix reference to GEMMA_WEIGHT_T. Refs #198
#215
copybara-service[bot]
closed
1 month ago
0
Toward only using compressed weights:
#214
copybara-service[bot]
closed
1 month ago
0
Fix Softmax on SVE
#213
copybara-service[bot]
closed
1 month ago
0
Add Adam optimizer.
#212
szabadka
closed
1 month ago
0
Internal experiment
#211
copybara-service[bot]
closed
1 month ago
0
Implement mixed mode matmul: f32 * bf16
#210
copybara-service[bot]
closed
1 month ago
1
Simplifications: remove GemmaInterface and GemmaImpl
#209
copybara-service[bot]
closed
1 month ago
0
Remove no longer required stats.h - use Highway version instead
#208
copybara-service[bot]
closed
1 month ago
0
revert back to HWY_ASSERT for lane constraints, qualify hn::Add
#207
copybara-service[bot]
closed
1 month ago
0
Fix for GenerateZeroMat call in TestTiledMatMul
#206
copybara-service[bot]
closed
1 month ago
0
Add bf16 matmul support, update naming+test
#205
copybara-service[bot]
closed
1 month ago
0
Use system topology to pin threads across clusters.
#204
copybara-service[bot]
closed
1 month ago
0
Add first version of backpropagation support.
#203
szabadka
closed
1 month ago
1
Refactor GemmaImpl dispatch to use Highway 1.2's HWY_DYNAMIC_DISPATCH_T
#202
copybara-service[bot]
closed
1 month ago
0
Update to Highway 1.2 for topology/VQSelect
#201
copybara-service[bot]
closed
1 month ago
0
static_assert shape constraints in MatMul 4x4
#200
copybara-service[bot]
closed
1 month ago
0
Unrolled / tiled 4x4 MatMul
#199
copybara-service[bot]
closed
1 month ago
0
Gemma.cpp hangs on a Gemma 7B model that was finetuned using huggingface peft(QLoRA)
#198
webbigdata-jp
opened
1 month ago
13
Compilation fails for raspberry pi
#197
EphemeralSapient
closed
1 month ago
2
gemma.cc:1322: Failed to load model weight
#196
ordentid
closed
1 month ago
6
Generic MHA/MQA/GQA implementation
#195
copybara-service[bot]
closed
1 month ago
0
Fix normalization in Softmax function.
#194
szabadka
closed
1 month ago
0
Compiling under mingw with clang error..
#193
0wwafa
closed
5 days ago
6
Documenting the RoPE implementation.
#192
copybara-service[bot]
closed
1 month ago
0
Minor internal refactoring.
#191
copybara-service[bot]
closed
2 months ago
0
Add MMLU eval to github
#190
copybara-service[bot]
closed
2 months ago
0
Adds Kaggle testing to CI workflow
#189
pculliton
closed
2 months ago
0
Make BlobWriter::Add() accept const void*
#188
copybara-service[bot]
closed
2 months ago
0
Refer to --weights rather than --compressed_weights to simplify CLI docs
#187
copybara-service[bot]
closed
2 months ago
0
Add TTFT to TimingInfo
#186
copybara-service[bot]
closed
2 months ago
0
Paligemma Support
#185
okpatil4u
opened
2 months ago
1
Pass most runtime parameters using const RuntimeConfig&
#184
copybara-service[bot]
closed
2 months ago
0
Store tokens/sec in auxiliary struct TimingInfo.
#183
copybara-service[bot]
closed
2 months ago
1
Fix SVE build: add missing hn::
#182
copybara-service[bot]
closed
2 months ago
0
Support additional scaling
#181
copybara-service[bot]
closed
2 months ago
0
Enable even/odd for SFP. Refs #166
#180
copybara-service[bot]
closed
2 months ago
0
Fix RecurrentGemma (refs #166) - one Dot was ignoring scale.
#179
copybara-service[bot]
closed
2 months ago
0
2x speedup of SFP decode (1.4x overall) on AVX3_DL+.
#178
copybara-service[bot]
closed
2 months ago
0
Use more parallelism in attention block in prefill mode.
#177
szabadka
closed
2 months ago
0
Use more parallelism in the QKV projections of the MHA block.
#176
szabadka
closed
2 months ago
1
Use more parallelism in the final output of the attention block.
#175
szabadka
closed
2 months ago
0
Matmul and test functions
#174
copybara-service[bot]
closed
2 months ago
0
Add per-thread even_odd storage for #166.
#173
copybara-service[bot]
closed
2 months ago
0
Fix kv offset computation for MHA config.
#172
szabadka
closed
2 months ago
0
Use a MatMul implementation over MatVec for Prefill Computations
#171
austinvhuang
closed
1 week ago
3
Use more parallelism in the QKV projections in MQA mode.
#170
szabadka
closed
2 months ago
2
work with cmake install
#169
xinpingwang
closed
2 months ago
6