google/gemma.cpp
lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.8k stars
491 forks
issues
Use all CPU sockets when pinning threads to cores
#319
copybara-service[bot]
opened
1 day ago
1
Parallel prefill seems to lead to completely different results
#318
ufownl
opened
1 day ago
0
Split up ops.h into ops/ops-inl and matmul-inl
#317
copybara-service[bot]
closed
1 day ago
0
Cleanup: add wrapper functions and rename vars to interleaved
#316
copybara-service[bot]
closed
1 day ago
0
Major Prefill/Generate cleanup, 1.3x Prefill speedup
#315
copybara-service[bot]
closed
2 days ago
0
Fix msan uninitialized scale
#314
copybara-service[bot]
closed
2 days ago
0
Add scale parameter to MatMul.
#313
copybara-service[bot]
closed
2 days ago
0
Update gemma-27b to the correct query scaling.
#312
copybara-service[bot]
closed
3 days ago
0
Simplify FFW by using MatMul_4x4_Batch_Add.
#311
copybara-service[bot]
closed
4 days ago
0
De-templatize Activations, add RowVectorBatch class
#310
copybara-service[bot]
closed
3 days ago
0
Fix examples/hello_world for real.
#309
copybara-service[bot]
closed
5 days ago
0
Further 1.02x prefill speedup from batch 64->512
#308
copybara-service[bot]
closed
5 days ago
0
Fix gemma_cpp/examples/hello_world build.
#307
copybara-service[bot]
closed
5 days ago
0
Increase the prefill batch size to 64.
#306
copybara-service[bot]
closed
1 week ago
1
SVE build fix: avoid capturing vectors directly.
#305
copybara-service[bot]
closed
1 week ago
0
Simplify matmul: only 2 overloads
#304
copybara-service[bot]
closed
1 week ago
0
Remove allocation from GEMM_4x4_Tile when decoding compressed weights by implementing
#303
copybara-service[bot]
closed
1 week ago
0
Improve readability with RepeatedAttentionWindowSizes
#302
copybara-service[bot]
closed
1 week ago
1
Convert recurrentgemma weights
#301
0wwafa
opened
1 week ago
3
Record time measurements in MatMul tests.
#300
copybara-service[bot]
closed
1 week ago
2
Fix windows build: min conflict, unused VF
#299
copybara-service[bot]
closed
1 week ago
0
Add more comments to attention computation (and some small restructuring).
#298
copybara-service[bot]
closed
1 week ago
0
Refactor configurables.
#297
copybara-service[bot]
closed
1 week ago
0
Update gemma_test to also pass for the v1.1 models.
#296
copybara-service[bot]
closed
1 week ago
0
Lint fix - string append, remove stale TODO
#295
copybara-service[bot]
closed
1 week ago
0
Update gemma_test with the expected entropy values for the IT models of size 2B/7B/9B/27B.
#294
copybara-service[bot]
closed
2 weeks ago
0
Move benchmark_helper to evals/, weights_raw to compression/.
#293
copybara-service[bot]
closed
1 week ago
0
Fix handling of %c and %q if eot_string. Fixes #283, thanks @ljcucc
#292
copybara-service[bot]
closed
2 weeks ago
0
Cleanup: move util/compress and convert_weights to compression/
#291
copybara-service[bot]
closed
2 weeks ago
0
Add Py bindings for weight compression
#290
copybara-service[bot]
closed
2 weeks ago
0
Fix gemma_test - moved to evals/.
#289
copybara-service[bot]
closed
2 weeks ago
0
7x compile time speedup: shard gemma.cc
#288
copybara-service[bot]
closed
2 weeks ago
0
Add configurables for norm/rope/activation/scale/residual connection.
#287
copybara-service[bot]
opened
2 weeks ago
0
Small cleanups. Fixes gemma_test build.
#286
copybara-service[bot]
closed
2 weeks ago
0
Prep for sharding gemma.cc: split into kv_cache, tokenizer.
#284
copybara-service[bot]
closed
2 weeks ago
0
The %C and %Q will not be detected when eot_line = "other string"
#283
ljcucc
closed
1 week ago
1
Use benchmark_helper in py bindings (adds BOS)
#282
copybara-service[bot]
closed
2 weeks ago
0
Cleanup: add ModelInfo struct, remove gcpp::
#281
copybara-service[bot]
closed
2 weeks ago
0
Add sliding window attention for Gemma 2.
#280
copybara-service[bot]
closed
2 weeks ago
1
Add config for att/final cap, skip max-subtract. Fixes #278
#279
copybara-service[bot]
closed
2 weeks ago
0
Low-quality responses from gemma.cpp (gemma-2-27b) when compared to AI Studio and others
#278
matteoserva
closed
2 weeks ago
3
Declutter gemma/ directory, move binaries to evals/ and util/.
#277
copybara-service[bot]
closed
2 weeks ago
0
There is an extra `<end_of_turn>\n` in the output
#276
ufownl
closed
1 week ago
2
Remove unused kSystemPrompt
#275
copybara-service[bot]
closed
2 weeks ago
0
Introduce new Gemma 9B and 27B configs
#274
copybara-service[bot]
closed
3 weeks ago
0
Refactor model type / training tables, simplify reverse mapping
#273
copybara-service[bot]
closed
3 weeks ago
0
Remove unused BUILD dependency
#272
copybara-service[bot]
closed
3 weeks ago
0
Fix a clang tidy warning
#271
copybara-service[bot]
closed
3 weeks ago
1
Improve logging when running Gemma examples: fix the issue where max_tokens, max_generated_tokens and temperature were logged without any trailing space/newline.
#270
copybara-service[bot]
closed
3 weeks ago
1
Add prompt batching to Gemma.cpp.
#269
copybara-service[bot]
closed
2 weeks ago
1