google/gemma.cpp
Lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0
5.96k stars
506 forks
issues
There is an extra `<end_of_turn>\n` in the output
#276
ufownl
closed
3 months ago
2
Remove unused kSystemPrompt
#275
copybara-service[bot]
closed
3 months ago
0
Introduce new Gemma 9B and 27B configs
#274
copybara-service[bot]
closed
3 months ago
0
Refactor model type / training tables, simplify reverse mapping
#273
copybara-service[bot]
closed
3 months ago
0
Remove unused BUILD dependency
#272
copybara-service[bot]
closed
3 months ago
0
Fix a clang tidy warning
#271
copybara-service[bot]
closed
3 months ago
1
Improve logging when running Gemma examples: fix the issue when max_tokens, max_generated_tokens and temperature were logging without any trailing space/newline.
#270
copybara-service[bot]
closed
3 months ago
1
Add prompt batching to Gemma.cpp.
#269
copybara-service[bot]
closed
3 months ago
1
Skip the last RMSNormInplaceBatched in the Prefill phase.
#268
copybara-service[bot]
closed
3 months ago
1
Fix compilation errors in clang
#267
ufownl
closed
3 months ago
1
Fix KV cache size calculation error
#266
ufownl
closed
3 months ago
0
Fixing two typos.
#265
copybara-service[bot]
closed
3 months ago
0
Code cleanup
#264
copybara-service[bot]
closed
3 months ago
0
Move test placeholder to a later pos.
#263
copybara-service[bot]
closed
3 months ago
1
Refactor kCachePosSize and kCacheLayerSize into separate functors.
#262
copybara-service[bot]
closed
3 months ago
1
Split out common parts (embedder and transformer block) from Prefill() and Transformer() into separate functions.
#261
copybara-service[bot]
closed
3 months ago
1
Move kGriffinLayers into ConfigNoSSM, set kGemmaLayers directly
#260
copybara-service[bot]
closed
3 months ago
0
Fix debug_prompt and other binaries (internal init)
#259
copybara-service[bot]
closed
3 months ago
0
Simplify Attention.
#258
copybara-service[bot]
closed
3 months ago
0
Fix Py binding/run_example: use GemmaEnv
#257
copybara-service[bot]
closed
3 months ago
0
1.15x 7b sfp prefill speedup: Matmul in attention
#256
copybara-service[bot]
closed
3 months ago
0
Update developer docs and mention asan/msan
#255
copybara-service[bot]
closed
3 months ago
0
Further simplification to ForEachTensor, thanks I.K.
#254
copybara-service[bot]
closed
3 months ago
0
Fix DASSERT - TiledBatch requires at least 2 vectors.
#253
copybara-service[bot]
closed
3 months ago
0
RecurrentGemma 9b support
#252
fizzAI
opened
3 months ago
1
Use hwy::ThreadPool::MaxThreads() to determine the number of threads to use.
#251
copybara-service[bot]
closed
3 months ago
1
docs: update README.md
#250
eltociear
opened
4 months ago
0
Move raw_weights into separate header, used mainly by compress_weights.
#249
copybara-service[bot]
closed
3 months ago
0
Refactor CompressedWeights.
#248
copybara-service[bot]
closed
3 months ago
1
Added bias vector addition to MatMul
#247
copybara-service[bot]
closed
4 months ago
0
Removed now redundant non-batch matmul
#246
copybara-service[bot]
closed
4 months ago
0
Implement a missing (bf16, f32) tiled MatMul kernel.
#245
copybara-service[bot]
closed
4 months ago
0
Internal change.
#244
copybara-service[bot]
closed
4 months ago
1
Integrate matmul into FFW: 4.3x prefill speedup
#243
copybara-service[bot]
closed
4 months ago
0
Reduce duplication in Config* by inheriting no-SSM
#242
copybara-service[bot]
closed
4 months ago
0
Added MatMul_4x4_Batch which is MatMul_4x4, but with the first template arg moved to the first function arg, so the batch size (num A rows) can be variable at run-time.
#241
copybara-service[bot]
closed
4 months ago
0
Major duplicated code reduction in test/benchmarks
#240
copybara-service[bot]
closed
4 months ago
0
Tiny cleanup: distinguish between "ids" and "pieces" in argument names when encoding.
#239
copybara-service[bot]
closed
4 months ago
0
Extends Transformer() to prepare for batched processing.
#238
copybara-service[bot]
closed
4 months ago
0
Support mixed (bf16, sfp) tiled MatMul. Same sfp-decompress strategy as in (f32,
#237
copybara-service[bot]
closed
4 months ago
1
Fix numerical issue in Softcap by subtracting max.
#236
copybara-service[bot]
closed
4 months ago
1
Fix numerical issue in Softcap by subtracting max.
#235
copybara-service[bot]
closed
4 months ago
0
Add benchmark dependency to cmake build.
#234
szabadka
closed
4 months ago
0
Increase parallelism in ops_test
#233
copybara-service[bot]
closed
4 months ago
1
Add internal initialization code to debug_prompt.
#232
copybara-service[bot]
closed
4 months ago
0
Implement float * SfpStream matmul by decompressing 4 * kColsA_RowsB -sized chunks of the second matrix.
#231
copybara-service[bot]
closed
4 months ago
1
Update AssertClose for large matrices and add large matrix test
#230
copybara-service[bot]
closed
4 months ago
0
Updated benchmarks.cc to recent changes to Gemma API.
#229
copybara-service[bot]
closed
4 months ago
1
Add compression/ comments, especially on SFP range
#228
copybara-service[bot]
closed
4 months ago
0
Use Loader/AppArgs to construct gemma_test model, simplify AcceptFunc
#227
copybara-service[bot]
closed
4 months ago
0