google gemma_pytorch issues

google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models

https://ai.google.dev/gemma

Apache License 2.0

5.16k stars 490 forks

issues

Hope to See the Source Code of Gemma2 Version

#70 thefreeman007 opened 2 days ago
0
Remove unused imports

#69 neurosnap opened 5 days ago
0
Fix downcasting and upcasting similar to https://github.com/google/ge…

#68 michaelmoynihan closed 1 week ago
1
Fix downcasting and upcasting

#67 danielhanchen closed 1 week ago
1
Supporting Gemma V2

#66 michaelmoynihan closed 1 week ago
1
Update run_xla.py

#65 michaelmoynihan closed 2 weeks ago
0
gemma-2b-it-pytorch on tpu v5p

#64 shungcp closed 2 weeks ago
1
Modify SentencePiece function calls.

#63 texasmichelle closed 1 month ago
1
Change return to raise in `get_model_config`.

#62 texasmichelle closed 1 month ago
1
when to support RecurrentGemma?

#61 Mddct opened 1 month ago
0
Gemma finetuning formatting

#60 mostafamdy opened 2 months ago
0
fix missing torch in requirements

#59 Mddct closed 2 months ago
1
Add CodeGemma and HF pointers

#58 osanseviero closed 2 months ago
1
early stop when all sequence reach EOS

#57 je1lee opened 3 months ago
3
Memory saving loading weight for non-quant models

#56 KaneGreen closed 10 hours ago
5
Prepare model for deployment to Private Vertex AI endpoint

#55 BriianPowell opened 3 months ago
4
Update xla_model_parallel.py

#54 ya0guang closed 1 month ago
2
Error when run docker/Dockerfile

#53 Cguanqin opened 3 months ago
1
How to use gemma for multi-round conversations

#52 ranck626 opened 3 months ago
2
How to save memory when loading weights?

#51 KaneGreen opened 3 months ago
1
Unable to reproduce MATH results

#50 wenhuchen opened 3 months ago
2
fix: raise Exception

#49 leowzz opened 3 months ago
2
Is it possible to load 7b-it using quantization config

#48 aliasneo1 opened 3 months ago
1
Error when running Gemma inference on GPU

#47 LarryHawkingYoung opened 3 months ago
1
rm fairescale

#46 Mon-ius closed 3 months ago
7
I got empty result while using 7b-it model

#45 egbertwong closed 3 months ago
4
Document the existence of 99 unused tokens in the tokenizer

#44 Qubitium closed 3 months ago
1
fix(temperature): allow passing 0 or None as the temperature parameter

#43 joselpart closed 3 months ago
3
Can't disable sampling

#42 joselpart closed 3 months ago
0
Is max_position_embeddings=8096 necessary in 2b model?

#41 agiwave opened 4 months ago
3
Auto-labels 'Gemma' on 'gemma' issues/PRs.

#40 shmishra99 closed 4 months ago
1
Objectivity

#39 o6uoq closed 4 months ago
0
How to fine-tune Gemma with pytorch?

#38 solitude-alive closed 4 months ago
2
Gemma fixes - gelu

#37 danielhanchen closed 4 months ago
4
Torch implementation now same as JAX

#36 thebraingen closed 4 months ago
1
Implementation now equals JAX

#35 thebraingen closed 4 months ago
1
Add instructions to download from Hugging Face Hub

#34 osanseviero closed 4 months ago
1
Inconsistency between PyTorch and JAX implementation

#33 aboros98 closed 4 months ago
2
"--output_len" argument ignored

#32 k-nar closed 4 months ago
1
not found weight file

#31 Cguanqin opened 4 months ago
3
is it possible to convert gemma_pytorch to onnx to tflite?

#30 nyadla-sys opened 4 months ago
2
[Question] Embeddings normalization by sqrt(hidden_size)

#29 Andrei-Aksionov closed 4 months ago
4
After deploying google/gemma-7b-it, there is always an error response.

#26 ydh10002023 opened 4 months ago
7
Cannot run on v4-16 worker 0 TPU VM: "Failed to get global TPU topology"

#25 markusheimerl opened 4 months ago
5
Loss is always NaN after a few fine-tuning steps, whether fp32 or fp16

#24 yongzhuo closed 4 months ago
1
keras finetuning and inference examples uploaded

#23 r-gheda closed 4 months ago
2
H

#22 ZainBinTariq7 closed 4 months ago
1
Changed <2B or 7B> to <2b or 7b> in README

#21 r-gheda closed 4 months ago
0
Changes <2B or 7B> option to <2b or 7b> in README

#20 r-gheda closed 4 months ago
1
Output with higher max_length is repetition of base text

#19 azrael05 opened 4 months ago
6