Closed · matteoserva closed this issue 1 week ago
Thank you, appreciate you filing the issue. Looks like it was indeed the softcap. Fix coming shortly, after which both queries work as expected.
I confirm that the output after https://github.com/google/gemma.cpp/pull/279 exactly matches what is expected and is aligned with all the other implementations.
Great, thank you for confirming, and for reaching out with the repro case :D
Hello. Following an exchange with u/janwas_, I'm opening this issue with a description of the problem and the steps to reproduce it.
The issue is that gemma.cpp produces much worse output from gemma-2-27b than other implementations (Gemma 2 in AI Studio, chatllm.cpp).
Here is the simplest prompt that breaks the model in gemma.cpp:
Completa la frase: tanto va la gatta al lardo che... (Italian: "Complete the sentence: the cat goes to the lard so often that...")
Gemma 2 on AI Studio and chatllm (at Q8_0) both reply with the only correct answer:
ci lascia lo zampino ("it leaves its paw there")
gemma.cpp, by contrast, with the weights downloaded from Kaggle, replies with a string of Italian words that don't even form a grammatically correct sentence:
Here is the launch command used for gemma.cpp (also tested with --temperature 0.01):
./gemma --tokenizer gemma-tokenizer.spm --model 27b-it --compressed_weights ./gemma-2-27b-it-sfp.sbs
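The --temperature 0.01 run matters for the diagnosis: at such a low temperature, temperature-scaled softmax sampling collapses to near-argmax, so a wrong answer points at the logits themselves rather than at sampling randomness. A minimal sketch of that scaling (illustrative, not gemma.cpp's actual sampling code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch: temperature-scaled softmax over a logit vector.
// Dividing by a tiny temperature sharpens the distribution until
// essentially all probability mass sits on the top logit.
std::vector<float> Softmax(const std::vector<float>& logits, float temperature) {
  std::vector<float> p(logits.size());
  // Subtract the max logit for numerical stability.
  float max_logit = *std::max_element(logits.begin(), logits.end());
  double sum = 0.0;
  for (std::size_t i = 0; i < logits.size(); ++i) {
    p[i] = std::exp((logits[i] - max_logit) / temperature);
    sum += p[i];
  }
  for (float& x : p) x /= static_cast<float>(sum);
  return p;
}
```

With temperature 0.01, a logit gap of even 1.0 becomes a factor of e^100 in probability, which is why a deterministic wrong completion at that setting implicates the logit computation (here, the missing softcap).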
Here is another simple problem that Gemma 2 on AI Studio and chatllm easily solve but gemma.cpp cannot (the correct answer is 7 or 8):
All tests were run against Gemma 2 27b. The gemma.cpp commit is 8ac5d66575429c4fca19fb394c8926074352c766.