Open niklasha opened 2 months ago
Thanks for reporting this. I think this issue only happens on the 7b because of MQA (which is not present on the 2b version that was used for testing). Could you give #2091 a try? Hopefully it provides the appropriate fix.
I have tested it, and it does not crash anymore, thanks, and the output matches "--cpu". However, the quality of the response to the example prompt is subjectively pretty low. But that is not the key issue here, I guess :-)
Glad that it helped. Did you make sure to respect the prompt format? This example is very bare-bones and doesn't do it for you. https://huggingface.co/blog/codegemma#prompt-format
Aha! Thanks. Well, I was just testing and did not do my homework. No, I did not respect the prompt format :-)
cargo run --features metal --example gemma -- --which code-7b-it --prompt "explain isakmpd's architecture"
fails with:

The prompt is not of great importance; other prompts just give different strides but fail equally. I did look into this a bit, but I confess it sort of goes beyond my current competence. I thought the stride vector should always be decreasing, but the rhs stride info is, as can be seen, [36864, 256, 4096, 1], which does not fit my mental model. However, running with "--cpu" does accept this. I am still sceptical that it does the math correctly, since it too seems to get the same striding, but it may be that I misunderstand the concept.
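For anyone else puzzled by the stride question: a non-decreasing stride vector is what a transposed view looks like, since a transpose only swaps the shape and stride entries and moves no data. Below is a minimal sketch in plain Rust that reproduces [36864, 256, 4096, 1]. The shape [1, 9, 16, 256] is an assumption picked so the numbers match, not something taken from the report, and the helper names are hypothetical rather than candle's API.

```rust
/// Row-major (contiguous) strides for a shape: the last dimension has stride 1,
/// and each earlier stride is the product of all later dimension sizes.
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Transposing a view swaps the shape and stride entries of two dimensions.
/// The underlying buffer is untouched, so the strides may stop being sorted.
fn transpose(
    shape: &[usize],
    strides: &[usize],
    d0: usize,
    d1: usize,
) -> (Vec<usize>, Vec<usize>) {
    let (mut new_shape, mut new_strides) = (shape.to_vec(), strides.to_vec());
    new_shape.swap(d0, d1);
    new_strides.swap(d0, d1);
    (new_shape, new_strides)
}

/// Flat offset into the underlying buffer for a multi-dimensional index.
fn offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}
```

Under that assumed shape, `contiguous_strides(&[1, 9, 16, 256])` gives [36864, 4096, 256, 1], and swapping dimensions 1 and 2 with `transpose` gives [36864, 256, 4096, 1]. Element (0, 3, 5, 7) of the transposed view and element (0, 5, 3, 7) of the original resolve to the same `offset`, so such a layout is valid as long as the kernel honours the strides instead of assuming contiguous data.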