tairov / llama2.mojo

Inference Llama 2 in one file of pure 🔥
https://www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
MIT License

Getting very strange response when trying the second example in README.md #37

Closed: wongni closed this issue 11 months ago

wongni commented 11 months ago
~/src/AI/mojo/llama2.mojo$ mojo llama2.mojo tl-chat.bin \
    -r falcon \
    -z tok_tl-chat.bin \
    -n 256 -t 0 -s 100 -i "<|im_start|>user\nGive me a python function to generate Fibonacci sequence<|im_end|>\n<|im_start|>assistant\n"
num hardware threads:  12
SIMD vector width:  16
checkpoint size:  4400767004 [ 4196 MB ]
n layers:  22
vocab size:  32003
<|im_start|>user
Give me a python function to generate Fibonacci sequence<|im_end|>
<|im_start|>assistant
¿Quiero debera.io|efes<|
|- [aquíntena|
|-|re|re|
|-|
|-ichas|[estructurañiñu|implementa.py|
|esínda|
¿Quiero|

|Olahi|

Does anyone know how to resolve this?

tairov commented 11 months ago

I guess the model was already updated on Hugging Face. See this PR: https://github.com/tairov/llama2.mojo/pull/35

tairov commented 11 months ago

As a quick fix, just try the -r llama CLI param.
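
For reference, applying that workaround to the command from the original report would look like this (a sketch, identical to the invocation above except that -r falcon is replaced with -r llama):

~/src/AI/mojo/llama2.mojo$ mojo llama2.mojo tl-chat.bin \
    -r llama \
    -z tok_tl-chat.bin \
    -n 256 -t 0 -s 100 -i "<|im_start|>user\nGive me a python function to generate Fibonacci sequence<|im_end|>\n<|im_start|>assistant\n"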