tairov / llama2.mojo

Inference Llama 2 in one file of pure 🔥
https://www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
MIT License

How does it relate to LLAMA? #22

Closed alexcardo closed 12 months ago

alexcardo commented 12 months ago

Ok, I installed Mojo, cloned your repo, and ran the test. It works, congrats! But how does all of this relate to LLAMA? Nothing happened when I tried to run LLAMA 2 itself:

alex@NLDW4-5-20-11:~/ai/llama2.mojo$ mojo llama2.mojo ~/ai/llama.cpp/models/ggml-model-q4_1.bin -s 100 -n 256 -t 0.5 -i "Llama is an animal"
num hardware threads: 12
SIMD vector width: 16
checkpoint size: 4238459520
Killed
alex@NLDW4-5-20-11:~/ai/llama2.mojo$ mojo llama2.mojo ~/ai/llama.cpp/models/ggml-model-q4_1.bin -s 100 -n 256 -t 4 -i "Llama is an animal"
num hardware threads: 12
SIMD vector width: 16
checkpoint size: 4238459520
Killed

I don't know what -t 0.5 means (I suppose threads); I also tried -t 4, again without results.

The clue here is how to run LLAMA 2 using this new language called Mojo. And if you made a Mojo wrapper for the LLAMA/LLAMA2 models, please provide instructions on how to run the model using this wrapper.

Thank you.

tairov commented 12 months ago

Hi @alexcardo, thanks for your question.

So, basically, I would recommend learning more about tiny LLMs first: https://github.com/karpathy/llama2.c. See the sketch below for how a checkpoint in that style is meant to be run with this repo.
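A minimal sketch of the intended usage, assuming the stories15M.bin checkpoint from Karpathy's tinyllamas on Hugging Face and the tokenizer.bin that ships with llama2.c; the flags follow llama2.c's CLI convention (-s seed, -n steps, -t sampling temperature, -i prompt), so treat the exact paths and flags as assumptions rather than documented behavior:

# download a llama2.c-style checkpoint and its tokenizer (hypothetical paths)
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin
wget https://github.com/karpathy/llama2.c/raw/master/tokenizer.bin
# run inference; note that -t is the sampling temperature, not a thread count
mojo llama2.mojo stories15M.bin -s 100 -n 256 -t 0.5 -i "Llama is an animal"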

LLAMA is just an architecture based on transformers. llama.cpp is an implementation of LLAMA inference whose goal is to run the models on consumer hardware via quantization, exporting the original LLAMA weights to ggml/gguf, which is a separate format for storing weights. So those files are not compatible with llama2.mojo at the moment, and the quantized models are not compatible with llama2.c either.
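The format difference shows up right in the file header. A rough, hypothetical check, assuming a llama2.c-style checkpoint (which begins with seven little-endian int32 config fields: dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len) versus a ggml/gguf file (which begins with a magic identifier):

# dump the first 28 bytes as 4-byte integers:
# a llama2.c-style checkpoint shows plausible model dimensions here,
# while a ggml/gguf file shows the bytes of its magic header instead
od -A n -t d4 -N 28 stories15M.bin
od -A n -t d4 -N 28 ~/ai/llama.cpp/models/ggml-model-q4_1.bin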

So essentially, our goal right now is not to implement full-fledged inference of the original or even quantized models. I would say that for now the purpose and interest are purely academic and educational.

See the quote from Karpathy:

Compared to llama.cpp, I wanted something super simple, minimal, and educational so I chose to hard-code the Llama 2 architecture and just roll one inference file of pure C with no dependencies.