tairov / llama2.mojo

Inference Llama 2 in one file of pure 🔥
https://www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
MIT License

Better tokenizer and tinyllama-1.1B supported #30

magician-blue closed this 11 months ago

tairov commented 11 months ago

I like it!

The Pythagorean theorem can be expressed as:

c^2 = a^2 + b^2
[screenshot of the model's output]
tairov commented 11 months ago

@magician-blue please take a look at the latest commit https://github.com/tairov/llama2.mojo/pull/30/commits/6479fbe9f711afce63e59e88af9fc1952dce51fa

I've added an option -a for configuring the model architecture via the CLI. I think it makes sense, since we can already customize the tokenizer via -z.
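For example, a run could look like this (the model and tokenizer file names are illustrative placeholders, not actual repo artifacts):

    # file names below are placeholders
    mojo llama2.mojo tl-chat.bin -z tok_tl-chat.bin -a tinyllama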

Also, I tried to get rid of the ugly switches in the middle of fn transformer: I defined separate rope_rotation functions that are selected once the architecture is detected, before the main loop.

    let _transformer = (
        transformer[rope_rotation_tinyllama] if arch == 'tinyllama'
        else transformer[rope_rotation_llama2c]
    )

I think this will keep the code clean and simple, and will also add the flexibility needed to extend the functionality further by defining custom architectures.
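For reference, here's a minimal self-contained sketch of the compile-time dispatch pattern. The function names and bodies are placeholders rather than the PR's actual RoPE code, and it assumes the Mojo syntax of that era (which still had let):

    # Sketch only: placeholder functions standing in for the real RoPE variants.

    fn rope_rotation_llama2c(x: Float32) -> Float32:
        return x * 2.0  # placeholder for llama2.c-style rotation

    fn rope_rotation_tinyllama(x: Float32) -> Float32:
        return x + 1.0  # placeholder for tinyllama-style rotation

    # `rope` is a compile-time parameter, so each instantiation is a
    # specialized function with no runtime branch inside the hot path.
    fn transformer_step[rope: fn (Float32) -> Float32](x: Float32) -> Float32:
        return rope(x)

    fn main():
        let arch = "tinyllama"
        # Bind the specialization once, before the main loop.
        let step = (
            transformer_step[rope_rotation_tinyllama] if arch == "tinyllama"
            else transformer_step[rope_rotation_llama2c]
        )
        print(step(1.0))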

So far it looks really cool. We can add support for more models. Anyway, thank you for collaborating.

BTW, could you squash your commits into a single commit, so that the git history stays clean?
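For example, an interactive rebase onto the base branch would do it (branch names assumed here: master as the base, test as the PR branch):

    git rebase -i master          # mark all but the first commit as "squash"
    git push --force origin test  # rewrite the PR branch history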

Let's work out the remaining details in this PR, and I'm happy to merge once we've made sure it works fine out of the box.

tairov commented 11 months ago

Cleaned up the git history (apparently it contained the 50 MB stories15m model).

tairov commented 11 months ago

If you see conflicts, they can be resolved by resetting to your origin/test branch:

    git fetch origin
    git reset --hard origin/test   # note: this discards any local changes on the branch