tairov / llama2.mojo

Inference Llama 2 in one file of pure 🔥
https://www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
MIT License
2.09k stars 140 forks

vectorize llama methods #15

Closed tairov closed 1 year ago

tairov commented 1 year ago

First attempt to migrate all possible loops to SIMD vectorization. It shows a 10% performance boost, but some parts seem buggy, since it's printing wrong tokens.
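For readers unfamiliar with what migrating a loop to SIMD involves, here is a minimal sketch in plain Python that only models the idea. It is not the code from this PR and it does not use Mojo's API. The names `SIMD_WIDTH`, `dot_scalar`, and `dot_chunked` are hypothetical, and the lane count of 4 is an arbitrary assumption. The chunked version processes a fixed number of elements per step and then finishes the leftover elements one at a time. Forgetting that tail is one common way a vectorized loop can produce wrong output, though the thread does not say what the actual bug was.

```python
# Hypothetical sketch: models SIMD chunking in plain Python; not the PR's Mojo code.
SIMD_WIDTH = 4  # assumed lane count, chosen only for illustration


def dot_scalar(a, b):
    """Reference version: one multiply-add per iteration."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_chunked(a, b, width=SIMD_WIDTH):
    """Process `width` elements per step, then finish the remainder one by one."""
    n = len(a)
    main_end = n - (n % width)  # end of the region covered by full-width chunks
    lanes = [0.0] * width       # one partial sum per lane
    for i in range(0, main_end, width):
        for lane in range(width):  # stands in for a single SIMD multiply-add
            lanes[lane] += a[i + lane] * b[i + lane]
    total = sum(lanes)             # horizontal reduction of the lane sums
    for i in range(main_end, n):   # scalar tail: skipping this drops elements
        total += a[i] * b[i]
    return total
```

Because the chunked version adds the products in a different order, floating-point results can differ slightly from the scalar loop on real data, so comparing the two with a small tolerance is safer than exact equality.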

tairov commented 1 year ago

I think it's fixed now.