Open MackNcD opened 1 year ago
Imagine 250x speed on the original...
As you've probably noticed, "the original" here means llama2.py, not llama2.c. I'm interested in Mojo, though the SDK alone requires a minimum of 8 GiB of RAM.
The C version is faster when using multi-threading :fire:
https://github.com/tairov/llama2.mojo