Hey Sasha,
I was trying out llama2.rs and wanted to swap between the 7B/13B versions on the fly, and I think using conditional compilation here makes it a bit easier.
I also spent some time trying to optimize the SIMD code, but it's already really fast! I couldn't find any easy wins.
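
For anyone skimming, here is a rough sketch of the conditional compilation idea mentioned above. It is not the actual diff: the feature name `model_13b`, the `model` module, and the `head_size`/`describe` helpers are hypothetical names picked for illustration. The dimensions are the published Llama 2 7B/13B shapes, and the 7B config is assumed to be the default when no feature is set.

```rust
// Hypothetical sketch: pick the model shape at compile time with a Cargo feature.
// `model_13b` is an assumed feature name, not necessarily what llama2.rs uses.

#[cfg(feature = "model_13b")]
mod model {
    // Llama 2 13B shape
    pub const NAME: &str = "13B";
    pub const DIM: usize = 5120;
    pub const N_LAYERS: usize = 40;
    pub const N_HEADS: usize = 40;
}

#[cfg(not(feature = "model_13b"))]
mod model {
    // Llama 2 7B shape (assumed default when the feature is off)
    pub const NAME: &str = "7B";
    pub const DIM: usize = 4096;
    pub const N_LAYERS: usize = 32;
    pub const N_HEADS: usize = 32;
}

/// Per-head dimension, known at compile time because it comes from constants.
pub const fn head_size() -> usize {
    model::DIM / model::N_HEADS
}

/// Human-readable summary of whichever config was compiled in.
pub fn describe() -> String {
    format!(
        "Llama 2 {}: dim={}, layers={}, heads={}, head_size={}",
        model::NAME,
        model::DIM,
        model::N_LAYERS,
        model::N_HEADS,
        head_size()
    )
}
```

Assuming a `model_13b = []` entry under `[features]` in Cargo.toml, swapping models would then just be `cargo build --release` versus `cargo build --release --features model_13b`, and the sizes stay compile-time constants either way.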