srush / llama2.rs

A fast llama2 decoder in pure Rust.
MIT License
995 stars, 54 forks

Made Python support optional #30

Closed by rachtsingh 10 months ago

rachtsingh commented 10 months ago

This wraps most of the Python scaffolding behind a Cargo feature named `python`, so you can compile the binary without pyo3 at all. The default `pip install .` and `maturin build` commands enable the `python` feature, so I don't think the README needs any changes.
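For readers unfamiliar with the pattern, here is a minimal sketch of gating pyo3 code behind a Cargo feature. It is not the code from this PR: `greedy_pick`, `python_enabled`, and the `python_bindings` module are hypothetical names, and the manifest lines in the comment are an assumed layout with the pyo3 version left for you to fill in.

```rust
// Assumed Cargo.toml layout (illustrative, not copied from this repo):
//
//   [dependencies]
//   pyo3 = { version = "<your pyo3 version>", optional = true }
//
//   [features]
//   python = ["dep:pyo3"]

/// Core logic that is always compiled, with or without Python support.
/// Hypothetical helper: returns the index of the largest logit, or None
/// for an empty slice.
pub fn greedy_pick(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(index, _)| index)
}

/// Reports at compile time whether the `python` feature was enabled.
pub fn python_enabled() -> bool {
    cfg!(feature = "python")
}

/// Everything that touches pyo3 lives in a module that only exists when
/// the `python` feature is on, so a plain build never links pyo3.
#[cfg(feature = "python")]
mod python_bindings {
    use pyo3::prelude::*;

    /// Python-facing wrapper around the core function (hypothetical).
    #[pyfunction]
    fn greedy_pick_py(logits: Vec<f32>) -> Option<usize> {
        super::greedy_pick(&logits)
    }
}
```

With a layout like this, `cargo build --release` would skip pyo3 entirely, while `cargo build --release --features python` would compile the bindings. Registering the wrapper in a `#[pymodule]` is omitted here because its signature differs between pyo3 versions.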

Shrinks the binary from 38MB to 31MB on my machine.

srush commented 10 months ago

Nice. I'll do the same for CUDA support.