srush / llama2.rs

A fast llama2 decoder in pure Rust.
MIT License
995 stars, 54 forks

new export script #24

Closed: rachtsingh closed this 10 months ago

rachtsingh commented 10 months ago

The new export script produces the same output on TheBloke/Llama-2-13B-Guanaco-QLoRA-GPTQ (main branch), and it should fix the second error reported in https://github.com/srush/llama2.rs/issues/23.

The first error in that issue seems related to https://huggingface.co/blog/gptq-integration.