srush / llama2.rs

A fast llama2 decoder in pure Rust.
MIT License

Exported Models do not load #44

Closed AtlasPilotPuppy closed 10 months ago

AtlasPilotPuppy commented 10 months ago

I exported the following model using the export script:

gsaivinay/Llama-2-7b-Chat-GPTQ (main branch)

It is a 7B model with a group size of 128.

I compiled the Rust binary with the appropriate feature flags:

cargo build --release --features 7B,group_128,python

and then ran

pip install .

I get the following error when trying to load the model:

In [2]:  model = llama2_rs.LlamaModel("export.bin", False)
thread '<unnamed>' panicked at src/lib.rs:97:13:
assertion `left == right` failed
  left: 4520919044
 right: 26954711044
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 model = llama2_rs.LlamaModel("export.bin", False)

PanicException: assertion `left == right` failed
  left: 4520919044
 right: 26954711044

I have tried this with a few different models, group sizes, and parameter sizes, and I still get the same error. Any guidance on debugging this would be much appreciated.

srush commented 10 months ago

I apologize. You now need an additional "quantized" argument, and then it should work. The README has been updated.
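For anyone hitting the same panic, the updated call would presumably look something like this. This is a sketch only: the position and boolean type of the new "quantized" argument are assumptions based on this thread, not checked against the current API.

```python
import llama2_rs  # requires the extension built via `pip install .` as above

# Hypothetical call shape: the extra "quantized" flag is assumed to be a
# boolean passed alongside the original second argument; the exact order
# is a guess, so check the updated README.
model = llama2_rs.LlamaModel("export.bin", True, False)
```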

However, for some reason that particular repo is producing bad output for me. I need to check why.

srush commented 10 months ago

Okay, fixed it. It needed version 0.3 of auto-gptq.