srush / llama2.rs

A fast llama2 decoder in pure Rust.
MIT License

Fix export problem when using newer version of auto_gptq #18

Closed guoqingbao closed 1 year ago

guoqingbao commented 1 year ago

As mentioned in #17, there is a problem exporting the PyTorch weights to the bin file when using a newer version of auto_gptq; here is a solution that fixes it.

jgreene commented 1 year ago

This fixed the export problem I ran into but then I end up getting an error when running the model.

target/release/llama2_rs llama2-70b-q.bin 0.0 11 "The only thing"
Configuration: Config { dim: 4096, hidden_dim: 11008, n_layers: 32, n_heads: 32, n_kv_heads: 32, vocab_size: 32000, seq_len: 2048, shared_weight: false }
thread 'main' panicked at src/main.rs:288:9:
assertion failed: `(left == right)`
  left: `4096`,
 right: `5120`
stack backtrace:
   0: rust_begin_unwind
             at /rustc/1b198b3a196442e14fb06978166ab46a4618d131/library/std/src/panicking.rs:617:5
   1: core::panicking::panic_fmt
             at /rustc/1b198b3a196442e14fb06978166ab46a4618d131/library/core/src/panicking.rs:67:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
             at /rustc/1b198b3a196442e14fb06978166ab46a4618d131/library/core/src/panicking.rs:229:5
   4: llama2_rs::Config::check_static
   5: llama2_rs::Config::load
             at ./src/main.rs:312:9
   6: llama2_rs::main
             at ./src/main.rs:692:18
   7: core::ops::function::FnOnce::call_once
             at /rustc/1b198b3a196442e14fb06978166ab46a4618d131/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

which appears to be the following assert failing at line 692:

let mmap = unsafe { MmapOptions::new().offset(start).map(&file).unwrap() };
assert_eq!(mmap.len(), mem::size_of::<TWeights>());
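
For context, a size mismatch like this surfaces because the weight layout's dimensions are baked into the binary at compile time, so the mapped file must be byte-for-byte the size of the weights struct. A minimal sketch of why a build compiled for dim = 4096 cannot map a checkpoint written with dim = 5120 (the `Weights` struct here is a hypothetical stand-in, not the real `TWeights` layout):

```rust
use std::mem;

// Hypothetical, simplified stand-in for TWeights: in a setup like
// llama2.rs the model dimensions are compile-time constants, so the
// struct's byte size is fixed when the binary is built.
#[repr(C)]
struct Weights<const DIM: usize> {
    // One embedding row; its width depends on the model dimension.
    embedding_row: [f32; DIM],
}

fn main() {
    // A binary built for dim = 4096 expects a smaller mapping than a
    // checkpoint written with dim = 5120, so an exact size check fails.
    assert_eq!(
        mem::size_of::<Weights<4096>>(),
        4096 * mem::size_of::<f32>()
    );
    assert_ne!(
        mem::size_of::<Weights<4096>>(),
        mem::size_of::<Weights<5120>>()
    );
}
```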
srush commented 1 year ago

@jgreene if you are running a 7b model you need to change the .cargo/config before compiling. I will make that more clear in the instructions.
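
A sketch of how a compile-time model-size switch like this can work, and why the runtime check then panics on a mismatched checkpoint. The `model-70b` feature name and the way it is wired are illustrative assumptions; in llama2.rs the actual switch lives in `.cargo/config` and its names may differ:

```rust
// Hypothetical sketch of compile-time model selection. The real project
// routes this through .cargo/config; the feature name below is invented
// for illustration.
#[cfg(feature = "model-70b")]
const DIM: usize = 8192;
#[cfg(not(feature = "model-70b"))]
const DIM: usize = 4096;

fn check_static(config_dim: usize) {
    // Mirrors the failing assert in the backtrace: the dim read from the
    // checkpoint header must equal the dim the binary was compiled for.
    assert_eq!(config_dim, DIM, "recompile with the matching model size");
}

fn main() {
    // Passes for the default (non-70b) build; a 5120-dim checkpoint
    // would panic here, just like in the report above.
    check_static(4096);
}
```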

jgreene commented 1 year ago

> @jgreene if you are running a 7b model you need to change the .cargo/config before compiling. I will make that more clear in the instructions.

Awesome, that fixed it! Although I think export.py is downloading the 7b model rather than the 70b model that the readme talks about.