edgenai / llama_cpp-rs

High-level, optionally asynchronous Rust bindings to llama.cpp
Apache License 2.0

Getting "called `Result::unwrap()` on an `Err` value: DecodeFailed(1)" while building a chat example. #76

Open nayan9800 opened 4 months ago

nayan9800 commented 4 months ago

Hi there, I am trying to build a simple chat example with the llama_cpp-rs crate. Here is my code:


use std::io;
use std::io::Write;

use llama_cpp::standard_sampler::StandardSampler;
use llama_cpp::LlamaModel;
use llama_cpp::LlamaParams;
use llama_cpp::LlamaSession;
use llama_cpp::SessionParams;

fn main() {
    let model = LlamaModel::load_from_file(
        "data/Meta-Llama-3-8B-Instruct.Q5_K_S.gguf",
        LlamaParams::default(),
    )
    .expect("unable to load model");

    let mut ctx = model
        .create_session(SessionParams::default())
        .expect("failed to create session");

    loop {
        println!("YOU:");

        let mut promt = String::new();
        io::stdin().read_line(&mut promt).unwrap();

        println!("BOT:");
        chat(&mut ctx, promt);
        println!("");
    }
}

fn chat(session: &mut LlamaSession, promt: String) {
    session.advance_context(promt).unwrap();

    let max_tokens = 2048;
    let mut decoded_tokens = 0;

    // `ctx.start_completing_with` creates a worker thread that generates tokens. When the completion
    // handle is dropped, tokens stop generating!
    let completions = session.start_completing_with(StandardSampler::default(), 1024).into_strings();

    for completion in completions {
        print!("{completion}");
        let _ = io::stdout().flush();

        decoded_tokens += 1;

        if decoded_tokens > max_tokens {
            return;
        }
    }
}

But after some time, when I enter the next prompt, I get the error below.

thread 'main' panicked at src/main.rs:44:36:
called `Result::unwrap()` on an `Err` value: DecodeFailed(1)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

It would be helpful if someone could explain why I am getting this error.

As for the model, I am using Meta-Llama-3-8B-Instruct-GGUF (Meta-Llama-3-8B-Instruct.Q5_K_S.gguf) from the following link: https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/tree/main
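
In the code posted above, the only call that could plausibly yield `DecodeFailed` is `session.advance_context(promt).unwrap()`, so the panic most likely comes from that `unwrap()`. A minimal sketch of reporting the error instead of panicking follows. To keep it compilable on its own it does not use the crate: `ContextError` and `advance_context` here are hypothetical stand-ins, and the word-count "token" estimate is an assumption made purely for illustration.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's error type; only the variant
/// seen in the panic message is modelled here.
#[derive(Debug, PartialEq)]
enum ContextError {
    DecodeFailed(i32),
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ContextError::DecodeFailed(code) => write!(f, "decode failed with code {code}"),
        }
    }
}

/// Hypothetical stand-in for `session.advance_context(..)`.
/// It fails once the (made-up) capacity would be exceeded.
fn advance_context(used: &mut usize, capacity: usize, prompt: &str) -> Result<(), ContextError> {
    // Crude token estimate: one token per whitespace-separated word (assumption).
    let incoming = prompt.split_whitespace().count();
    if *used + incoming > capacity {
        return Err(ContextError::DecodeFailed(1));
    }
    *used += incoming;
    Ok(())
}

/// One chat turn that propagates the error with `?` instead of calling `unwrap()`.
fn chat_turn(used: &mut usize, capacity: usize, prompt: &str) -> Result<(), ContextError> {
    advance_context(used, capacity, prompt)?;
    // Token generation would happen here in the real program.
    Ok(())
}

fn main() {
    let capacity = 8;
    let mut used = 0;
    for prompt in ["hello there", "tell me a long story please", "one more question"] {
        match chat_turn(&mut used, capacity, prompt) {
            Ok(()) => println!("ok, {used} of {capacity} slots used"),
            // The loop keeps running, so the user can still be told what went wrong.
            Err(e) => eprintln!("could not advance context: {e}"),
        }
    }
}
```

With this shape the chat loop survives a failed turn, and the error value itself can be printed or logged for the maintainers.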

pedro-devv commented 4 months ago

Sorry for the delay in replying. Whatever is failing is on the C++ side of things. Could you please enable tracing and post the output?

richard-6 commented 2 months ago

Hi! I am experiencing the same issue trying to build a chat example, where one session is used for the entirety of the chat. Currently using version 0.3.2 of the llama_cpp crate, with the phi-2.Q5_K_M model that I downloaded here: https://huggingface.co/TheBloke/phi-2-GGUF

I have included the full backtrace below. Please let me know if there is anything else I can share to help with debugging this. Thanks.

Full backtrace:


called `Result::unwrap()` on an `Err` value: DecodeFailed(1)
stack backtrace:
   0:        0x102a73218 - std::backtrace_rs::backtrace::libunwind::trace::h2966c6fbfac9d426
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
   1:        0x102a73218 - std::backtrace_rs::backtrace::trace_unsynchronized::h8a5f4aefe890b7c5
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:        0x102a73218 - std::sys_common::backtrace::_print_fmt::h7574dd98fd39c257
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:67:5
   3:        0x102a73218 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h298c9ab285ff3934
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:44:22
   4:        0x102a8e2d4 - core::fmt::rt::Argument::fmt::hf9661447f7b99899
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/fmt/rt.rs:142:9
   5:        0x102a8e2d4 - core::fmt::write::h4e276abdb6d0c2a1
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/fmt/mod.rs:1120:17
   6:        0x102a70da0 - std::io::Write::write_fmt::hd421848f5f0bf9d0
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/io/mod.rs:1762:15
   7:        0x102a73054 - std::sys_common::backtrace::_print::h09e653c6686dbd70
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:47:5
   8:        0x102a73054 - std::sys_common::backtrace::print::hd8bd9ecab1f94b94
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:34:9
   9:        0x102a74560 - std::panicking::default_hook::{{closure}}::h520eeb743fc98fb4
  10:        0x102a742a8 - std::panicking::default_hook::ha6550ffe49b63df1
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:292:9
  11:        0x102a74988 - std::panicking::rust_panic_with_hook::hddb0e884a202de7c
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:779:13
  12:        0x102a74888 - std::panicking::begin_panic_handler::{{closure}}::hd2798398a2fd9077
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:657:13
  13:        0x102a73680 - std::sys_common::backtrace::__rust_end_short_backtrace::h9201cc364dbb8a23
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys_common/backtrace.rs:170:18
  14:        0x102a74624 - rust_begin_unwind
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
  15:        0x102ab041c - core::panicking::panic_fmt::h4d5168028d4c43c7
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
  16:        0x102ab0784 - core::result::unwrap_failed::hc60ef978ea39e1b4
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/result.rs:1653:5
  17:        0x1028d4fb4 - yellm::chat::hf80ece9a803b0b1d
  18:        0x1028d55f8 - yellm::main::ha5c23690e413c5ee
  19:        0x1028d12e4 - std::sys_common::backtrace::__rust_begin_short_backtrace::he2c0d87ee6179d52
  20:        0x1028e3cd4 - std::rt::lang_start::{{closure}}::h1f492a4d96b5b888
  21:        0x102a6ce20 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h1a7c0e059d971da5
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:284:13
  22:        0x102a6ce20 - std::panicking::try::do_call::h07a34a23e615022b
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  23:        0x102a6ce20 - std::panicking::try::h1111644420b4cc09
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  24:        0x102a6ce20 - std::panic::catch_unwind::h31a3b9d6e2ef9973
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  25:        0x102a6ce20 - std::rt::lang_start_internal::{{closure}}::h63c3452500a36531
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:48
  26:        0x102a6ce20 - std::panicking::try::do_call::h9c5c8a2a0a297bb7
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:552:40
  27:        0x102a6ce20 - std::panicking::try::h424cfcafca1bde97
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:516:19
  28:        0x102a6ce20 - std::panic::catch_unwind::h345d3d448041017f
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panic.rs:142:14
  29:        0x102a6ce20 - std::rt::lang_start_internal::h5b246d44f1526226
                               at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/rt.rs:148:20
  30:        0x1028e3cac - std::rt::lang_start::hf629135da24ee9a5
  31:        0x1028d5c60 - _main
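
Both reports above reuse a single session for the whole chat. One possible explanation, not confirmed by the maintainers in this thread, is that the session's context fills up over successive turns: llama.cpp's `llama_decode` documents a return value of 1 as "could not find a KV slot for the batch", and `DecodeFailed(1)` appears to carry that code. The sketch below only models the bookkeeping; `ContextBudget`, the context size of 4096, and the per-turn token counts are all hypothetical and are not taken from the crate or from either report.

```rust
/// Hypothetical bookkeeping for a chat that reuses a single session.
/// `n_ctx` stands in for the session's context size (an assumed value).
struct ContextBudget {
    n_ctx: usize,
    used: usize,
}

impl ContextBudget {
    fn new(n_ctx: usize) -> Self {
        Self { n_ctx, used: 0 }
    }

    /// True if the prompt plus up to `max_new_tokens` of output still fit.
    fn fits(&self, prompt_tokens: usize, max_new_tokens: usize) -> bool {
        self.used + prompt_tokens + max_new_tokens <= self.n_ctx
    }

    /// Record the tokens consumed by one turn (prompt plus generated output).
    fn record(&mut self, tokens: usize) {
        self.used += tokens;
    }

    /// Models starting over with a fresh session.
    fn reset(&mut self) {
        self.used = 0;
    }
}

fn main() {
    let max_new_tokens = 1024;
    let mut budget = ContextBudget::new(4096);

    // (prompt tokens, generated tokens) per turn; made-up numbers.
    let turns = [(50, 900), (40, 1024), (60, 1024), (30, 1024)];

    for (i, (prompt, generated)) in turns.iter().enumerate() {
        if !budget.fits(*prompt, max_new_tokens) {
            println!(
                "turn {i}: context nearly full ({} used), starting a fresh session",
                budget.used
            );
            budget.reset();
        }
        budget.record(prompt + generated);
        println!("turn {i}: {} of {} tokens used", budget.used, budget.n_ctx);
    }
}
```

If the tracing output requested earlier shows the KV-cache message, checking the remaining budget before each turn (and creating a new session, or a larger one, when it runs out) would be one way to avoid the failing decode.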