evilsocket / cake

Distributed LLM and StableDiffusion inference for mobile, desktop and server.

Second request fails with an error #19

Closed JKYtydt closed 3 months ago

JKYtydt commented 3 months ago

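Hello, the first request returns output normally, but the second request fails with an error, and the master node's service also terminates.

Worker node command: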

CUDA_VISIBLE_DEVICES=3 ./cake-cli --model /sdc/pre_trained_model/Llama3-Chinese-8B-Instruct --mode worker --name worker0 --topology /sdc/jky/cake/topology.yml --address 0.0.0.0:10128

Master node command:

CUDA_VISIBLE_DEVICES=3,4,5,6,7 ./cake-cli --model /home/pre_trained_model/Llama3-Chinese-8B-Instruct --api 0.0.0.0:8080 --topology /home/jky/cake/topology.yml

The error is as follows:

thread 'tokio-runtime-worker' panicked at /sdc/jky/cake/cake-core/src/cake/worker.rs:215:26:
called `Result::unwrap()` on an `Err` value: cannot broadcast [29, 29] to [1, 32, 29, 170]
   0: candle_core::error::Error::bt
   1: candle_core::layout::Layout::broadcast_as
   2: candle_core::tensor::Tensor::broadcast_as
   3: cake_core::models::llama3::cache::Cache::apply_attention_mask
   4: cake_core::models::llama3::attention::CausalSelfAttention::forward
   5: <cake_core::models::llama3::transformer::Transformer as cake_core::cake::Forwarder>::forward::{{closure}}
   6: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   7: tokio::runtime::task::core::Core<T,S>::poll
   8: tokio::runtime::task::harness::Harness<T,S>::poll
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  10: tokio::runtime::scheduler::multi_thread::worker::Context::run
  11: tokio::runtime::context::set_scheduler
  12: tokio::runtime::context::runtime::enter_runtime
  13: tokio::runtime::scheduler::multi_thread::worker::run
  14: tokio::runtime::task::core::Core<T,S>::poll
  15: tokio::runtime::task::harness::Harness<T,S>::poll
  16: tokio::runtime::blocking::pool::Inner::run
  17: std::sys_common::backtrace::__rust_begin_short_backtrace
  18: core::ops::function::FnOnce::call_once{{vtable.shim}}
  19: std::sys::pal::unix::thread::Thread::new::thread_start
  20: <unknown>
  21: <unknown>

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: <cake_core::models::llama3::transformer::Transformer as cake_core::cake::Forwarder>::forward::{{closure}}
   2: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   3: tokio::runtime::task::core::Core<T,S>::poll
   4: tokio::runtime::task::harness::Harness<T,S>::poll
   5: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   6: tokio::runtime::scheduler::multi_thread::worker::Context::run
   7: tokio::runtime::context::set_scheduler
   8: tokio::runtime::context::runtime::enter_runtime
   9: tokio::runtime::scheduler::multi_thread::worker::run
  10: tokio::runtime::task::core::Core<T,S>::poll
  11: tokio::runtime::task::harness::Harness<T,S>::poll
  12: tokio::runtime::blocking::pool::Inner::run
  13: std::sys_common::backtrace::__rust_begin_short_backtrace
  14: core::ops::function::FnOnce::call_once{{vtable.shim}}
  15: std::sys::pal::unix::thread::Thread::new::thread_start
  16: <unknown>
  17: <unknown>
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   4: tokio::runtime::task::core::Core<T,S>::poll
   5: tokio::runtime::task::harness::Harness<T,S>::poll
   6: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   7: tokio::runtime::scheduler::multi_thread::worker::Context::run
   8: tokio::runtime::context::set_scheduler
   9: tokio::runtime::context::runtime::enter_runtime
  10: tokio::runtime::scheduler::multi_thread::worker::run
  11: tokio::runtime::task::core::Core<T,S>::poll
  12: tokio::runtime::task::harness::Harness<T,S>::poll
  13: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

evilsocket commented 3 months ago

duplicate of https://github.com/evilsocket/cake/issues/13

angive commented 3 months ago

Hello, I'm also encountering the same issue. The second request causes both the worker and master to crash.

evilsocket commented 3 months ago

@angive https://github.com/evilsocket/cake/issues/13
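
For readers landing here from a search: the panic message in the original report says that a `[29, 29]` attention mask cannot be broadcast to `[1, 32, 29, 170]`. The sketch below is not code from cake or candle. It is a standalone, hypothetical helper (`can_broadcast`) that applies the usual right-aligned broadcasting rule to the two shapes from the log, to show why they are incompatible. Reading the last dimension (170) as a key length that still includes positions cached from the first request is an assumption that fits the symptom (first request works, second fails); it is not confirmed in this thread, so see issue #13 for the actual diagnosis.

```rust
/// Hypothetical helper, not part of cake or candle.
/// Returns true if a tensor of shape `src` can be broadcast to shape `dst`
/// under the usual right-aligned rule: each trailing dimension of `src`
/// must either equal the matching dimension of `dst` or be 1.
fn can_broadcast(src: &[usize], dst: &[usize]) -> bool {
    if src.len() > dst.len() {
        return false;
    }
    src.iter()
        .rev()
        .zip(dst.iter().rev())
        .all(|(&s, &d)| s == d || s == 1)
}

fn main() {
    // Shapes copied from the panic message in the report.
    let mask = [29, 29];
    let scores = [1, 32, 29, 170];
    // The last dimensions differ (29 vs 170) and neither is 1.
    println!("mask -> scores: {}", can_broadcast(&mask, &scores));

    // Assumption for illustration: with no stale positions in the cache,
    // the key length would equal the query length.
    let scores_fresh = [1, 32, 29, 29];
    println!("mask -> fresh scores: {}", can_broadcast(&mask, &scores_fresh));
}
```

Under this rule the first check fails and the second succeeds, which matches a mask built for 29 positions being applied to attention scores whose key dimension is longer than the current prompt.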