RW cluster keeps failing due to StarRocks sink error

sheltonsuen commented 6 months ago

Describe the bug

We have set up a cluster with multiple compute nodes. Every time we try to start a StarRocks sink, the compute nodes keep failing one by one. Our current RW version is 1.8.

Every time this error happens, the entire cluster keeps failing and other tasks are affected.

Error message/log

2024-04-30T02:18:22.283474848Z ERROR risingwave_stream::task::stream_manager: actor exit with error actor_id=34285 error=Executor error: Sink error: Doris/Starrocks connect error: deadline has elapsed

Backtrace:

   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/anyhow-1.0.75/src/error.rs:551:25

   1: <T as core::convert::Into<U>>::into

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/convert/mod.rs:757:9

   2: risingwave_connector::sink::doris_starrocks_connector::InserterInner::wait_handle::{{closure}}

             at ./risingwave/src/connector/src/sink/doris_starrocks_connector.rs:313:73

   3: risingwave_connector::sink::doris_starrocks_connector::InserterInner::finish::{{closure}}

             at ./risingwave/src/connector/src/sink/doris_starrocks_connector.rs:323:28

   4: risingwave_connector::sink::starrocks::StarrocksClient::finish::{{closure}}

             at ./risingwave/src/connector/src/sink/starrocks.rs:610:40

   5: <risingwave_connector::sink::starrocks::StarrocksSinkWriter as risingwave_connector::sink::writer::SinkWriter>::barrier::{{closure}}

             at ./risingwave/src/connector/src/sink/starrocks.rs:481:29

   6: <core::pin::Pin<P> as core::future::future::Future>::poll

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/future/future.rs:124:9

   7: <risingwave_connector::sink::writer::LogSinkerOf<W> as risingwave_connector::sink::LogSinker>::consume_log_and_sink::{{closure}}

             at ./risingwave/src/connector/src/sink/writer.rs:198:51

   8: <core::pin::Pin<P> as core::future::future::Future>::poll

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/future/future.rs:124:9

   9: <F as futures_core::future::TryFuture>::try_poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/future.rs:82:9

  10: <futures_util::future::try_future::try_flatten::TryFlatten<Fut,<Fut as futures_core::future::TryFuture>::Ok> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/try_future/try_flatten.rs:57:41

  11: <futures_util::future::try_future::TryFlatten<Fut1,Fut2> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13

  12: <futures_util::future::try_future::AndThen<Fut1,Fut2,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13

  13: risingwave_stream::executor::sink::SinkExecutor<F>::execute_consume_log::{{closure}}

             at ./risingwave/src/stream/src/executor/sink.rs:421:14

  14: <futures_util::stream::once::Once<Fut> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/once.rs:46:33

  15: <futures_util::future::future::IntoStream<F> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:102:13

  16: futures_util::stream::select_with_strategy::poll_side

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/select_with_strategy.rs:219:27

  17: futures_util::stream::select_with_strategy::poll_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/select_with_strategy.rs:243:11

  18: <futures_util::stream::select_with_strategy::SelectWithStrategy<St1,St2,Clos,State> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/select_with_strategy.rs:270:17

  19: <futures_util::stream::select::Select<St1,St2> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/select.rs:115:9

  20: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  21: <futures_util::stream::stream::flatten::Flatten<St,<St as futures_core::stream::Stream>::Item> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/flatten.rs:50:44

  22: <futures_util::stream::stream::Flatten<St> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:102:13

  23: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  24: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  25: futures_util::stream::stream::StreamExt::poll_next_unpin

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9

  26: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9

  27: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.1.1/src/future.rs:124:23

  28: risingwave_stream::executor::wrapper::trace::instrument_await_tree::{{closure}}

             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:115:10

  29: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.9/src/lib.rs:506:33

  30: risingwave_stream::executor::wrapper::schema_check::schema_check::{{closure}}

             at ./risingwave/src/stream/src/executor/wrapper/schema_check.rs:24:1

  31: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.9/src/lib.rs:506:33

  32: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  33: futures_util::stream::stream::StreamExt::poll_next_unpin

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9

  34: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9

  35: risingwave_stream::executor::wrapper::epoch_check::epoch_check::{{closure}}

             at ./risingwave/src/stream/src/executor/wrapper/epoch_check.rs:31:44

  36: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.9/src/lib.rs:506:33

  37: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  38: <S as futures_core::stream::TryStream>::try_poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:196:9

  39: futures_util::stream::try_stream::TryStreamExt::try_poll_next_unpin

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/mod.rs:1131:9

  40: <futures_util::stream::try_stream::try_next::TryNext<St> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/try_next.rs:32:9

  41: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:347:31

  42: tokio::task::task_local::LocalKey<T>::scope_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:217:19

  43: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:343:19

  44: risingwave_common::util::epoch::task_local::scope::{{closure}}

             at ./risingwave/src/common/src/util/epoch.rs:244:47

  45: risingwave_stream::executor::wrapper::epoch_provide::epoch_provide::{{closure}}

             at ./risingwave/src/stream/src/executor/wrapper/epoch_provide.rs:31:59

  46: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.9/src/lib.rs:506:33

  47: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  48: futures_util::stream::stream::StreamExt::poll_next_unpin

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9

  49: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9

  50: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9

  51: risingwave_stream::executor::wrapper::trace::trace::{{closure}}

             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:48:69

  52: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.9/src/lib.rs:506:33

  53: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  54: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  55: <risingwave_stream::executor::dispatch::DispatchExecutor as risingwave_stream::executor::StreamConsumer>::execute::{{closure}}

             at ./risingwave/src/stream/src/executor/dispatch.rs:382:9

  56: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.9/src/lib.rs:506:33

  57: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9

  58: <&mut S as futures_core::stream::Stream>::poll_next

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:104:9

  59: <tokio_stream::stream_ext::next::Next<St> as core::future::future::Future>::poll

             at ./root/.cargo/git/checkouts/tokio-968c02b7a1a41bea/fe39bb8/tokio-stream/src/stream_ext/next.rs:42:9

  60: <tokio_stream::stream_ext::try_next::TryNext<St> as core::future::future::Future>::poll

             at ./root/.cargo/git/checkouts/tokio-968c02b7a1a41bea/fe39bb8/tokio-stream/src/stream_ext/try_next.rs:43:9

  61: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9

  62: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.1.1/src/future.rs:124:23

  63: risingwave_stream::executor::actor::Actor<C>::run_consumer::{{closure}}

             at ./risingwave/src/stream/src/executor/actor.rs:206:18

  64: <tokio::future::maybe_done::MaybeDone<Fut> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/future/maybe_done.rs:68:48

  65: risingwave_stream::executor::actor::Actor<C>::run::{{closure}}::{{closure}}::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/macros/join.rs:126:24

  66: <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/future/poll_fn.rs:58:9

  67: risingwave_stream::executor::actor::Actor<C>::run::{{closure}}::{{closure}}

             at ./risingwave/src/stream/src/executor/actor.rs:162:17

  68: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:347:31

  69: tokio::task::task_local::LocalKey<T>::scope_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:217:19

  70: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:343:19

  71: risingwave_expr::expr_context::expr_context_scope::{{closure}}

             at ./risingwave/src/expr/core/src/expr_context.rs:35:65

  72: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:347:31

  73: tokio::task::task_local::LocalKey<T>::scope_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:217:19

  74: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:343:19

  75: risingwave_stream::executor::actor::Actor<C>::run::{{closure}}

             at ./risingwave/src/stream/src/executor/actor.rs:170:10

  76: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/map.rs:55:37

  77: <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13

  78: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:347:31

  79: tokio::task::task_local::LocalKey<T>::scope_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:217:19

  80: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/task/task_local.rs:343:19

  81: await_tree::registry::TreeRoot::instrument::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.1.1/src/registry.rs:51:45

  82: <futures_util::future::either::Either<A,B> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/either.rs:109:32

  83: core::ops::function::FnOnce::call_once

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/ops/function.rs:250:5

  84: tokio_metrics::task::instrument_poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-metrics-0.3.0/src/task.rs:2530:15

  85: <tokio_metrics::task::Instrumented<T> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-metrics-0.3.0/src/task.rs:2430:9

  86: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9

  87: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:334:17

  88: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/loom/std/unsafe_cell.rs:16:9

  89: tokio::runtime::task::core::Core<T,S>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:323:13

  90: tokio::runtime::task::harness::poll_future::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:485:19

  91: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/panic/unwind_safe.rs:272:9

  92: std::panicking::try::do_call

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panicking.rs:552:40

  93: std::panicking::try

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panicking.rs:516:19

  94: std::panic::catch_unwind

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panic.rs:142:14

  95: tokio::runtime::task::harness::poll_future

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:473:18

  96: tokio::runtime::task::harness::Harness<T,S>::poll_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:208:27

  97: tokio::runtime::task::harness::Harness<T,S>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:153:15

  98: tokio::runtime::task::raw::RawTask::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/raw.rs:200:18

  99: tokio::runtime::task::LocalNotified<S>::run

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/mod.rs:400:9

 100: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:576:18

 101: tokio::runtime::coop::with_budget

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/coop.rs:107:5

 102: tokio::runtime::coop::budget

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/coop.rs:73:5

 103: tokio::runtime::scheduler::multi_thread::worker::Context::run_task

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:575:9

 104: tokio::runtime::scheduler::multi_thread::worker::Context::run

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:526:24

 105: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:491:21

 106: tokio::runtime::context::scoped::Scoped<T>::set

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/context/scoped.rs:40:9

 107: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:486:9

 108: tokio::runtime::context::runtime::enter_runtime

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/context/runtime.rs:65:16

 109: tokio::runtime::scheduler::multi_thread::worker::run

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:478:5

 110: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/scheduler/multi_thread/worker.rs:447:45

 111: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/task.rs:42:21

 112: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9

 113: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:334:17

 114: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/loom/std/unsafe_cell.rs:16:9

 115: tokio::runtime::task::core::Core<T,S>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/core.rs:323:13

 116: tokio::runtime::task::harness::poll_future::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:485:19

 117: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/panic/unwind_safe.rs:272:9

 118: std::panicking::try::do_call

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panicking.rs:552:40

 119: std::panicking::try

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panicking.rs:516:19

 120: std::panic::catch_unwind

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panic.rs:142:14

 121: tokio::runtime::task::harness::poll_future

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:473:18

 122: tokio::runtime::task::harness::Harness<T,S>::poll_inner

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:208:27

 123: tokio::runtime::task::harness::Harness<T,S>::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/harness.rs:153:15

 124: tokio::runtime::task::raw::RawTask::poll

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/raw.rs:200:18

 125: tokio::runtime::task::UnownedTask<S>::run

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/task/mod.rs:437:9

 126: tokio::runtime::blocking::pool::Task::run

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/pool.rs:159:9

 127: tokio::runtime::blocking::pool::Inner::run

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/pool.rs:513:17

 128: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}

             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.32.0/src/runtime/blocking/pool.rs:471:13

 129: std::sys_common::backtrace::__rust_begin_short_backtrace

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/sys_common/backtrace.rs:155:18

 130: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/thread/mod.rs:529:17

 131: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/panic/unwind_safe.rs:272:9

 132: std::panicking::try::do_call

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panicking.rs:552:40

 133: std::panicking::try

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panicking.rs:516:19

 134: std::panic::catch_unwind

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/panic.rs:142:14

 135: std::thread::Builder::spawn_unchecked_::{{closure}}

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/thread/mod.rs:528:30

 136: core::ops::function::FnOnce::call_once{{vtable.shim}}

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/core/src/ops/function.rs:250:5

 137: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/alloc/src/boxed.rs:2015:9

 138: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/alloc/src/boxed.rs:2015:9

 139: std::sys::unix::thread::Thread::new::thread_start

             at ./rustc/e4c626dd9a17a23270bf8e7158e59cf2b9c04840/library/std/src/sys/unix/thread.rs:108:17

 140: start_thread

             at ./nptl/pthread_create.c:442:8

 141: __GI___clone

             at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

Helm charts

The version of RisingWave

1.8

Additional context

No response

Pandas886 commented 6 months ago

The same error happened for me.

xxhZs commented 6 months ago

What is the version of SR, and what are the specific SQL statements?

sheltonsuen commented 6 months ago

RisingWave: 1.8. SR version: 3.1, with the shared-data architecture.

Pandas886 commented 6 months ago

It may be caused by getting the internal host from the SR front end when accessing the SR back end.

sheltonsuen commented 6 months ago

> It may be caused by getting the internal host from the SR front end when accessing the SR back end.

What does that mean? I don't quite understand.

xxhZs commented 6 months ago

It is indeed caused by accessing the StarRocks HTTP port. Can you please provide the specific host and port of SR's FE and CN nodes?

sheltonsuen commented 6 months ago

Do you mean this?

host:
fe: starrocks-fe-service.dp-prod
cn: starrocks-cn-service.dp-prod

xxhZs commented 6 months ago

> you mean this?
>
> host: fe: starrocks-fe-service.dp-prod cn: starrocks-cn-service.dp-prod

OK, thanks. Can you test the direct network connectivity between the RW CN node and the SR CN node?

sheltonsuen commented 6 months ago

> > you mean this? host: fe: starrocks-fe-service.dp-prod cn: starrocks-cn-service.dp-prod
>
> okk, thanks, can you test the connectivity between the rw cn node and the sr cn node direct network?

Connectivity is good, and the SR sink actually works fine in the dev environment; the error only occurs in prod. Maybe it's a local environment problem. I will test it in prod later this weekend.

sheltonsuen commented 6 months ago

I have tested this case in both the dev and prod environments and still get this kind of error. It seems that writing to SR is too slow, and then the RW CN nodes keep failing. Sometimes a sink to MySQL also causes this kind of error, and it recovers from the failure automatically.

sheltonsuen commented 6 months ago

e2.log e1.log e0.log

sheltonsuen commented 4 months ago

This can be resolved by adding the sink_decouple parameter.
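
For anyone landing here later, a minimal sketch of what that might look like, assuming `sink_decouple` is set as a session variable before the sink is created. The sink name, source name, credentials, database, and table below are hypothetical placeholders. The FE host is the one mentioned earlier in this thread, and the ports are StarRocks' usual defaults (9030 for the query port, 8030 for the FE HTTP port), which may differ in your deployment.

```sql
-- Assumption: sink decoupling is enabled per session, before CREATE SINK.
SET sink_decouple = true;

-- Hypothetical names and credentials; replace with your own values.
CREATE SINK my_starrocks_sink
FROM my_materialized_view WITH (
    connector = 'starrocks',
    type = 'append-only',
    force_append_only = 'true',
    starrocks.host = 'starrocks-fe-service.dp-prod',  -- FE host from this thread
    starrocks.mysqlport = '9030',                     -- assumed default query port
    starrocks.httpport = '8030',                      -- assumed default FE HTTP port
    starrocks.user = 'my_user',
    starrocks.password = 'my_password',
    starrocks.database = 'my_db',
    starrocks.table = 'my_table'
);
```

With decoupling enabled, a slow or temporarily unreachable StarRocks backend should no longer block the streaming job directly, which would be consistent with the "deadline has elapsed" failures going away. An existing sink would likely need to be dropped and recreated for the setting to apply.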
