MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com

0dt: thread 'coordinator' panicked at src/adapter/src/coord.rs:2835:13: `write_frontier` unexpectedly greater than `max_as_of` #29091

Open def- opened 1 month ago

def- commented 1 month ago

What version of Materialize are you using?

v0.113.2 (e9f4035263)

What is the issue?

Seen in the nightly "Checks 0dt restart of the entire Mz" run:

thread 'coordinator' panicked at src/adapter/src/coord.rs:2835:13:
`write_frontier` unexpectedly greater than `max_as_of`
stack backtrace:
   0: rust_begin_unwind
             at ./rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at ./rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/panicking.rs:72:14
   2: <mz_adapter::coord::Coordinator>::bootstrap_dataflow_as_of
             at ./var/lib/buildkite-agent/builds/buildkite-builders-aarch64-585fc7f-i-0014b150bea6cf180-1/materialize/test/src/adapter/src/coord.rs:2835:13
   3: <mz_adapter::coord::Coordinator>::bootstrap::{closure#0}::{closure#0}
             at ./var/lib/buildkite-agent/builds/buildkite-builders-aarch64-585fc7f-i-0014b150bea6cf180-1/materialize/test/src/adapter/src/coord.rs:2038:47
   4: <tracing::instrument::Instrumented<<mz_adapter::coord::Coordinator>::bootstrap::{closure#0}::{closure#0}> as core::future::future::Future>::poll
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/instrument.rs:272:9
   5: <mz_adapter::coord::Coordinator>::bootstrap::{closure#0}
             at ./var/lib/buildkite-agent/builds/buildkite-builders-aarch64-585fc7f-i-0014b150bea6cf180-1/materialize/test/src/adapter/src/coord.rs:1740:5
   6: mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}
             at ./var/lib/buildkite-agent/builds/buildkite-builders-aarch64-585fc7f-i-0014b150bea6cf180-1/materialize/test/src/adapter/src/coord.rs:3910:26
   7: <tracing::instrument::Instrumented<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}> as core::future::future::Future>::poll
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/instrument.rs:272:9
   8: <tokio::runtime::park::CachedParkThread>::block_on::<tracing::instrument::Instrumented<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>>::{closure#0}
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/park.rs:281:63
   9: tokio::runtime::coop::with_budget::<core::task::poll::Poll<core::result::Result<(), mz_adapter::error::AdapterError>>, <tokio::runtime::park::CachedParkThread>::block_on<tracing::instrument::Instrumented<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>>::{closure#0}>
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:107:5
  10: tokio::runtime::coop::budget::<core::task::poll::Poll<core::result::Result<(), mz_adapter::error::AdapterError>>, <tokio::runtime::park::CachedParkThread>::block_on<tracing::instrument::Instrumented<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>>::{closure#0}>
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:73:5
  11: <tokio::runtime::park::CachedParkThread>::block_on::<tracing::instrument::Instrumented<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>>
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/park.rs:281:31
  12: <tokio::runtime::context::blocking::BlockingRegionGuard>::block_on::<tracing::instrument::Instrumented<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>>
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/blocking.rs:66:9
  13: <tokio::runtime::handle::Handle>::block_on::<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>::{closure#0}
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/handle.rs:310:22
  14: tokio::runtime::context::runtime::enter_runtime::<<tokio::runtime::handle::Handle>::block_on<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>::{closure#0}, core::result::Result<(), mz_adapter::error::AdapterError>>
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/runtime.rs:65:16
  15: <tokio::runtime::handle::Handle>::block_on::<mz_adapter::coord::serve::{closure#0}::{closure#2}::{closure#0}>
             at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/handle.rs:309:9
  16: mz_adapter::coord::serve::{closure#0}::{closure#2}
             at ./var/lib/buildkite-agent/builds/buildkite-builders-aarch64-585fc7f-i-0014b150bea6cf180-1/materialize/test/src/adapter/src/coord.rs:3903:33
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

This is probably a flake since it hasn't occurred on main, but I retriggered just in case: https://buildkite.com/materialize/nightly/builds/9151#01915d3d-de84-4d90-858f-19da1d7d0a62

ci-regexp: `write_frontier` unexpectedly greater than `max_as_of`

aljoscha commented 3 weeks ago

I have a very strong suspicion that this is a variant of https://github.com/MaterializeInc/materialize/issues/28885#issuecomment-2293254613, but for materialized views.

We don't know the ID of the MV in question, and we don't know the since/upper/max_as_of, so I've cut https://github.com/MaterializeInc/materialize/pull/29111 to help illuminate this.

cc @benesch because you were also in the loop on that other issue
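
For illustration only, here is a minimal sketch of how a check like this could carry the missing context (the collection ID plus since/upper/max_as_of) in its error. This is not the code in `coord.rs` and not the contents of the linked PR: `AsOfViolation` and `check_write_frontier` are hypothetical names, and frontiers are simplified to plain `u64` timestamps, whereas the real ones are presumably antichains.

```rust
use std::fmt;

/// Hypothetical diagnostic context for a failed as-of check.
/// Frontiers are simplified to single u64 timestamps (an assumption).
#[derive(Debug, PartialEq)]
struct AsOfViolation {
    id: String,
    since: u64,
    write_frontier: u64,
    max_as_of: u64,
}

impl fmt::Display for AsOfViolation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Keep the original panic text as a prefix so a log matcher keyed on
        // that text would still match, then append the context.
        write!(
            f,
            "`write_frontier` unexpectedly greater than `max_as_of` \
             (id={}, since={}, write_frontier={}, max_as_of={})",
            self.id, self.since, self.write_frontier, self.max_as_of
        )
    }
}

/// Hypothetical check: the write frontier must not be beyond `max_as_of`.
/// Returns the full context on violation instead of a bare message.
fn check_write_frontier(
    id: &str,
    since: u64,
    write_frontier: u64,
    max_as_of: u64,
) -> Result<(), AsOfViolation> {
    if write_frontier > max_as_of {
        return Err(AsOfViolation {
            id: id.to_string(),
            since,
            write_frontier,
            max_as_of,
        });
    }
    Ok(())
}
```

A caller that still wants to abort could do `check_write_frontier(..).unwrap_or_else(|e| panic!("{e}"))`, which would put the ID and all three values into the panic output.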

def- commented 3 weeks ago

When this reoccurs, we'll be able to look it up in CI Failures with more details.