MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com

sentry: panic: named collection must exist: StashError { inner: Postgres(Error { kind: Db, cause: Some(DbError {... #24037

Closed sentry-io[bot] closed 12 hours ago

sentry-io[bot] commented 8 months ago

Sentry Issue: DATABASE-BACKEND-37J

panic: named collection must exist: StashError { inner: Postgres(Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error), code: SqlState(E40001), message: "restart transaction: TransactionRetryWithProtoRefreshError: TransactionAbortedError(ABORT_REASON_CLIENT_REJECT): \"sql txn\" meta={id=fcce891b key=/Table/132074/1 pri=100.00000000 epo=0 ts=1702930838.068998931,1 min=1702930837.508946523,0 seq=10} lock=true stat=PENDING rts=1702930837.508946523,0 wto=false gul=1702930837.758946523,0", detail: None, hint: Some("See: https://www.cockroachlabs.com/docs/v23.1/transaction-retry-error-reference.html#abort_reason_client_reject"), position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: None, line: None, routine: None }) }) }
  ?, in rust_begin_unwind
  ?, in futures_util::future::maybe_done::MaybeDone<T>::poll
  ?, in futures_util::future::poll_fn::PollFn<T>::poll
  ?, in mz_storage_controller::Controller<T>::new::{closure#0}::{closure#0}::{closure#0}
  ?, in mz_stash::postgres::Stash::with_transaction::<T>::{closure#0}::{closure#0}::{closure#0}
...
(21 additional frame(s) were not displayed)

def- commented 8 months ago

This happened with Mz cloud E2E test user 4

philip-stoev commented 8 months ago

What seems to be happening is that a transient CRDB error is considered definitive and causes a panic. @mjibson
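
To illustrate the distinction between a transient and a definitive error, here is a minimal sketch that classifies an error by its SQLSTATE code. The `ErrorClass` enum and `classify_sqlstate` function are hypothetical names invented for this example and are not the actual `mz_stash` API. The only fact taken from the report is that the failing error carried `SqlState(E40001)`, the serialization-failure code CockroachDB uses for transaction retry errors.

```rust
/// Hypothetical classification of a database error (illustration only,
/// not a type from mz_stash).
#[derive(Debug, PartialEq)]
enum ErrorClass {
    /// The transaction can be retried by the client.
    Retryable,
    /// Retrying is not expected to help.
    Definitive,
}

/// Classify an error by its five-character SQLSTATE code.
/// Assumption: only 40001 (serialization_failure) is treated as retryable
/// here; a real implementation may recognize more codes.
fn classify_sqlstate(code: &str) -> ErrorClass {
    match code {
        "40001" => ErrorClass::Retryable,
        _ => ErrorClass::Definitive,
    }
}
```

Under this sketch, the error in the report would be classified as `Retryable`, so panicking on it directly would treat a transient condition as fatal.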

sentry-io[bot] commented 8 months ago

Sentry Issue: DATABASE-BACKEND-37Q

maddyblue commented 8 months ago

"named collection must exist" is the message reported on top of a retryable SqlState(E40001) error. This could happen if the retries had been going on for more than 30s, at which point the stash stops retrying. It's unclear to me whether "named collection must exist" occurs because the collection doesn't actually exist (which might suggest a test infra issue) or because some CRDB problem persisted for a long time and the error bubbled up.

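The behavior described above, retrying a retryable error until a time budget runs out and then returning the last error, can be sketched as follows. This is a minimal illustration under stated assumptions, not the stash's actual code: `OpError`, `retry_until`, and the fixed 1 ms pause between attempts are all hypothetical, and the 30s budget mentioned in the comment would be passed in as `max_elapsed`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical error type for this sketch (not the real StashError).
#[derive(Debug, PartialEq)]
enum OpError {
    /// Transient failure, e.g. a transaction retry error (SQLSTATE 40001).
    Retryable(String),
    /// Failure that retrying will not fix.
    Permanent(String),
}

/// Run `op` until it succeeds, fails permanently, or `max_elapsed` has
/// passed. Once the budget is exhausted, the last retryable error is
/// returned to the caller unchanged.
fn retry_until<T>(
    max_elapsed: Duration,
    mut op: impl FnMut() -> Result<T, OpError>,
) -> Result<T, OpError> {
    let start = Instant::now();
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(OpError::Retryable(msg)) => {
                if start.elapsed() >= max_elapsed {
                    // Budget exhausted: the transient error bubbles up.
                    return Err(OpError::Retryable(msg));
                }
                // Assumed fixed pause; a real client would likely back off.
                std::thread::sleep(Duration::from_millis(1));
            }
            Err(permanent) => return Err(permanent),
        }
    }
}
```

If the caller then unwraps the result with something like `.expect("named collection must exist")`, Rust's `Result::expect` panics with that message followed by the debug output of the error, which would match the shape of the panic in this report even though the underlying error was retryable.
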
nrainer-materialize commented 6 months ago

This also happened in environment-0de5a2f4-33b9-f06e-cb05-23a2764e6d8b-0. https://materializeinc.sentry.io/issues/4974421899

sentry-io[bot] commented 6 months ago

Sentry Issue: DATABASE-BACKEND-39K

chaas commented 12 hours ago

Closing this as stale since we've moved to a persist-backed stash.