MaterializeInc / materialize

ALTER ... OWNER TO fails to return to the client #21317

Closed by philip-stoev 8 months ago

philip-stoev commented 1 year ago

What version of Materialize are you using?

d22daebdcde41825ce3d6a6451f50017c1378abc

What is the issue?

A concurrent workload that runs a lot of non-conflicting DDLs frequently hangs like this:

>> ALTER CLUSTER owner_cluster2 OWNER TO other_owner
...

That is, testdrive attempts to run this particular statement and it never returns. It is always the same statement that hangs, even when the workload is randomized. Logging in to the Mz instance manually works, and the statement can be run to completion via the psql client.

To reproduce:

  1. Make sure you do not have https://github.com/MaterializeInc/materialize/pull/21320 in your branch.

  2. Use the following scratch configuration file:

pstoev@Ubuntu-2004-focal-64-minimal:~/materialize-bisect$ cat misc/scratch/ci.json
{
    "name": "m6a.8xlarge as used by the CI",
    "launch_script": "true",
    "instance_type": "m6a.8xlarge",
    "ami": "ami-09d56f8956ab235b3",
    "size_gb": 500,
    "tags": {}
}
  3. Then run:
while bash -c -e "./mzcompose down -v ; ./mzcompose run default --scenario=RestartEntireMz --execution-mode=parallel" ;  do : ; done

and let it loop; the hang should reproduce in under an hour.

@jkosh44 FYI

philip-stoev commented 1 year ago

Please revert https://github.com/MaterializeInc/materialize/pull/21320 once this is fixed.

maddyblue commented 1 year ago

A thought: make sure the ExecuteContext is being plumbed correctly through that operation. ALTER CLUSTER ... OWNER may have been implemented around the same time as ExecuteContext, so a logical merge skew could have left the context being incorrectly dropped or ignored.
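For illustration, here is a hedged sketch of that failure mode; this is not Materialize's actual ExecuteContext API, just a model of it. The client blocks on a response channel owned by the context, so any sequencing path that stashes or ignores the context without retiring it leaves the client waiting forever:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Stand-in for ExecuteContext: owns the channel that carries the
// response back to the client.
struct ExecuteContextSketch {
    reply: mpsc::Sender<&'static str>,
}

impl ExecuteContextSketch {
    // Every sequencing path is supposed to end here; retiring the
    // context is what sends the response to the client.
    fn retire(self, response: &'static str) {
        let _ = self.reply.send(response);
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let ctx = ExecuteContextSketch { reply: tx };

    thread::spawn(move || {
        // Buggy path: the context is stashed/ignored (simulated here
        // with mem::forget) instead of ctx.retire("ALTER CLUSTER ...").
        std::mem::forget(ctx);
    });

    // The client never gets a reply; recv_timeout is used only so the
    // sketch terminates instead of hanging like the real statement.
    match rx.recv_timeout(Duration::from_secs(1)) {
        Ok(resp) => println!("client got: {resp}"),
        Err(_) => println!("no response; the statement appears hung"),
    }
}

Under that theory, the fix is to make sure every code path for ALTER ... OWNER TO eventually retires its context.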

philip-stoev commented 1 year ago

Actually, other ALTER ... OWNER TO statements may be implicated as well, as disabling the one that acts against the cluster did not stop the issue from happening.

#21320 has been reverted and replaced with https://github.com/MaterializeInc/materialize/pull/21385. Once this issue is fixed, please adjust the _can_run() check in owners.py to allow running the check in parallel mode.

jkosh44 commented 1 year ago

When trying to reproduce this locally by running:

bin/mzcompose --find platform-checks down ; bin/mzcompose --find platform-checks --dev run default --scenario=RestartEntireMz --check=Owners --execution-mode=parallel; bin/mzcompose --find platform-checks logs > out.txt ; bin/mzcompose --find platform-checks down ;  grep materialized out.txt | less

(note: I'm only running the Owners check)

I sometimes see the following error:

platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:47:52.551261Z  INFO mz_storage::sink::kafka: kafka-u22: sending progress for gate ts: 1693252069314
platform-checks-materialized-1  | thread 'producer polling thread' panicked at 'attempt to subtract with overflow', src/storage/src/sink/kafka.rs:213:9
platform-checks-materialized-1  | stack backtrace:
platform-checks-materialized-1  | [roughly 30 seconds of repeated cluster-u3-replica-u3 kafka-u11 "sending progress for gate ts" / "downgrading write frontier" INFO lines elided]
platform-checks-materialized-1  |    0: rust_begin_unwind
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:593:5
platform-checks-materialized-1  |    1: core::panicking::panic_fmt
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:67:14
platform-checks-materialized-1  |    2: core::panicking::panic
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:117:5
platform-checks-materialized-1  |    3: <mz_storage::sink::kafka::KafkaSinkSendRetryManager>::record_success
platform-checks-materialized-1  |              at ./home/joe/materialize/src/storage/src/sink/kafka.rs:213:9
platform-checks-materialized-1  |    4: <mz_storage::sink::kafka::SinkProducerContext as rdkafka::producer::ProducerContext>::delivery
platform-checks-materialized-1  |              at ./home/joe/materialize/src/storage/src/sink/kafka.rs:244:22
platform-checks-materialized-1  |    5: <mz_kafka_util::client::BrokerRewritingClientContext<mz_storage::sink::kafka::SinkProducerContext> as rdkafka::producer::ProducerContext>::delivery
platform-checks-materialized-1  |              at ./home/joe/materialize/src/kafka-util/src/client.rs:219:9
platform-checks-materialized-1  |    6: rdkafka::producer::base_producer::delivery_cb::<mz_kafka_util::client::BrokerRewritingClientContext<mz_storage::sink::kafka::SinkProducerContext>>
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/src/producer/base_producer.rs:83:5
platform-checks-materialized-1  |    7: rd_kafka_poll_cb
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/rdkafka-sys/librdkafka/src/rdkafka.c:3837:33
platform-checks-materialized-1  |    8: rd_kafka_op_handle
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/rdkafka-sys/librdkafka/src/rdkafka_op.c:872:23
platform-checks-materialized-1  |    9: rd_kafka_q_serve
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/rdkafka-sys/librdkafka/src/rdkafka_queue.c:513:23
platform-checks-materialized-1  |   10: rd_kafka_poll
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/rdkafka-sys/librdkafka/src/rdkafka.c:3977:13
platform-checks-materialized-1  |   11: <rdkafka::producer::base_producer::BaseProducer<mz_kafka_util::client::BrokerRewritingClientContext<mz_storage::sink::kafka::SinkProducerContext>>>::poll::<core::time::Duration>
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/src/producer/base_producer.rs:297:18
platform-checks-materialized-1  |   12: <rdkafka::producer::base_producer::ThreadedProducer<mz_kafka_util::client::BrokerRewritingClientContext<mz_storage::sink::kafka::SinkProducerContext>> as rdkafka::config::FromClientConfigAndContext<mz_kafka_util::client::BrokerRewritingClientContext<mz_storage::sink::kafka::SinkProducerContext>>>::from_config_and_context::{closure#0}
platform-checks-materialized-1  |              at ./cargo/git/checkouts/rust-rdkafka-545566655b063b24/8ea07c4/src/producer/base_producer.rs:530:33
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:23.177511Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252101216
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:23.192278Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252101216
platform-checks-materialized-1  | note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:24.866196Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252102108
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:24.884261Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252102108
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:25.143832Z  WARN mz_timely_util::panic: halting process: timely communication error: reading data: socket closed
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:25.144207Z  WARN mz_timely_util::panic: halting process: timely communication error: reading data: Connection reset by peer (os error 104)
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:25.144643Z  WARN mz_timely_util::panic: halting process: timely communication error: reading data: socket closed
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:25.147254Z  WARN mz_timely_util::panic: halting process: timely communication error: reading data: socket closed
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.148577Z  WARN mz_compute_client::controller::replica: replica task failed: status: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", details: [], metadata: MetadataMap { headers: {} } replica=User(6)
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.150870Z  WARN mz_storage_client::controller::rehydration: storage cluster produced error, reconnecting: status: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", details: [], metadata: MetadataMap { headers: {} }
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.158672Z  WARN persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: pubsub connection err: status: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", details: [], metadata: MetadataMap { headers: {} }
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.158916Z  INFO persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: Persist PubSub connection ended: "53d3b6640c12"
platform-checks-materialized-1  | thread 'tokio:work-5' panicked at 'cluster-u6-replica-u6-2 crashed; aborting because propagate_crashes is enabled', src/orchestrator-process/src/lib.rs:604:29
platform-checks-materialized-1  | stack backtrace:
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.166353Z  WARN persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: pubsub connection err: status: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", details: [], metadata: MetadataMap { headers: {} }
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.166966Z  INFO persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: Persist PubSub connection ended: "53d3b6640c12"
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.178857Z ERROR mz_orchestrator_process: cluster-u6-replica-u6-1 exited: ExitStatus(unix_wait_status(512)); relaunching in 5s
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.180087Z  WARN persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: pubsub connection err: status: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", details: [], metadata: MetadataMap { headers: {} }
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.180399Z  INFO persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: Persist PubSub connection ended: "53d3b6640c12"
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.181596Z ERROR mz_orchestrator_process: cluster-u6-replica-u6-0 exited: ExitStatus(unix_wait_status(512)); relaunching in 5s
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.192164Z  WARN persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: pubsub connection err: status: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", details: [], metadata: MetadataMap { headers: {} }
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.192319Z  INFO persist::rpc::server:connection{caller_id="53d3b6640c12"}: mz_persist_client::rpc: Persist PubSub connection ended: "53d3b6640c12"
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:25.196260Z ERROR mz_orchestrator_process: cluster-u6-replica-u6-3 exited: ExitStatus(unix_wait_status(512)); relaunching in 5s
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:25.866679Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252102738
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:25.882919Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252102738
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:26.823455Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252103844
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:26.837590Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252103844
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:27.525223Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252104719
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:27.541235Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252104719
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:28.037072Z  INFO mz_storage_client::controller::rehydration: error connecting to ClusterReplicaLocation { ctl_addrs: ["/tmp/eb994b24c7d068ac4adbd6491e0e3b4d3021e6ac", "/tmp/1eec1bd4aeb67f8f512095b2893953f1cded4fb0", "/tmp/34836116d75f07063aa6f80cf3401e428cbe7ec3", "/tmp/abc89878e29ef8231fa1dfbd669c753d7673b2fd"], dataflow_addrs: ["/tmp/8ea6eb9ac47364dc3132e62c7350ec933e61b034", "/tmp/552eb9cee864c26979e2850e102d0be443bbf30a", "/tmp/cd7121a52fa1aa245414a01fe93d478c8f472192", "/tmp/3e77fd132da28a59318b3bcc1797b392a490dd99"], workers: 1 } for storage, retrying in 1s: transport error
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:28.289480Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252105997
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:28.305046Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252105997
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:29.037465Z  INFO mz_storage_client::controller::rehydration: error connecting to ClusterReplicaLocation { ctl_addrs: ["/tmp/eb994b24c7d068ac4adbd6491e0e3b4d3021e6ac", "/tmp/1eec1bd4aeb67f8f512095b2893953f1cded4fb0", "/tmp/34836116d75f07063aa6f80cf3401e428cbe7ec3", "/tmp/abc89878e29ef8231fa1dfbd669c753d7673b2fd"], dataflow_addrs: ["/tmp/8ea6eb9ac47364dc3132e62c7350ec933e61b034", "/tmp/552eb9cee864c26979e2850e102d0be443bbf30a", "/tmp/cd7121a52fa1aa245414a01fe93d478c8f472192", "/tmp/3e77fd132da28a59318b3bcc1797b392a490dd99"], workers: 1 } for storage, retrying in 1s: transport error
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:29.498453Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252107084
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:29.512581Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252107084
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.039172Z  INFO mz_storage_client::controller::rehydration: error connecting to ClusterReplicaLocation { ctl_addrs: ["/tmp/eb994b24c7d068ac4adbd6491e0e3b4d3021e6ac", "/tmp/1eec1bd4aeb67f8f512095b2893953f1cded4fb0", "/tmp/34836116d75f07063aa6f80cf3401e428cbe7ec3", "/tmp/abc89878e29ef8231fa1dfbd669c753d7673b2fd"], dataflow_addrs: ["/tmp/8ea6eb9ac47364dc3132e62c7350ec933e61b034", "/tmp/552eb9cee864c26979e2850e102d0be443bbf30a", "/tmp/cd7121a52fa1aa245414a01fe93d478c8f472192", "/tmp/3e77fd132da28a59318b3bcc1797b392a490dd99"], workers: 1 } for storage, retrying in 1s: transport error
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.180531Z  INFO mz_orchestrator_process: launching cluster-u6-replica-u6-1 via /usr/local/bin/clusterd --storage-controller-listen-addr=/tmp/1eec1bd4aeb67f8f512095b2893953f1cded4fb0 --compute-controller-listen-addr=/tmp/9728f6041a7598083bf8b0e8494f0a085cb463fa --internal-http-listen-addr=/tmp/fd6fcaf6b4d7491458c27701beddc4aa810db92d --opentelemetry-resource=cluster_id=u6 --opentelemetry-resource=replica_id=u6 --persist-pubsub-url=http://localhost:6879 --secrets-reader=local-file --secrets-reader-local-file-dir=/mzdata/secrets --log-format=text --log-prefix=cluster-u6-replica-u6 --pid-file-location=/tmp/environmentd-mzcompose-us-east-1-00000000-0000-0000-0000-000000000000-0/cluster-u6-replica-u6/1.pid...
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.188093Z  INFO mz_orchestrator_process: launching cluster-u6-replica-u6-0 via /usr/local/bin/clusterd --storage-controller-listen-addr=/tmp/eb994b24c7d068ac4adbd6491e0e3b4d3021e6ac --compute-controller-listen-addr=/tmp/40d1bcd26950bb64d8b5ded4cdde469b83ba6ef3 --internal-http-listen-addr=/tmp/100848d764245ecf98be8faecf67a0a25d502a59 --opentelemetry-resource=cluster_id=u6 --opentelemetry-resource=replica_id=u6 --persist-pubsub-url=http://localhost:6879 --secrets-reader=local-file --secrets-reader-local-file-dir=/mzdata/secrets --log-format=text --log-prefix=cluster-u6-replica-u6 --pid-file-location=/tmp/environmentd-mzcompose-us-east-1-00000000-0000-0000-0000-000000000000-0/cluster-u6-replica-u6/0.pid...
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.196889Z  INFO mz_orchestrator_process: launching cluster-u6-replica-u6-3 via /usr/local/bin/clusterd --storage-controller-listen-addr=/tmp/abc89878e29ef8231fa1dfbd669c753d7673b2fd --compute-controller-listen-addr=/tmp/7e38983a091849d835bb95304a4589f8aa93df74 --internal-http-listen-addr=/tmp/17bd158d3782d17749c7b82a8858a79413c97df6 --opentelemetry-resource=cluster_id=u6 --opentelemetry-resource=replica_id=u6 --persist-pubsub-url=http://localhost:6879 --secrets-reader=local-file --secrets-reader-local-file-dir=/mzdata/secrets --log-format=text --log-prefix=cluster-u6-replica-u6 --pid-file-location=/tmp/environmentd-mzcompose-us-east-1-00000000-0000-0000-0000-000000000000-0/cluster-u6-replica-u6/3.pid...
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.501382Z  INFO clusterd: booting os.os_type=Ubuntu os.version=22.04 os.bitness=64-bit build.version="0.68.0-dev" build.sha="7eb9dd0327b01aa2a5872199457618a94bf319c8" build.time="2023-08-28T19:08:38Z" cpus.logical=16 cpus.physical=8 cpu0.brand="Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz" cpu0.frequency=1367 memory.total=67122192384 memory.used=18727825408 memory.limit=<unknown> swap.total=2146955264 swap.used=58195968 swap.limit=<unknown> tracing.max_level=info
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.508102Z  INFO clusterd: serving internal HTTP server on /tmp/17bd158d3782d17749c7b82a8858a79413c97df6
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.520058Z  INFO clusterd: booting os.os_type=Ubuntu os.version=22.04 os.bitness=64-bit build.version="0.68.0-dev" build.sha="7eb9dd0327b01aa2a5872199457618a94bf319c8" build.time="2023-08-28T19:08:38Z" cpus.logical=16 cpus.physical=8 cpu0.brand="Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz" cpu0.frequency=1299 memory.total=67122192384 memory.used=18728341504 memory.limit=<unknown> swap.total=2146955264 swap.used=58195968 swap.limit=<unknown> tracing.max_level=info
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.523271Z  INFO clusterd: serving internal HTTP server on /tmp/fd6fcaf6b4d7491458c27701beddc4aa810db92d
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.551087Z  INFO mz_persist_client::rpc: Connecting to Persist PubSub: http://localhost:6879
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.555519Z  INFO mz_persist_client::rpc: Connected to Persist PubSub: http://localhost:6879
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.560601Z  INFO persist::rpc::server: mz_persist_client::rpc: Received Persist PubSub connection from: "53d3b6640c12"
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.569605Z  INFO clusterd: listening for storage controller connections on /tmp/abc89878e29ef8231fa1dfbd669c753d7673b2fd
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.570337Z  INFO mz_service::grpc: Starting to listen on /tmp/abc89878e29ef8231fa1dfbd669c753d7673b2fd
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.570543Z  INFO clusterd: listening for compute controller connections on /tmp/7e38983a091849d835bb95304a4589f8aa93df74
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.571393Z  INFO mz_service::grpc: Starting to listen on /tmp/7e38983a091849d835bb95304a4589f8aa93df74
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.582723Z  INFO clusterd: booting os.os_type=Ubuntu os.version=22.04 os.bitness=64-bit build.version="0.68.0-dev" build.sha="7eb9dd0327b01aa2a5872199457618a94bf319c8" build.time="2023-08-28T19:08:38Z" cpus.logical=16 cpus.physical=8 cpu0.brand="Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz" cpu0.frequency=1299 memory.total=67122192384 memory.used=18730926080 memory.limit=<unknown> swap.total=2146955264 swap.used=58195968 swap.limit=<unknown> tracing.max_level=info
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.588210Z  INFO clusterd: serving internal HTTP server on /tmp/100848d764245ecf98be8faecf67a0a25d502a59
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.680222Z  INFO mz_persist_client::rpc: Connecting to Persist PubSub: http://localhost:6879
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.687127Z  INFO mz_persist_client::rpc: Connected to Persist PubSub: http://localhost:6879
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.700401Z  INFO persist::rpc::server: mz_persist_client::rpc: Received Persist PubSub connection from: "53d3b6640c12"
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.702081Z  INFO mz_persist_client::rpc: Connecting to Persist PubSub: http://localhost:6879
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.706440Z  INFO mz_persist_client::rpc: Connected to Persist PubSub: http://localhost:6879
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:30.711279Z  INFO persist::rpc::server: mz_persist_client::rpc: Received Persist PubSub connection from: "53d3b6640c12"
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:30.723977Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252108213
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.724102Z  INFO clusterd: listening for storage controller connections on /tmp/1eec1bd4aeb67f8f512095b2893953f1cded4fb0
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.725115Z  INFO mz_service::grpc: Starting to listen on /tmp/1eec1bd4aeb67f8f512095b2893953f1cded4fb0
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.725439Z  INFO clusterd: listening for compute controller connections on /tmp/9728f6041a7598083bf8b0e8494f0a085cb463fa
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.726107Z  INFO mz_service::grpc: Starting to listen on /tmp/9728f6041a7598083bf8b0e8494f0a085cb463fa
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.729490Z  INFO clusterd: listening for storage controller connections on /tmp/eb994b24c7d068ac4adbd6491e0e3b4d3021e6ac
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.729844Z  INFO mz_service::grpc: Starting to listen on /tmp/eb994b24c7d068ac4adbd6491e0e3b4d3021e6ac
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.731688Z  INFO clusterd: listening for compute controller connections on /tmp/40d1bcd26950bb64d8b5ded4cdde469b83ba6ef3
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:30.731879Z  INFO mz_service::grpc: Starting to listen on /tmp/40d1bcd26950bb64d8b5ded4cdde469b83ba6ef3
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:30.743203Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252108213
platform-checks-materialized-1  | [repeated storage rehydration "error connecting to ClusterReplicaLocation { ... }, retrying in 1s" lines and kafka-u11 progress lines elided]
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:35.546770Z  INFO mz_compute_client::controller::replica: starting replica task replica=User(6)
platform-checks-materialized-1  | environmentd: 2023-08-28T19:48:38.436763Z  INFO mz_compute_client::controller::replica: error connecting to replica, retrying in 1s: transport error replica=User(6)
platform-checks-materialized-1  | [further repeated storage rehydration and compute replica "error connecting ..., retrying in 1s" lines and kafka-u11 progress lines elided, through 19:48:43]
platform-checks-materialized-1  |    0: rust_begin_unwind
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:593:5
platform-checks-materialized-1  |    1: core::panicking::panic_fmt
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:67:14
platform-checks-materialized-1  |    2: <mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}
platform-checks-materialized-1  |              at ./home/joe/materialize/src/orchestrator-process/src/lib.rs:604:29
platform-checks-materialized-1  |    3: <core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>> as core::future::future::Future>::poll
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/future/future.rs:125:9
platform-checks-materialized-1  |    4: <tokio::runtime::task::core::Core<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/core.rs:223:17
platform-checks-materialized-1  |    5: <tokio::loom::std::unsafe_cell::UnsafeCell<tokio::runtime::task::core::Stage<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>>>>::with_mut::<core::task::poll::Poll<()>, <tokio::runtime::task::core::Core<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll::{closure#0}>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/loom/std/unsafe_cell.rs:14:9
platform-checks-materialized-1  |    6: <tokio::runtime::task::core::Core<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/core.rs:212:13
platform-checks-materialized-1  |    7: tokio::runtime::task::harness::poll_future::<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:476:19
platform-checks-materialized-1  |    8: <core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}> as core::ops::function::FnOnce<()>>::call_once
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panic/unwind_safe.rs:271:9
platform-checks-materialized-1  |   9: std::panicking::try::do_call::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}>, core::task::poll::Poll<()>>
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:500:40
platform-checks-materialized-1  |   10: __rust_try
platform-checks-materialized-1  |   11: std::panicking::try::<core::task::poll::Poll<()>, core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}>>
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:464:19
platform-checks-materialized-1  |   12: std::panic::catch_unwind::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>::{closure#0}>, core::task::poll::Poll<()>>
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panic.rs:142:14
platform-checks-materialized-1  |   13: tokio::runtime::task::harness::poll_future::<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:464:18
platform-checks-materialized-1  |   14: <tokio::runtime::task::harness::Harness<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll_inner
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:198:27
platform-checks-materialized-1  |   15: <tokio::runtime::task::harness::Harness<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:152:15
platform-checks-materialized-1  |   16: tokio::runtime::task::raw::poll::<core::pin::Pin<alloc::boxed::Box<<mz_orchestrator_process::NamespacedProcessOrchestrator>::supervise_service_process::{closure#1}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/raw.rs:255:5
platform-checks-materialized-1  |   17: <tokio::runtime::task::raw::RawTask>::poll
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/raw.rs:200:18
platform-checks-materialized-1  |   18: <tokio::runtime::task::LocalNotified<alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::run
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/mod.rs:394:9
platform-checks-materialized-1  |   19: <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/scheduler/multi_thread/worker.rs:464:13
platform-checks-materialized-1  |   20: tokio::runtime::coop::with_budget::<core::result::Result<alloc::boxed::Box<tokio::runtime::scheduler::multi_thread::worker::Core>, ()>, <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task::{closure#0}>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/coop.rs:107:5
platform-checks-materialized-1  |   21: tokio::runtime::coop::budget::<core::result::Result<alloc::boxed::Box<tokio::runtime::scheduler::multi_thread::worker::Core>, ()>, <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task::{closure#0}>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/coop.rs:73:5
platform-checks-materialized-1  |   22: <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/scheduler/multi_thread/worker.rs:463:9
platform-checks-materialized-1  |   23: <tokio::runtime::scheduler::multi_thread::worker::Context>::run
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/scheduler/multi_thread/worker.rs:426:24
platform-checks-materialized-1  |   24: tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/scheduler/multi_thread/worker.rs:406:17
platform-checks-materialized-1  |   25: <tokio::macros::scoped_tls::ScopedKey<tokio::runtime::scheduler::multi_thread::worker::Context>>::set::<tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}, ()>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/macros/scoped_tls.rs:61:9
platform-checks-materialized-1  |   26: tokio::runtime::scheduler::multi_thread::worker::run
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/scheduler/multi_thread/worker.rs:403:5
platform-checks-materialized-1  |   27: <tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/scheduler/multi_thread/worker.rs:365:45
platform-checks-materialized-1  |   28: <tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}> as core::future::future::Future>::poll
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/blocking/task.rs:42:21
platform-checks-materialized-1  |   29: <tokio::runtime::task::core::Core<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/core.rs:223:17
platform-checks-materialized-1  |   30: <tokio::loom::std::unsafe_cell::UnsafeCell<tokio::runtime::task::core::Stage<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>>>::with_mut::<core::task::poll::Poll<()>, <tokio::runtime::task::core::Core<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll::{closure#0}>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/loom/std/unsafe_cell.rs:14:9
platform-checks-materialized-1  |   31: <tokio::runtime::task::core::Core<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:44.275711Z  INFO mz_storage::sink::kafka: kafka-u11: sending progress for gate ts: 1693252122471
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/core.rs:212:13
platform-checks-materialized-1  |   32: tokio::runtime::task::harness::poll_future::<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:476:19
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:44.289572Z  INFO mz_storage::sink::kafka: kafka-u11: downgrading write frontier to: 1693252122471
platform-checks-materialized-1  |   33: <core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}> as core::ops::function::FnOnce<()>>::call_once
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panic/unwind_safe.rs:271:9
platform-checks-materialized-1  |   34: std::panicking::try::do_call::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}>, core::task::poll::Poll<()>>
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:500:40
platform-checks-materialized-1  |   35: __rust_try
platform-checks-materialized-1  |   36: std::panicking::try::<core::task::poll::Poll<()>, core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}>>
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:464:19
platform-checks-materialized-1  |   37: std::panic::catch_unwind::<core::panic::unwind_safe::AssertUnwindSafe<tokio::runtime::task::harness::poll_future<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>::{closure#0}>, core::task::poll::Poll<()>>
platform-checks-materialized-1  |              at ./rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panic.rs:142:14
platform-checks-materialized-1  |   38: tokio::runtime::task::harness::poll_future::<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:464:18
platform-checks-materialized-1  |   39: <tokio::runtime::task::harness::Harness<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll_inner
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:198:27
platform-checks-materialized-1  |   40: <tokio::runtime::task::harness::Harness<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/harness.rs:152:15
platform-checks-materialized-1  |   41: tokio::runtime::task::raw::poll::<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>, tokio::runtime::blocking::schedule::BlockingSchedule>
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/raw.rs:255:5
platform-checks-materialized-1  |   42: <tokio::runtime::task::raw::RawTask>::poll
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/raw.rs:200:18
platform-checks-materialized-1  |   43: <tokio::runtime::task::UnownedTask<tokio::runtime::blocking::schedule::BlockingSchedule>>::run
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/task/mod.rs:431:9
platform-checks-materialized-1  |   44: <tokio::runtime::blocking::pool::Task>::run
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/blocking/pool.rs:159:9
platform-checks-materialized-1  |   45: <tokio::runtime::blocking::pool::Inner>::run
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/blocking/pool.rs:513:17
platform-checks-materialized-1  |   46: <tokio::runtime::blocking::pool::Spawner>::spawn_thread::{closure#0}
platform-checks-materialized-1  |              at ./cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.27.0/src/runtime/blocking/pool.rs:471:13
platform-checks-materialized-1  | note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
platform-checks-materialized-1  | cluster-u7-replica-u7: 2023-08-28T19:48:44.994327Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u5-replica-u5: 2023-08-28T19:48:44.994839Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: bytes remaining on stream", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: Other, error: "bytes remaining on stream" }) })) }
platform-checks-materialized-1  | cluster-u1-replica-u1: 2023-08-28T19:48:44.995891Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: connection reset", source: Some(hyper::Error(Body, Error { kind: Io(Kind(ConnectionReset)) })) }
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:44.996114Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u4-replica-u4: 2023-08-28T19:48:44.996168Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u1-replica-u1: 2023-08-28T19:48:44.996877Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: connection reset", source: Some(hyper::Error(Body, Error { kind: Io(Kind(ConnectionReset)) })) }
platform-checks-materialized-1  | cluster-u5-replica-u5: 2023-08-28T19:48:44.997918Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:44.998547Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u6-replica-u6: 2023-08-28T19:48:44.998647Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u2-replica-u2: 2023-08-28T19:48:44.999813Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u2-replica-u2: 2023-08-28T19:48:44.999816Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: bytes remaining on stream", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: Other, error: "bytes remaining on stream" }) })) }
platform-checks-materialized-1  | cluster-s1-replica-s1: 2023-08-28T19:48:45.000775Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u1-replica-u1: 2023-08-28T19:48:45.001313Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: connection reset", source: Some(hyper::Error(Body, Error { kind: Io(Kind(ConnectionReset)) })) }
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:45.002382Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u5-replica-u5: 2023-08-28T19:48:45.002401Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u5-replica-u5: 2023-08-28T19:48:45.002570Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: connection reset", source: Some(hyper::Error(Body, Error { kind: Io(Kind(ConnectionReset)) })) }
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:45.002620Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u1-replica-u1: 2023-08-28T19:48:45.003692Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u2-replica-u2: 2023-08-28T19:48:45.004443Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: connection reset", source: Some(hyper::Error(Body, Error { kind: Io(Kind(ConnectionReset)) })) }
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:45.005583Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u3-replica-u3: 2023-08-28T19:48:45.005705Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-u2-replica-u2: 2023-08-28T19:48:45.005895Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }
platform-checks-materialized-1  | cluster-s2-replica-s2: 2023-08-28T19:48:45.014270Z  WARN mz_persist_client::rpc: pubsub client error: Status { code: Unknown, message: "h2 protocol error: error reading a body from connection: stream closed because of a broken pipe", source: Some(hyper::Error(Body, Error { kind: Io(Custom { kind: BrokenPipe, error: "stream closed because of a broken pipe" }) })) }

I have no idea if it's related. Also, I think all the "sending progress for gate ts: \d+" and "downgrading write frontier to: \d+" log lines are just noise.
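
For context on the panic itself: "attempt to subtract with overflow" is what Rust emits when overflow checks are enabled (the default in debug builds) and an unsigned subtraction would go negative. A minimal sketch of the pattern and guarded alternatives; the values and variable names are hypothetical, not the actual code at kafka.rs:213:

fn main() {
    let gate_ts: u64 = 1_693_252_069_314;
    let write_frontier: u64 = 1_693_252_000_000;

    // With overflow checks enabled, this panics with
    // "attempt to subtract with overflow" whenever write_frontier < gate_ts:
    // let lag = write_frontier - gate_ts;

    // Guarded alternatives:
    let lag = write_frontier.saturating_sub(gate_ts); // clamps at 0
    let checked = write_frontier.checked_sub(gate_ts); // None on underflow
    println!("lag = {lag}, checked = {checked:?}");
}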

jkosh44 commented 1 year ago

The storage team is taking a look at the Kafka issue, so I'm going to hold off on further investigation until that's fixed. It was happening for me over 50% of the time, so even if it's not the cause of this issue, it at least makes the issue very difficult to reproduce.

philip-stoev commented 11 months ago

GRANT is also affected:


2023-09-12 11:37:43 UTC | >> GRANT ALL PRIVILEGES ON CLUSTER privilege_cluster1 TO role_2
2023-09-12 12:34:10 UTC | # Received cancellation signal, interrupting
jkosh44 commented 11 months ago

GRANT is also affected:


2023-09-12 11:37:43 UTC | >> GRANT ALL PRIVILEGES ON CLUSTER privilege_cluster1 TO role_2
2023-09-12 12:34:10 UTC | # Received cancellation signal, interrupting

Did you see the Kafka error in the logs afterwards?

philip-stoev commented 11 months ago

@jkosh44, no, there were no panics in any of the services.log files from the CI. I am disabling all the parallel Nightly CI steps, as they now hang like this:


2023-09-17 14:40:11 EEST | > CREATE SECRET [query truncated on purpose so as to not reveal the secret in the log]
2023-09-17 14:40:11 EEST | rows match; continuing at ts 1694950811.6181319
2023-09-17 15:36:00 EEST | # Received cancellation signal, interrupting

So RBAC may or may not actually be involved?

jkosh44 commented 11 months ago

A thought: make sure the ExecuteContext is correctly being plumbed in that operation. Maybe alter cluster owner was implemented around the same time as ExecuteContext and so there was a logical merge skew and it gets incorrectly dropped or ignored.

@mjibson Sorry, just seeing this now. It looks like it's being plumbed:

https://github.com/MaterializeInc/materialize/blob/81089fe8c8a05a7d4c85f59b19b83a3a2457a48e/src/adapter/src/coord/sequencer.rs#L528-L531

If it were plumbed incorrectly, though, wouldn't it hang every time? We're only seeing occasional hangs.
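
For what it's worth, the hang we're chasing is exactly what you'd get if the response channel carried by the context were stashed somewhere and never retired. A minimal sketch of that failure mode, using hypothetical types rather than Materialize's actual ExecuteContext:

use std::time::Duration;
use tokio::sync::oneshot;

#[tokio::main]
async fn main() {
    // The coordinator side holds a sender it must eventually use to answer
    // the client. If it sends, the client proceeds; if it drops the sender,
    // the client at least sees an error; if the sender is parked and neither
    // used nor dropped, the client waits forever.
    let (tx, rx) = oneshot::channel::<&'static str>();

    // The buggy path: park the sender so it is never used or dropped.
    tokio::spawn(async move {
        let _keep_alive = tx;
        // The sleep stands in for "lost in some long-lived structure".
        tokio::time::sleep(Duration::from_secs(3600)).await;
    });

    // The "client" side: without the timeout, this await would never resolve.
    match tokio::time::timeout(Duration::from_secs(1), rx).await {
        Ok(Ok(resp)) => println!("got response: {resp}"),
        Ok(Err(_)) => println!("sender dropped: client would at least see an error"),
        Err(_) => println!("timed out: without the timeout, this is the observed hang"),
    }
}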

maddyblue commented 11 months ago

If it were plumbed incorrectly, though, wouldn't it hang every time?

That seems likely, so it must be something else.

jkosh44 commented 11 months ago

A theory I just had is that this is somehow related to https://github.com/MaterializeInc/materialize/issues/21891, in that a CRDB query issued during DDL hangs for some unknown reason. If I'm ever able to repro this, I'll try adjusting the idle transaction timeout and see if that helps.
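
If it does repro, a sketch of that experiment is below. Everything here is an assumption on my part, not taken from our code: the connection string, the timeout value, and that CRDB honors the PostgreSQL-compatible idle_in_transaction_session_timeout session setting.

use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    // Placeholder connection string for a local CRDB; adjust as needed.
    let (client, conn) =
        tokio_postgres::connect("host=localhost port=26257 user=root", NoTls).await?;
    // The Connection object drives the socket and must be polled on its own task.
    tokio::spawn(async move {
        if let Err(e) = conn.await {
            eprintln!("connection error: {e}");
        }
    });
    // With this set, a session stuck idle in a transaction should error out
    // instead of blocking the DDL forever, which would confirm the theory.
    client
        .batch_execute("SET idle_in_transaction_session_timeout = '60s'")
        .await?;
    Ok(())
}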

jkosh44 commented 9 months ago

@philip-stoev was this resolved? It looks like https://github.com/MaterializeInc/materialize/pull/21320 has been reverted.

jkosh44 commented 8 months ago

@philip-stoev was this resolved? It looks like #21320 has been reverted.

@def- any thoughts?

philip-stoev commented 8 months ago

@jkosh44 I will try to re-enable the test and see how it goes.

nrainer-materialize commented 8 months ago

@jkosh44 I will try to re-enable the test and see how it goes.

I will take that over, because the "Detect references to already closed issues" check in Nightly has been failing for some time because of this. => https://github.com/MaterializeInc/materialize/pull/23881