risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0

bug: tumble fails parallel e2e test #12429

Closed zwang28 closed 1 year ago

zwang28 commented 1 year ago

Describe the bug

Failing since https://buildkite.com/risingwavelabs/main-cron/builds/849

thread 'risingwave-streaming-actor' panicked at src/expr/src/vector_op/tumble.rs:91:12:
attempt to subtract with overflow
stack backtrace:
thread 'risingwave-streaming-actor' panicked at src/expr/src/vector_op/tumble.rs:91:12:
attempt to subtract with overflow
2023-09-14T20:34:35.209033552Z DEBUG risingwave_stream::task::stream_manager: drop actors actors=[4964, 4965, 4966, 4967, 4976, 4977, 4978, 4979, 4988, 4989, 4990, 4991, 5000, 5001, 5002, 5003]
2023-09-14T20:34:35.249759635Z DEBUG risingwave_stream::task::stream_manager: drop actors actors=[5748, 5749, 5750, 5751, 5760, 5761, 5762, 5763]
thread 'risingwave-streaming-actor' panicked at src/expr/src/vector_op/tumble.rs:91:12:
attempt to subtract with overflow
thread 'risingwave-streaming-actor' panicked at src/expr/src/vector_op/tumble.rs:91:12:
attempt to subtract with overflow
   0: rust_begin_unwind
             at /rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/panicking.rs:619:5
   1: core::panicking::panic_fmt
             at /rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/panicking.rs:127:5
   3: get_window_start_with_offset
   4: get_window_start
             at ./src/expr/src/vector_op/tumble.rs:51:5
   5: tumble_start_timestamptz
             at ./src/expr/src/vector_op/tumble.rs:45:5
   6: {async_block#0}
             at ./src/expr/src/vector_op/tumble.rs:43:1
*** await tree context of current task ***

stack backtrace:
Actor 6696: `CREATE MATERIALIZED VIEW eowc_mv AS SELECT window_start, count(id1) FROM tumble(temporal_join_mv, v1, INTERVAL '5 s') GROUP BY window_start EMIT ON WINDOW CLOSE` [1.110s]
  Epoch 5078284702121984 [779.996ms]
    ProjectExecutor 1A2800000001 (actor 6696, operator 1) [779.996ms]  <== current

   0: rust_begin_unwind
             at /rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/std/src/panicking.rs:619:5
   1: core::panicking::panic_fmt
             at /rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/62ebe3a2b177d50ec664798d731b8a8d1a9120d1/library/core/src/panicking.rs:127:5
   3: get_window_start_with_offset
   4: get_window_start
             at ./src/expr/src/vector_op/tumble.rs:51:5
   5: tumble_start_timestamptz
             at ./src/expr/src/vector_op/tumble.rs
Thu Sep 14 20:38:31 UTC 2023 [risedev]: Program exited with 134

zwang28 commented 1 year ago

@chenzl25 The newly added temporal_join_watermark.slt fails in the parallel e2e tests.

chenzl25 commented 1 year ago

I think it is related to another issue: https://github.com/risingwavelabs/risingwave/issues/12300. I am not sure whether an append-only table with a watermark is well supported. cc @st1page @yuhao-su

chenzl25 commented 1 year ago

temporal_join_watermark.slt uses an append-only table to generate watermarks, and the temporal join does not modify the watermark value, yet the panic occurs in tumble. So it seems the watermark value itself is unexpected.
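
For illustration only, here is a minimal sketch of how a tumble-style window-start computation can hit "attempt to subtract with overflow" when it is fed an extreme timestamp, and how checked arithmetic turns that into a recoverable `None` instead of a panic. This is not the code in `tumble.rs`: the function name `window_start_checked`, the microsecond `i64` representation, and the idea that the input is near `i64::MIN` are all assumptions made for the example. The thread does not confirm what the unexpected watermark value actually was.

```rust
/// Hypothetical stand-in for a tumble window-start computation.
/// Timestamps and window sizes are assumed to be microseconds in an i64.
///
/// Returns `None` if the window size is not positive or if the
/// arithmetic would overflow, instead of panicking.
///
/// Example: `window_start_checked(12_345_678, 5_000_000)` yields
/// `Some(10_000_000)`, the start of the 5-second window containing the input.
fn window_start_checked(ts_usecs: i64, window_usecs: i64) -> Option<i64> {
    if window_usecs <= 0 {
        return None;
    }
    // Euclidean remainder is always in [0, window_usecs), so negative
    // timestamps are rounded down to the window boundary as well.
    let rem = ts_usecs.checked_rem_euclid(window_usecs)?;
    // With a plain `ts_usecs - rem`, a timestamp close to i64::MIN would
    // overflow here (a panic in debug builds). `checked_sub` reports it.
    ts_usecs.checked_sub(rem)
}
```

With ordinary timestamps the function behaves like a plain round-down to the window boundary. A value such as `i64::MIN` (for instance, an uninitialized or sentinel watermark, which is an assumption here) makes the subtraction unrepresentable, so the caller gets `None` and can decide how to handle it.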

st1page commented 1 year ago

https://github.com/risingwavelabs/risingwave/issues/12432
https://github.com/risingwavelabs/risingwave/issues/12433