risingwavelabs / risingwave


backfill test: Storage error: Hummock error: Expired Epoch: watermark 6934584757125120, epoch 6934584644927488 #17965

Open kwannoel opened 1 month ago

kwannoel commented 1 month ago

buildkite: https://buildkite.com/risingwavelabs/main-cron/builds/3046#01912d92-7eef-47eb-ae2a-9679177ef2a9

Error:

e2e_test/backfill/runtime/validate_rows_no_shuffle.slt       .. [FAILED]
failed to run `e2e_test/backfill/runtime/validate_rows_no_shuffle.slt`
Caused by:
    query failed: db error: ERROR: Failed to run the query
    Caused by:
      Storage error: Hummock error: Expired Epoch: watermark 6934584757125120, epoch 6934584644927488
    [SQL] select (select count(*) from no_shuffle_backfill) = (select count(*) from t);
    at e2e_test/backfill/runtime/validate_rows_no_shuffle.slt:1

This occurs when running the following batch query after no-shuffle backfill completes:

query I
select (select count(*) from no_shuffle_backfill) = (select count(*) from t);
----
t
hzxa21 commented 1 month ago

@wenym1 Can you help take a look? I checked the relevant code paths but didn't find a clue as to how this can happen.

meta: MV creation finishes at around 16:36:06

2024-08-07T16:35:47.098154218Z  INFO risingwave_meta::rpc::ddl_controller: finish stream job succeeded id=1
2024-08-07T16:36:00.777084324Z  INFO risingwave_meta::rpc::ddl_controller: finish stream job succeeded id=2
2024-08-07T16:36:06.114203651Z  INFO risingwave_meta::rpc::ddl_controller: finish stream job succeeded id=4

frontend: query fails at around 16:36:08

2024-08-07T16:36:08.819965654Z ERROR risingwave_frontend::scheduler::distributed::stage: Task TaskId { query_id: "ffaf7aeb-8a3e-4035-89c5-747bb7d5f707", stage_id: 1, task_id: 8 } failed, reason: "Storage error: Hummock error: Expired Epoch: watermark 6934584757125120, epoch 6934584644927488"
2024-08-07T16:36:08.820039093Z ERROR risingwave_frontend::scheduler::distributed::stage: Task TaskId { query_id: "ffaf7aeb-8a3e-4035-89c5-747bb7d5f707", stage_id: 2, task_id: 4 } failed, reason: "Storage error: Hummock error: Expired Epoch: watermark 6934584757125120, epoch 6934584644927488"
2024-08-07T16:36:08.820048701Z ERROR risingwave_frontend::scheduler::distributed::query: query stage failed error=Storage error: Hummock error: Expired Epoch: watermark 6934584757125120, epoch 6934584644927488 query_id=QueryId { id: "ffaf7aeb-8a3e-4035-89c5-747bb7d5f707" } stage_id=1
2024-08-07T16:36:08.820286629Z ERROR pgwire::pg_protocol: error when process message error=Failed to run the query: Storage error: Hummock error: Expired Epoch: watermark 6934584757125120, epoch 6934584644927488

compactor: safe epoch is bumped to 6934584757125120 by compact task 23, which finished at 16:36:08

2024-08-07T16:36:07.022009424Z  INFO risingwave_storage::hummock::compactor::compactor_runner: Ready to handle task: 23 compact_task_statistics CompactTaskStatistics { total_file_count: 8, total_key_count: 674, total_file_size: 21770, total_uncompressed_file_size: 21698 } compression_algorithm 0  parallelism 1 task_memory_capacity_with_parallelism 17342096, enable fast runner: false, Compaction task id: 23, group-id: 2, type: Dynamic, target level: 0, target sub level: 6934584478662656 watermark: 6934584757125120, target_file_size: 33554432, splits: 1
Input: 
Level 0 ["[id: 157, obj_id: 138 0KB stale_ratio 0]", "[id: 158, obj_id: 140 0KB stale_ratio 0]"] 
Level 0 ["[id: 155, obj_id: 90 5KB stale_ratio 80]"] 
Level 0 ["[id: 147, obj_id: 75 2KB stale_ratio 0]", "[id: 148, obj_id: 77 0KB stale_ratio 0]", "[id: 150, obj_id: 96 2KB stale_ratio 0]", "[id: 151, obj_id: 98 0KB stale_ratio 0]", "[id: 153, obj_id: 88 7KB stale_ratio 66]"] 

2024-08-07T16:36:08.021001239Z  INFO risingwave_storage::hummock::compactor: running_parallelism_count=2 pull_task_ack=false pending_pull_task_count=4
2024-08-07T16:36:08.047053706Z  INFO risingwave_storage::hummock::compactor: Finish Task Some(23) split_index Some(0) sst count 1
2024-08-07T16:36:08.047124571Z  INFO risingwave_storage::hummock::compactor::compactor_runner: Finished compaction task in 1025.137239ms: Compaction task id: 23, group-id: 2, type: Dynamic, target level: 0, target sub level: 6934584478662656 watermark: 6934584757125120, target_file_size: 33554432, splits: 1, status: Success
Output: 
SstableInfo: object id=128, SST id=128, KeyRange=["0000000500000080000018a2f7c09b0000","0000000500ff0080ff0018a2f7c09b0000"], table_ids: [5], size=8KB, stale_ratio=0%, bloom_filter_kind Sstable
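
So the read fails because the batch query's epoch (6934584644927488) has fallen below the newly bumped watermark (6934584757125120). For context, a minimal sketch of that kind of safe-epoch check is shown below; the names (Version, safe_epoch, validate_read_epoch) are illustrative assumptions, not the exact Hummock API:

```rust
// Hypothetical sketch of how an "Expired Epoch" error can arise.
// Names here are assumptions, not the real RisingWave code.
struct Version {
    /// Lowest epoch still readable; bumped when compaction reclaims old data.
    safe_epoch: u64,
}

fn validate_read_epoch(version: &Version, read_epoch: u64) -> Result<(), String> {
    if read_epoch < version.safe_epoch {
        // Data below the watermark may already have been compacted away,
        // so the read must be rejected.
        return Err(format!(
            "Expired Epoch: watermark {}, epoch {}",
            version.safe_epoch, read_epoch
        ));
    }
    Ok(())
}

fn main() {
    let version = Version { safe_epoch: 6934584757125120 };
    // The batch query read at an epoch below the new watermark, so it fails.
    assert!(validate_read_epoch(&version, 6934584644927488).is_err());
}
```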
hzxa21 commented 1 month ago

It seems that after PR #5832, storage iterator creation was moved from new_boxed_executor to do_execute. The frontend unpins the epoch once the table scan stage becomes Scheduled. Does the table scan stage of a distributed query become Scheduled after new_boxed_executor or after do_execute? If it is the former, the epoch could be unpinned before the storage version is pinned, which happens during storage iterator creation. A rough sketch of the suspected interleaving follows. cc @BugenZhao @chenzl25
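
If the stage really is marked Scheduled before do_execute runs, the suspected interleaving would look roughly like this (all names and the step ordering are illustrative, not the actual scheduler/storage code):

```rust
// Illustrative timeline of the suspected race; purely a sketch.
fn main() {
    // 1. Frontend pins the query epoch and dispatches the batch plan.
    let query_epoch: u64 = 6934584644927488;

    // 2. Compute node builds executors (new_boxed_executor). Since the
    //    iterator creation moved to do_execute, no storage version is
    //    pinned at this point.

    // 3. Frontend sees the table scan stage as Scheduled and unpins the epoch.

    // 4. Compaction finishes and bumps the safe epoch (watermark) past the
    //    query epoch, because nothing pins the old version anymore.
    let new_watermark: u64 = 6934584757125120;

    // 5. do_execute finally creates the storage iterator, and the read is
    //    rejected because query_epoch < new_watermark.
    assert!(query_epoch < new_watermark);
    println!("read at epoch {query_epoch} rejected: watermark is {new_watermark}");
}
```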

chenzl25 commented 1 month ago

As I recall, our SeqScanExecutor creates iterators sequentially during execute(), which means we can't unpin the epoch right after executor creation. Instead, we should not unpin the snapshot until the end of query execution, the same way we handle lookup join.
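
One way to express that idea, sketched with hypothetical types (PinnedSnapshot and QueryExecution are stand-ins, not the real frontend types): let the query execution own the pinned snapshot as an RAII guard, so the epoch is only released when the whole query finishes.

```rust
// Hedged sketch of "unpin only at the end of query execution".
struct PinnedSnapshot {
    epoch: u64,
}

impl Drop for PinnedSnapshot {
    fn drop(&mut self) {
        // Unpinning happens only here, when the owner is dropped.
        println!("unpin epoch {}", self.epoch);
    }
}

struct QueryExecution {
    // Owning the guard keeps the epoch pinned for the whole query,
    // covering iterator creation inside do_execute().
    _snapshot: PinnedSnapshot,
}

fn main() {
    let exec = QueryExecution {
        _snapshot: PinnedSnapshot { epoch: 6934584644927488 },
    };
    // ... run all stages, including SeqScanExecutor::execute() ...
    drop(exec); // epoch is unpinned only after execution completes
}
```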