Open chenzl25 opened 10 months ago
> because a low barrier interval could cause more .. backfill iterator recreations
IIUC, the backfill loop cycle does not have to be synced with the barrier interval. So what about tuning the "loop frequency" of the backfill executor to check how much this factor matters? Also, considering that the barrier interval is a global parameter, I guess this could eventually be a less invasive optimization.
| barrier interval (ms) | MV creation time |
| --- | --- |
| 8000 | 32430.144 ms (00:32.430) |
| 10000 | 40395.891 ms (00:40.396) |
As backfill only finishes on a boundary barrier, is it possible that most of the work had already been done by 24 s and 30 s (or even earlier) for the barrier intervals of 8 s and 10 s, respectively? If so, the 2~3 s difference is not very meaningful. Perhaps we need a larger amount of data.
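The rounding effect can be checked with a bit of arithmetic: if backfill can only complete on a barrier boundary, the observed creation time is the real completion time rounded up to the next barrier. This is an illustrative helper (ignoring the sub-second slack in the measurements above), not actual RisingWave code:

```python
import math

def measured_time(actual_finish_s: float, barrier_interval_s: float) -> float:
    """Creation time observed if backfill only completes on a barrier boundary."""
    return math.ceil(actual_finish_s / barrier_interval_s) * barrier_interval_s

# With an 8 s interval, a measured ~32 s only tells us the real work
# finished somewhere in the (24 s, 32 s] window:
assert measured_time(24.1, 8.0) == 32.0
assert measured_time(31.9, 8.0) == 32.0
# With a 10 s interval, a measured ~40 s implies the (30 s, 40 s] window:
assert measured_time(30.1, 10.0) == 40.0
```

So the 8 s and 10 s runs could have done the same amount of real work, with the difference coming purely from where the boundary barrier landed.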
> IIUC, the backfill loop cycle does not have to be synced with the barrier interval. So what about tuning the "loop frequency" of the backfill executor to check how much this factor matters? Also, considering that the barrier interval is a global parameter, I guess this could eventually be a less invasive optimization.
+1. If the backfill loop interval is too small, it would hurt backfill throughput and eventually cause a high MV creation time.
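A minimal sketch of that effect, assuming each backfill loop cycle pays a small fixed cost (yielding, scheduling, state checks) on top of the per-row work; the function and all constants here are hypothetical, not measured:

```python
def backfill_seconds(total_rows: int, chunk_size: int,
                     per_row_s: float = 1e-6, per_cycle_s: float = 1e-3) -> float:
    """Toy model: total time = per-row work + a fixed cost per loop cycle."""
    cycles = -(-total_rows // chunk_size)  # ceiling division
    return total_rows * per_row_s + cycles * per_cycle_s

# A tiny chunk per cycle (i.e. a very high loop frequency) is dominated by
# the fixed per-cycle cost, while a larger chunk amortizes it:
small_chunks = backfill_seconds(12_000_000, 16)
large_chunks = backfill_seconds(12_000_000, 1024)
assert small_chunks > large_chunks
```

Under this model the loop frequency trades latency of reacting to barriers against raw backfill throughput, independently of the barrier interval itself.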
This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.
I conducted a simple experiment to answer two questions: is there an optimal barrier interval for backfilling, and can the streaming creation time get close to the batch query time?

The experiment shows that there exists an optimal barrier interval for backfilling, which can be neither too high nor too low (assuming one checkpoint per barrier): a low barrier interval causes more checkpoints, more backfill iterator recreations, and more aggregation emits, while a high barrier interval might leave backfill doing nothing while waiting for the first barrier? Not quite sure. Even the best streaming time (25 s) still has a gap to the batch query time (11 s); could we shorten it?
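The tradeoff above can be sketched with a toy cost model (all numbers are made up for illustration): each barrier adds a fixed overhead (checkpoint, iterator recreation, aggregation emit), while backfill can only complete on a barrier boundary, so the finish time rounds up to the interval:

```python
import math

def creation_time(work_s: float, interval_s: float, per_barrier_s: float) -> float:
    """Toy model: per-barrier overhead slows the work down, and the finish
    time rounds up to the next barrier boundary (requires per_barrier_s < interval_s)."""
    busy = work_s / (1 - per_barrier_s / interval_s)
    return math.ceil(busy / interval_s) * interval_s

# Hypothetical numbers: 20 s of pure backfill work, 0.5 s overhead per barrier.
times = {i: creation_time(20.0, i, 0.5) for i in (1, 2, 4, 8, 16)}
# Small intervals pay the overhead many times; large intervals pay the
# round-up to the boundary barrier, giving a U-shaped creation time.
assert times[1] > times[4] and times[16] > times[8]
```

With these made-up constants the minimum sits in the middle (4–8 s), matching the intuition that the interval can be neither too low nor too high.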
Experiment:
Table t has 12,000,000 rows, a size of about 1.9 GiB, and 32 partitions (rw_parallelism).
1 Compute Node: 2c4g
Batch:
Time: 11317.226 ms (00:11.317)
Streaming: