Support sink decoupling during backfill for CREATE SINK INTO TABLE

hzxa21 commented 2 weeks ago

Is your feature request related to a problem? Please describe.

Backfilling can backpressure upstream, causing existing streaming jobs to slow down or even get stuck. There are three cases where backfilling can happen:

1. CREATE MV
2. CREATE SINK with connector
3. CREATE SINK INTO TABLE

The current ways to mitigate the effect of backfilling on upstream (case numbers refer to the list above, in order):

SET BACKFILL_RATE_LIMIT to xxx. Supported for 1, 2, 3.
SET sink_decouple to true (default on). Supported for 2.
SET streaming_use_snapshot_backfill to true (default off, experimental now). Supported for 1.

The only effective way for 3 is to use the rate limit, which requires manual operation and an understanding of the workload before a good value can be determined. Therefore, I think we should also support sink decoupling for sink into table. This is also a prerequisite for doing serverless backfill for sink into table.

Describe the solution you'd like

There are two ways to implement sink decoupling for sink into table:

Use kv log store for SINK INTO TABLE, similar to what we did for sink with connector.
Record L0 changelog and support snapshot backfilling for SINK INTO TABLE, similar to what we did for MV.
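
To illustrate the first option, here is a minimal Python sketch of the decoupling idea. It is only a toy model, not RisingWave code: `ToyLogStore`, `run_upstream`, and the in-memory `deque` are hypothetical stand-ins for a persistent kv log store, and the assumption that a barrier can pass upstream as soon as it is appended is exactly the point under discussion in this issue.

```python
from collections import deque


class ToyLogStore:
    """Hypothetical in-memory stand-in for a persistent log store."""

    def __init__(self):
        self._log = deque()

    def append(self, item):
        # Upstream only pays for the append; it never waits for downstream.
        self._log.append(item)

    def read(self, max_items):
        # Downstream drains the log at its own pace.
        out = []
        while self._log and len(out) < max_items:
            out.append(self._log.popleft())
        return out

    def __len__(self):
        return len(self._log)


def run_upstream(store, epochs, rows_per_epoch):
    """Write rows followed by one barrier per epoch into the log.

    Returns the epochs whose barrier was buffered, which in this toy
    model means the barrier has passed on the upstream side.
    """
    passed = []
    for epoch in range(1, epochs + 1):
        for i in range(rows_per_epoch):
            store.append(("row", epoch, i))
        store.append(("barrier", epoch))
        passed.append(epoch)
    return passed


store = ToyLogStore()
passed = run_upstream(store, epochs=3, rows_per_epoch=4)
# A slow downstream applies only 5 items; the rest stays buffered.
applied = store.read(max_items=5)
```

In this sketch the upstream finishes all three epochs even though the downstream has only applied the first one, and the remaining items wait in the log.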

Describe alternatives you've considered

No response

Additional context

No response

st1page commented 2 weeks ago

The general idea LGTM. I think we need some more detailed design to ensure that the data in the log store can converge to 0
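
A rough way to state the convergence concern, as a Python sketch under simplified assumptions: constant per-tick ingest and drain rates, and a hypothetical helper `backlog_over_time` that is not part of any existing code. The backlog only shrinks to 0 when the downstream drains faster than the upstream writes.

```python
def backlog_over_time(initial_backlog, ingest_per_tick, drain_per_tick, ticks):
    """Return the log store backlog after each tick, starting with the initial value."""
    backlog = initial_backlog
    history = [backlog]
    for _ in range(ticks):
        # The backlog cannot go below zero: downstream cannot read what is not there.
        backlog = max(0, backlog + ingest_per_tick - drain_per_tick)
        history.append(backlog)
    return history


# Drain rate above ingest rate: the backlog left by backfilling converges to 0.
draining = backlog_over_time(initial_backlog=100, ingest_per_tick=10, drain_per_tick=30, ticks=6)

# Drain rate below ingest rate: the backlog grows without bound.
growing = backlog_over_time(initial_backlog=100, ingest_per_tick=30, drain_per_tick=10, ticks=3)
```

A detailed design would need some mechanism (for example, throttling upstream when the backlog is large) to guarantee the first case; which mechanism to use is left open here.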

kwannoel commented 1 week ago

This work overlaps with unaligned join (the log store executor can be used for both). I will write a design doc for this.

kwannoel commented 2 days ago

Actually why just during backfill? Shouldn't sink_decouple always let the downstream sink be decoupled from upstream?

st1page commented 2 days ago

> Actually why just during backfill? Shouldn't sink_decouple always let the downstream sink be decoupled from upstream?

Outside of the backfilling period, the downstream MV will wait for the upstream barrier to align, and there is no way to make the downstream progress faster.

kwannoel commented 10 hours ago

> > Actually why just during backfill? Shouldn't sink_decouple always let the downstream sink be decoupled from upstream?
>
> Outside of the backfilling period, the downstream MV will wait for the upstream barrier to align, and there is no way to make the downstream progress faster.

Why? If we use kv_log_store, it will just buffer the changes, and barriers can pass once these changes have been written to the log store.

st1page commented 4 hours ago

> > > Actually why just during backfill? Shouldn't sink_decouple always let the downstream sink be decoupled from upstream?
> >
> > Outside of the backfilling period, the downstream MV will wait for the upstream barrier to align, and there is no way to make the downstream progress faster.
>
> Why? If we use kv_log_store, it will just buffer the changes, and barriers can pass once these changes have been written to the log store.

In the current design of the log store for other sinks, barriers are also persisted in the log store.
If you are considering this approach, note that if the upstream sink crashes, the changes cannot be applied to the downstream table exactly once.
