cdc: records after schema change take so long to receive

risingwavelabs / risingwave

cdc: records after schema change take so long to receive #12476

Open BugenZhao opened 1 year ago

BugenZhao commented 1 year ago

https://github.com/risingwavelabs/risingwave/blob/64602acb400cc79f6641de285a186d219cf0605d/e2e_test/source/cdc_inline/alter/postgres_alter.slt#L31-L47

Since we have to rebuild the actors of the table, I suppose there is a step where we need to create a new entry in the meta's split manager. Is it possible that we have to wait for a tick interval before the new source executors can reconnect to the CDC source?

fuyufjh commented 1 year ago

@StrikeW Pls take a look

hzxa21 commented 11 months ago

@StrikeW Is this already fixed by #12868?

StrikeW commented 11 months ago

I have spotted the cause of this issue, which is not related to #12868. Will submit a PR soon.

StrikeW commented 10 months ago

Update on this issue: it takes a while to receive upstream events because we reuse the same identity when creating a new connector during ReplaceTable. In Debezium, the database.server.name property should be unique across connector instances; otherwise, Debezium retries several times to register the JMX metric. We may find a way to patch Debezium to reduce the number of retries.
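To illustrate why a reused identity collides, here is a minimal sketch using only the JDK's JMX API, not Debezium's actual code. The `example.cdc` domain, the `DummyMetrics` bean, and the server names are hypothetical; the only assumption carried over from the comment above is that the metric's object name is derived from the server name, so a second registration under the same name fails while the first is still registered.

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxNameClash {
    // A standard MBean needs a public interface named <ClassName>MBean.
    public interface DummyMetricsMBean {
        long getEventsSeen();
    }

    // Hypothetical stand-in for a connector's metrics bean.
    public static class DummyMetrics implements DummyMetricsMBean {
        @Override
        public long getEventsSeen() {
            return 0L;
        }
    }

    /** Returns true if registration succeeded, false if the name is already taken. */
    public static boolean tryRegister(String serverName) {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        try {
            // Hypothetical object name; the server name is the distinguishing key.
            ObjectName name = new ObjectName(
                "example.cdc:type=connector-metrics,context=streaming,server=" + serverName);
            mbs.registerMBean(new DummyMetrics(), name);
            return true;
        } catch (InstanceAlreadyExistsException e) {
            // This is the clash that a reused server name would trigger.
            return false;
        }
        catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRegister("rw_cdc_1"));      // first connector: true
        System.out.println(tryRegister("rw_cdc_1"));      // same identity again: false
        System.out.println(tryRegister("rw_cdc_1_gen2")); // distinct identity: true
    }
}
```

If the old connector's bean has not been unregistered by the time the replacement starts, a retry loop around a call like this would explain the delay described above.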

hzxa21 commented 10 months ago

Can we generate a unique id for database.server.name?
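One way this could look, as a rough sketch rather than RisingWave's actual fix: append a random suffix to a base name each time a connector is created. The `UniqueServerName` class, the `rw_cdc_1` base name, and the 8-character suffix are all hypothetical; only the `database.server.name` property key comes from the discussion above. Whether a changing name interacts with anything else keyed on it is not addressed in this thread and would need checking.

```java
import java.util.Properties;
import java.util.UUID;

public class UniqueServerName {
    /** Builds "<base>_<8 hex chars>" so each connector instance gets its own name. */
    public static String uniqueName(String base) {
        String suffix = UUID.randomUUID().toString().replace("-", "").substring(0, 8);
        return base + "_" + suffix;
    }

    /** Hypothetical helper: connector properties with a per-instance server name. */
    public static Properties connectorProps(String base) {
        Properties props = new Properties();
        props.setProperty("database.server.name", uniqueName(base));
        return props;
    }

    public static void main(String[] args) {
        // Two connectors built for the same table get different identities.
        Properties before = connectorProps("rw_cdc_1");
        Properties after = connectorProps("rw_cdc_1");
        System.out.println(before.getProperty("database.server.name"));
        System.out.println(after.getProperty("database.server.name"));
    }
}
```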

hzxa21 commented 6 months ago

@StrikeW Is the issue fixed?

StrikeW commented 6 months ago

I think not. Will take a look this month.

hzxa21 commented 3 days ago

@StrikeW Any updates?