risingwavelabs / risingwave


perf(cdc): improve schema scanning of the upstream CDC source #16622

Open KeXiangWang opened 4 months ago

KeXiangWang commented 4 months ago

Currently, scanning the schema of the upstream CDC source is too slow and sometimes leads to a timeout error. The issue is partially addressed in https://github.com/risingwavelabs/risingwave/pull/16598, but further optimization is still needed, including adding the relevant trigger scripts, testing, and continued code improvements. See https://github.com/risingwavelabs/risingwave/pull/16598#pullrequestreview-2042008887

A related issue is that the current error message is far from clear and needs to be improved. See https://github.com/risingwavelabs/risingwave/pull/16598#issuecomment-2099650142
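
To illustrate what a clearer timeout error could look like, here is a minimal Python sketch. It is not RisingWave or Debezium code: `scan_schema_with_timeout`, `SchemaScanTimeout`, and the `scan_fn` callback are hypothetical names, and the sketch only assumes that the schema scan can be wrapped in a callable. The idea is that the error names the step, the source, and the time budget instead of surfacing a bare timeout.

```python
import concurrent.futures
import time


class SchemaScanTimeout(Exception):
    """Hypothetical error raised when the schema scan exceeds its time budget."""


def scan_schema_with_timeout(scan_fn, source_name, timeout_s):
    """Run scan_fn() in a worker thread; raise a descriptive error on timeout."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    started = time.monotonic()
    future = executor.submit(scan_fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        elapsed = time.monotonic() - started
        # State which step failed, for which source, and against which budget.
        raise SchemaScanTimeout(
            f"scanning the schema of upstream CDC source '{source_name}' "
            f"did not finish within {timeout_s:.2f}s (waited {elapsed:.2f}s)"
        ) from None
    finally:
        # Do not block on a scan that is still running in the worker thread.
        executor.shutdown(wait=False)
```

A caller would pass the real scan as `scan_fn`. Note that the worker thread is not cancelled on timeout; this sketch only covers the error reporting, not the scan performance itself.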

StrikeW commented 4 months ago

> scanning the schema of the upstream CDC source

This step runs when we start the Debezium MySQL connector, so we may need to look into Debezium's implementation.