cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

changefeedccl: clarify changefeed retry behavior #64646

Closed stevendanna closed 1 year ago

stevendanna commented 3 years ago

CHANGEFEEDS send data to external systems (sinks) over the network. Over the life of a changefeed, attempts to send data to a sink are likely to fail with network errors, cloud permissions problems, server-level errors from the given sink, and more. For most errors, we want to retry these external requests, as the issue is most likely transient. Currently, we have 3 ways in which a changefeed may be retried:

We've hit a number of problems recently with our current retries:

Addressing all of these issues will likely require a number of changes. In recent conversations, we discussed some initial improvements we could make:

From there, we can improve our ability to recognise and specifically mark fatal errors as fatal. Improving retries in the Kafka sink may further require that we rely less on the sarama library.

Jira issue: CRDB-7181

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!