cockroachdb / replicator

replicator is a toolkit for ingesting logical replication feeds into a CockroachDB cluster
Apache License 2.0
60 stars 24 forks source link

Make the CheckConsistency run less frequently #1051

Closed ryanluu12345 closed 1 month ago

ryanluu12345 commented 1 month ago

The default should run less frequently than it currently does. This should also be configurable. The impact here is that if it runs less frequently, alerting analytics will be slower to fire leading to a bigger window for inconsistency.

ryanluu12345 commented 1 month ago

@bobvawter I've taken an initial look at this and have a high level idea of how to implement this. The current thinking I have here is to:

  1. Drill the check consistency duration inside of the factory struct inside of the stage package. The key reason for this is that we can pass in a new config object inside of stage.ProvideFactory that lets the frontends configure this via a flag.
  2. Update the newStage so that it takes in this value from the factory struct field
  3. Update the hardcoded duration in the polling logic so that it uses this drilled in value

I don't have questions about the above methodology, but I do have open questions about alternatives I'm considering:

  1. Currently we have this query running every minute. I'm wary there is an implication of reporting consistency errors slower if we dial this down to be less frequent. I'm currently debating between 1.5 minutes to 2 minutes (leaning towards the latter). What are your thoughts here? Or do you think it should be even less frequent?
  2. I see that in some of the other frontend configs (i.e. kafka, mylogical, pglogical) we already have a Staging configuration object. Originally I was thinking I could make a new config.go inside of the internal/staging/stage package, but after seeing there is already a Staging package, wondering if I should put it inside of there? The one consideration I have here is that if we do it this way, then we'll need to have stage's ProvideFactory reference the sinkprod.StagingConfig. I also don't like that the flag and its default will be defined in a package not colocated with the newStage consistency check logic.
bobvawter commented 1 month ago

I'm just about done implementing this. I'd like to present it to the customer this afternoon.

ryanluu12345 commented 1 month ago

Noted. Thanks! Please let me know when it is ready and I can review the PR.