cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

streamingccl: add even more LDR metrics to DB console #125529

Open msbutler opened 5 months ago

msbutler commented 5 months ago

After #125320, @dt and @ajstorm asked for more metrics. We should add these when we have dev time, or when we find ourselves wishing we had them during a debugging session:

Source side

Destination side

Jira issue: CRDB-39500

blathers-crl[bot] commented 5 months ago

cc @cockroachdb/disaster-recovery

msbutler commented 5 months ago

@ajstorm You also asked for the following, but I don't quite follow what you're asking for. Could you provide more context on what you were looking into and what you wanted to see?

- [ ] buffer consumption rangefeed source - amount of rangefeed buffer we're consuming at the source 
- [ ] buffer consumption producer - amount of data buffered at the producer side

ajstorm commented 4 months ago

> @ajstorm You also asked for the following, but I don't quite follow what you're asking for. Could you provide more context on what you were looking into and what you wanted to see?
>
> - [ ] buffer consumption rangefeed source - amount of rangefeed buffer we're consuming at the source
> - [ ] buffer consumption producer - amount of data buffered at the producer side

The latter might not be necessary anymore, because @dt seems to have pulled out the buffering on the producer side last night. What I was referring to on the source side is the buffer that backs this error message. IIUC, when that buffer fills, we're going to get a REASON_SLOW_CONSUMER error and have to perform a catchup scan. If that's the case, it would be good to have metrics to know how close we're getting to that limit.
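To make the request concrete, here is a minimal Go sketch of the kind of signal a buffer-consumption metric could expose. It is not the rangefeed code: `boundedBuffer`, `tryAdd`, and `utilization` are hypothetical names, and a buffered channel stands in for whatever fixed-capacity buffer the source actually uses. The idea is only that a gauge of "fraction of capacity in use" would show how close the buffer is to overflowing before an overflow actually happens.

```go
package main

import "fmt"

// boundedBuffer is a hypothetical stand-in for a fixed-capacity event buffer.
// It is an illustration only, not CockroachDB's rangefeed buffer.
type boundedBuffer struct {
	events chan string
}

// newBoundedBuffer creates a buffer holding at most capacity events.
// capacity is assumed to be greater than zero.
func newBoundedBuffer(capacity int) *boundedBuffer {
	return &boundedBuffer{events: make(chan string, capacity)}
}

// tryAdd enqueues an event without blocking. It returns false when the
// buffer is full, which is the overflow condition a metric should warn about.
func (b *boundedBuffer) tryAdd(ev string) bool {
	select {
	case b.events <- ev:
		return true
	default:
		return false
	}
}

// utilization returns the fraction of capacity currently in use (0.0 to 1.0).
// This is the value a gauge metric would report.
func (b *boundedBuffer) utilization() float64 {
	return float64(len(b.events)) / float64(cap(b.events))
}

func main() {
	buf := newBoundedBuffer(4)
	for i := 0; i < 3; i++ {
		buf.tryAdd(fmt.Sprintf("event-%d", i))
	}
	fmt.Printf("utilization: %.2f\n", buf.utilization()) // prints "utilization: 0.75"

	buf.tryAdd("event-3")
	fmt.Println("accepted:", buf.tryAdd("event-4")) // prints "accepted: false"
}
```

A gauge like this could be sampled periodically; a value that sits near 1.0 would suggest the consumer is about to fall behind.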

ajstorm commented 4 months ago

Adding a request for a metric that tracks how far behind we are when performing an initial backfill (as mentioned here).

XiaochenCui commented 3 months ago

I'll work on this issue soon.