We currently have a changefeed.flushes metric that counts the number of aggregator sink flushes and a changefeed.flush_hist_nanos metric that builds a histogram of the duration of each flush, but we don't have any logs or metrics that tell us why a flush happened.
From a quick skim of the code, it seems like there are three main reasons we flush:
We should add more metrics (or logs) to help us distinguish these and any other reasons we flush. (Maybe something like changefeed.flush.<reason>.)

Jira issue: CRDB-41843
Epic CRDB-42868
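
To make the changefeed.flush.<reason> idea above a bit more concrete, here is a rough, standalone sketch of per-reason flush counters. It does not use the actual changefeed or metric registration code: FlushReason, FlushCounters, and the reason labels example_reason_a and example_reason_b are all hypothetical placeholders, and they are not meant to stand for the three reasons mentioned above.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// FlushReason is a hypothetical label describing why a sink flush happened.
type FlushReason string

// FlushCounters keeps one counter per flush reason. It is a stand-in for
// real metric registration, which this sketch does not try to model.
type FlushCounters struct {
	prefix string
	mu     sync.Mutex
	counts map[FlushReason]int64
}

// NewFlushCounters returns an empty set of counters whose metric names
// are built as <prefix>.<reason>.
func NewFlushCounters(prefix string) *FlushCounters {
	return &FlushCounters{prefix: prefix, counts: make(map[FlushReason]int64)}
}

// Record increments the counter for the given reason. A flush call site
// would call this once per flush with the reason that triggered it.
func (c *FlushCounters) Record(r FlushReason) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[r]++
}

// Count returns the current value of the counter for the given reason.
func (c *FlushCounters) Count(r FlushReason) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[r]
}

// MetricName returns the per-reason metric name, e.g. changefeed.flush.<reason>.
func (c *FlushCounters) MetricName(r FlushReason) string {
	return c.prefix + "." + string(r)
}

// Snapshot returns "name=count" lines sorted by metric name.
func (c *FlushCounters) Snapshot() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	lines := make([]string, 0, len(c.counts))
	for r, n := range c.counts {
		lines = append(lines, fmt.Sprintf("%s.%s=%d", c.prefix, string(r), n))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	counters := NewFlushCounters("changefeed.flush")

	// Placeholder reasons only; real call sites would pass whatever
	// reason actually triggered the flush.
	counters.Record("example_reason_a")
	counters.Record("example_reason_a")
	counters.Record("example_reason_b")

	for _, line := range counters.Snapshot() {
		fmt.Println(line)
	}
}
```

Keeping the reason as a suffix on a shared prefix means the per-reason counters should sum to the existing changefeed.flushes count if every flush call site records exactly one reason, which would give a cheap consistency check.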