cockroachdb / cockroach


Sampled Query improvement on probabilistic sampling to drive more reliable data #85558

Open pransudash opened 2 years ago

pransudash commented 2 years ago

Is your feature request related to a problem? Please describe. Currently, the sampled_query events on the telemetry logging channel cover only a subset of all queries, chosen by probabilistic sampling. This keeps us from sending more than the maximum number of telemetry logging events per time interval and clogging the system. The events do carry a counter of how many queries were not sampled (skipped), but that counter does not let us determine the types or fingerprints of the skipped queries.

Take, as a very simplified example, a cluster that runs 1000 similar SELECT statements with 1 CREATE interleaved among them within a time interval. The sampled_query events for that interval might emit only one SELECT statement with skippedQueries = 1000, so we lose visibility into the CREATE.
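
To make the example concrete, here is a minimal sketch of the failure mode. It is a toy model, not CockroachDB's actual sampler: the function `sample_with_rate_limit`, the `SampledEvent` type, and the `min_gap` parameter are all hypothetical names, and the rule "emit at most one event per `min_gap` seconds" is an assumption standing in for the real limit.

```python
from dataclasses import dataclass


@dataclass
class SampledEvent:
    statement: str
    skipped_queries: int  # queries dropped since the previous emitted event


def sample_with_rate_limit(queries, min_gap):
    """Toy rate limiter: emit at most one event per `min_gap` seconds.

    `queries` is an iterable of (timestamp_seconds, statement) pairs.
    Returns the emitted events plus the count of queries skipped after
    the last emitted event.
    """
    events = []
    last_emit = None
    skipped = 0
    for ts, stmt in queries:
        if last_emit is None or ts - last_emit >= min_gap:
            events.append(SampledEvent(stmt, skipped))
            last_emit = ts
            skipped = 0
        else:
            skipped += 1  # only a number survives, not the statement text
    return events, skipped


# 1000 similar SELECTs and 1 CREATE, all inside a single interval.
workload = [(i * 0.0005, "SELECT * FROM t WHERE id = 1") for i in range(1001)]
workload[500] = (workload[500][0], "CREATE TABLE u (id INT)")

events, pending_skipped = sample_with_rate_limit(workload, min_gap=1.0)
print(len(events), events[0].statement, pending_skipped)
```

In this model, one SELECT is emitted, 1000 queries are counted as skipped, and nothing in the output indicates that a CREATE ran at all.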

Describe the solution you'd like We would like to revise the probabilistic sampling limits and potentially remove them. If that is not possible due to infrastructure concerns, which we totally understand, we would like to sample only at the statement_tag or statement_type level. It would be great to have events where skippedQueries is counted at the query fingerprint level, so that we can be confident we're capturing every fingerprint in some sense: for example, query fingerprint 1 was run N times, query fingerprint 2 was run M times, and so on.
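
As a rough illustration of the per-fingerprint counts requested here, the sketch below aggregates one interval's statements by fingerprint. Everything in it is an assumption for illustration: `fingerprint` is a simplistic regex that replaces quoted strings and numeric literals with `_`, which only approximates real statement fingerprinting, and `summarize_interval` is a hypothetical helper, not an existing CockroachDB API.

```python
import re
from collections import Counter

# Hypothetical, simplified fingerprinting: quoted strings and standalone
# numeric literals become "_". Real fingerprinting is more involved.
_LITERAL = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")


def fingerprint(stmt):
    return _LITERAL.sub("_", stmt.strip())


def summarize_interval(statements):
    """Return {fingerprint: times_run} for one interval's statements."""
    return dict(Counter(fingerprint(s) for s in statements))


# Same workload shape as the example: 1000 similar SELECTs and 1 CREATE.
statements = [f"SELECT * FROM t WHERE id = {i}" for i in range(1000)]
statements.insert(500, "CREATE TABLE u (id INT)")

summary = summarize_interval(statements)
for fp, count in summary.items():
    print(f"{count:5d}  {fp}")
```

With counts keyed this way, the number of events per interval is bounded by the number of distinct fingerprints rather than the number of queries, and the single CREATE still shows up with a count of 1.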


Jira issue: CRDB-18308

Epic: CRDB-24500

Epic: CRDB-32141

blathers-crl[bot] commented 2 years ago

Hi @pransudash, I've guessed the C-ategory of your issue and suitably labeled it. Please re-label if inaccurate.

While you're here, please consider adding an A- label to help keep our repository tidy.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

thtruo commented 1 year ago

Another motivation, from this internal Slack thread: certain special queries (e.g. to understand AWS DMS -> COPY) get missed. FYI @kevin-v-ngo

kevin-v-ngo commented 1 year ago

As issues come up, please share other 'special' queries that get missed. Thanks!