Open lunevalex opened 2 years ago
cc: @mwang1026 @maryliag @kevin-v-ngo
@lunevalex, can you help provide context on this issue? Is there an escalation you can send me?
Agree we should collect telemetry on this scenario, but as for introducing user-facing observability, I'd love to understand what actions this information would lead the user to take.
Is your feature request related to a problem? Please describe.
CockroachDB already has a number of tools to track queries that perform a full table scan and highlight to users that they should consider optimizing such queries. In a recent customer escalation, we observed a pattern in which a number of queries needed every single node in the cluster to be available in order to complete, which can be equally problematic for a customer workload. There should be a way for the customer to identify such queries, as they could be detrimental to application stability. It is not quite clear what a customer should do when they find this pattern, as it is going to be very workload dependent, but observability is a start.
Describe the solution you'd like
The ask is two-fold:
Jira issue: CRDB-16329