cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

Add prometheus alert for read amplification issues #56129

Open glennfawcett opened 3 years ago

glennfawcett commented 3 years ago

Is your feature request related to a problem? Please describe.
Ran into an issue with runaway read amplification.

Describe the solution you'd like
Add a new Prometheus alert that notifies of high read amplification sooner.
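
For illustration only, a rule along these lines might look like the sketch below. It is not taken from the repository's rules file. The group name, alert name, threshold of 50, and 10-minute window are placeholders, and it assumes the read amplification gauge is exported as `rocksdb_read_amplification` with a `store` label, which should be checked against the node's metrics endpoint before use.

```yaml
groups:
  - name: cockroachdb-storage            # hypothetical group name
    rules:
      - alert: HighReadAmplification     # hypothetical alert name
        # Assumption: the gauge is exported as rocksdb_read_amplification
        # and carries a "store" label; verify on your cluster.
        # The threshold (50) and window (10m) are placeholders, not tuned values.
        expr: max by (instance, store) (rocksdb_read_amplification) > 50
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High read amplification on {{ $labels.instance }} (store {{ $labels.store }})"
          description: "Read amplification has stayed above 50 for 10 minutes (current value: {{ $value }})."
```

The `for` clause keeps the alert from firing on short spikes, so it only triggers when read amplification stays elevated.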

[Screenshot: Screen Shot 2020-10-23 at 9 16 44 AM]

[Screenshot: Screen Shot 2020-10-29 at 2 46 25 PM]

Basically, update:

Jira issue: CRDB-3573

blathers-crl[bot] commented 3 years ago

Hi @glennfawcett, I've guessed the C-ategory of your issue and suitably labeled it. Please re-label if inaccurate.

While you're here, please consider adding an A- label to help keep our repository tidy.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

tbg commented 1 year ago

FYI, the L2 Grafana dashboard^1 has this alert (IO Admission Control):

[image]

It basically engages when the LSM is inverting (provided IO admission control is enabled).

This dashboard is something we're using for an internal test cluster. It's not a standard dashboard, but I see it as an incubator for ideas that will be picked up over time.