concourse / hush-house

Concourse k8s-based environment
https://hush-house.pivotal.io

metrics: improve Concourse dashboard and prometheus perf #76

Closed cirocosta closed 4 years ago

cirocosta commented 4 years ago

this is an incremental improvement of the readability of the concourse dashboard, focusing on showing only what matters, while still making periods configurable.

the theory behind this change is that we're usually looking for the "highest" of something (rather than all of it), and that in some cases a per-second rate (/s) is too awkward to reason about, so those rates are computed over longer periods of time instead.

ps.: neither of the changes mentioned above is hardcoded - a viewer can change those values anytime.
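
to make the "highest N over a configurable window" idea concrete, here's a minimal sketch in plain Python. it is not the dashboard's actual query: the names `rate_over` and `top_k` and the sample data are hypothetical, and unlike Prometheus' own `rate()` it ignores counter resets and does no extrapolation at the window edges.

```python
from typing import Dict, List, Tuple

# One sample of a monotonically increasing counter:
# (unix timestamp in seconds, counter value).
Sample = Tuple[float, float]


def rate_over(samples: List[Sample], window_s: float) -> float:
    """Average per-second increase over the last `window_s` seconds.

    Simplified on purpose: no counter-reset handling, no extrapolation.
    """
    if not samples:
        return 0.0
    end_t = samples[-1][0]
    in_window = [s for s in samples if s[0] >= end_t - window_s]
    if len(in_window) < 2:
        return 0.0
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


def top_k(series: Dict[str, List[Sample]], k: int, window_s: float) -> List[Tuple[str, float]]:
    """Return the `k` series with the highest rate over the window."""
    rates = {name: rate_over(points, window_s) for name, points in series.items()}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical data: three workers, sampled once a minute.
series = {
    "worker-0": [(0, 0), (60, 30), (120, 60)],
    "worker-1": [(0, 0), (60, 6), (120, 12)],
    "worker-2": [(0, 0), (60, 120), (120, 240)],
}
print(top_k(series, k=2, window_s=120))
```

both `k` and `window_s` are plain parameters here, mirroring the point that the viewer picks them rather than the dashboard hardcoding them.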

Screen Shot 2019-11-22 at 4 33 16 PM Screen Shot 2019-11-22 at 4 37 24 PM

having a 30GB disk for the Prometheus server made sense back when we had very few machines and services running on the hush-house cluster, but nowadays that is not enough, so here we bump it from 30GB to 300GB.
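
as a rough way to reason about the disk size, the Prometheus storage docs give the rule of thumb `needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample`, with roughly 1-2 bytes per sample. the sketch below just evaluates that formula; the ingestion rate used is a made-up illustrative number, not a measurement from hush-house, and `needed_disk_bytes` is a hypothetical helper name.

```python
def needed_disk_bytes(retention_s: float, samples_per_s: float, bytes_per_sample: float = 2.0) -> float:
    """Rule-of-thumb disk estimate from the Prometheus storage docs."""
    return retention_s * samples_per_s * bytes_per_sample


# Assumptions (illustrative only): 15 days of retention and
# 100k ingested samples per second at 2 bytes per sample.
retention_s = 15 * 24 * 60 * 60
estimate = needed_disk_bytes(retention_s, samples_per_s=100_000)
print(f"{estimate / 1e9:.1f} GB")
```

the estimate scales linearly with the ingestion rate, which is why trimming the set of ingested metrics matters as much as the size of the disk.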

aside from the disk bump, now we're being more careful with which metrics we consume, more specifically:

now we also have a fancier build vis

Screen Shot 2019-11-22 at 5 13 43 PM

(more info in the commits themselves)

cirocosta commented 4 years ago

aand here's the effect of whitelisting the namespaces that we care about when ingesting cadvisor samples:

Screen Shot 2019-11-22 at 5 19 33 PM
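
for reference, the filtering idea behind the namespace whitelist can be sketched like this. it imitates what a `keep` relabel rule on a `namespace` label does (Prometheus relabel regexes are fully anchored, hence `fullmatch`), but it is plain Python and not the actual scrape config from this PR; the namespace names, the sample shape and the `keep_namespaces` helper are all hypothetical.

```python
import re
from typing import Dict, Iterable, List

# A sample is represented here only by its label set.
Labels = Dict[str, str]


def keep_namespaces(samples: Iterable[Labels], allowed: List[str]) -> List[Labels]:
    """Keep only samples whose `namespace` label is in `allowed`."""
    if not allowed:
        return []
    # Build one alternation regex, escaped so names are matched literally.
    pattern = re.compile("|".join(re.escape(ns) for ns in allowed))
    return [s for s in samples if pattern.fullmatch(s.get("namespace", ""))]


# Hypothetical cadvisor-style label sets.
samples = [
    {"namespace": "ci", "container": "web"},
    {"namespace": "ci-extra", "container": "web"},
    {"namespace": "scratch", "container": "tmp"},
    {"container": "no-namespace"},
]
print(keep_namespaces(samples, allowed=["ci"]))
```

everything that doesn't match is dropped before it is stored, so the reduction shows up directly in the number of ingested samples.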

cirocosta commented 4 years ago

the changes have already been applied, moving on with the merge