concourse / hush-house

Concourse k8s-based environment
https://hush-house.pivotal.io

metrics: increase prometheus data retention #77

Closed by cirocosta 4 years ago

cirocosta commented 4 years ago

By default, the Prometheus chart configures a retention policy limited to 15 days:

## Prometheus data retention period (default if not specified is 15 days)
##
retention: "15d"

(https://github.com/helm/charts/blob/67ed74b614bb5f4e068017101a8673c63459f383/stable/prometheus/values.yaml#L841-L843)

(The change would go somewhere around here: https://github.com/concourse/hush-house/blob/a14d0832ecac5753c138a9287e12a3be375cc1a5/deployments/with-creds/metrics/values.yaml#L24)

While that has worked in our favor by letting us get away with a very small disk (30GB), the cost of not having data older than 15 days turns out to be higher, for instance when comparing how Concourse performed over a longer period of time.

I'm not entirely sure what a good number is, but since we intend to ship Concourse on a 3-week basis, it seems reasonable to me to have at least 6 weeks of retention (so that we have the data for the entire period between one deploy and the next).
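
A minimal sketch of what the override might look like. It assumes the metrics deployment forwards values to the upstream chart under a `prometheus` key (an assumption about how the dependency is wired, not confirmed here) and that the setting lives at `server.retention`, as in the upstream values.yaml linked above. `42d` is simply 6 weeks expressed in days:

```yaml
# Hypothetical fragment for deployments/with-creds/metrics/values.yaml.
# The top-level `prometheus` key is assumed to be the name under which
# the upstream chart is pulled in as a dependency.
prometheus:
  server:
    # Upstream default is "15d"; 6 weeks * 7 days = 42 days.
    retention: "42d"
```

If disk usage grows roughly linearly with retention (an assumption, not something measured here), going from 15 to 42 days would need about 42 / 15 = 2.8 times the current space, so the 30GB disk would probably have to grow along with this change.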

Thanks!

cirocosta commented 4 years ago

https://github.com/concourse/hush-house/pull/101