concourse / hush-house

Concourse k8s-based environment
https://hush-house.pivotal.io

Thoughts on bumping prometheus/grafana requirements? #62

Closed aegershman closed 4 years ago

aegershman commented 5 years ago

Hey! In order to consume the latest 'n greatest prometheus and grafana helm charts, has there been any thought given to bumping deployments/with-creds/metrics/requirements.yml?

No rush to respond, just curious if there's a specific technical reason or if it's a "if it isn't broke don't fix it" situation, which I absolutely respect. Just to be safe, currently on our k8s Concourse/prometheus/grafana deployments we're pinning to the same versions of those charts (8.10.3 & 3.3.10 respectively), and we haven't felt compelled to diverge.

Just curious; thanks for your time. Feel free to close whenever.
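
For context, here is a rough sketch of what pinning those two subcharts looks like in a Helm 2 style requirements file. The versions are the ones mentioned above; the repository URL is an assumption (the old stable charts location), not something copied from the actual hush-house file.

```yaml
# Sketch of a requirements file pinning the metrics subcharts.
# Versions come from the comment above; the repository URL is an
# assumption and may not match the real deployments/with-creds/metrics file.
dependencies:
  - name: prometheus
    version: 8.10.3
    repository: https://kubernetes-charts.storage.googleapis.com
  - name: grafana
    version: 3.3.10
    repository: https://kubernetes-charts.storage.googleapis.com
```

Bumping would then mean editing the version fields and re-running helm dependency update against the chart.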

cirocosta commented 5 years ago

Hey @aegershman ,


We've been bumping those on the basis of "oh, there's a cool new feature out there!" 😅

That's because we don't really have to do much with them - it's all about "just enough" for us to troubleshoot in case alarms go off.

Have you been exploring other ways of bumping dependencies? I remember you shared Helmfile, but I guess that doesn't help much? (I haven't looked at it yet 😬 )


Thanks!

aegershman commented 5 years ago

I totally get it, I feel similarly; I currently don't have a good way to override/bump a chart's requirements.yml subcharts, so instead what I do is deploy prometheus/grafana as separate independent deployments which are pinned to the same version declared in a chart's requirements.yml. That way they can more easily change independently... Except for Concourse's postgres chart requirement. On our concourse-web deployment, it's using the postgres subchart directly. So we might go back and redo that, but for the time being it's not a huge deal 🤷‍♂

I completely agree though: in general, if it isn't broke, don't fix it. The only reason I ask is that the greater the delta between one version and the next, the scarier it is to make a change. With prometheus/grafana it's not an earth-shattering problem if something gets horribly bonked during an upgrade & it has to be redeployed fresh, but it still takes time to debug/fix/etc.


Helmfile helps with bumping chart dependencies in one of two ways. If you're feeling risky 'n frisky, you can specify an optimistic version pattern like ~8.2.x, so each time you apply changes it takes whatever the latest available patch is. OR you can be explicitly declarative about the version used (version: 8.2.2, etc.), use helmfile diff to check for outdated versions, and stitch together a ci/cd system that creates a PR to update the version: stored in git, etc.
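
To make the two options concrete, here's a minimal helmfile.yaml sketch. The repository URL, release names, and namespace are assumptions for illustration, not taken from any real deployment; only the version strings come from this thread.

```yaml
# helmfile.yaml (sketch; repo URL, release names, and namespace are assumed)
repositories:
  - name: stable
    url: https://charts.helm.sh/stable

releases:
  # Option 1: optimistic pattern. The constraint is resolved when the
  # chart is fetched, so the newest matching 8.2 patch gets picked up.
  - name: prometheus
    namespace: metrics
    chart: stable/prometheus
    version: "~8.2.x"

  # Option 2: explicit pin. Nothing changes until this line is edited,
  # e.g. by a PR that bumps the version stored in git.
  - name: grafana
    namespace: metrics
    chart: stable/grafana
    version: "3.3.10"
```

With the explicit pin, running helmfile diff before helmfile apply shows what a version bump would change in the cluster before anything is touched.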

But I'm not sure how it helps with bumping a chart's subcharts, because I don't even know how to override those currently lol.


Regardless it ain't a big deal, was just curious. Am 100% down to keep discussing but also feel free to close this out (and the Helmfile issue) whenever 👍