concourse / prod

bosh/terraform config for our deployments
move hush-house monitoring pipelines and unmute datadog alerts #42

Closed jamieklassen closed 4 years ago

jamieklassen commented 4 years ago

The pipelines live in concourse/oxygen-mask. Think about fitting them into the reconfigure-pipelines pipeline. Also think about creating a separate worker just for running these pipelines? @kcmannem has an opinion.

xtremerui commented 4 years ago

Thinking about this: do the testing pipelines in oxygen-mask change frequently enough that they need to be put into auto-reconfigure?

xtremerui commented 4 years ago

Also, we have a monitoring pipeline under team main (for the SLIs of the old CI itself) and a monitoring pipeline under team monitoring-hush-house (for hush-house.pivotal.io). I assume this issue is only for monitoring-hush-house, and as a result there won't be SLI monitoring of the new CI anymore. Is that correct?

jamieklassen commented 4 years ago

@xtremerui I'd still put it in reconfigure-pipelines - that pipeline ends up being really nice configuration-as-code, so everybody knows what the source of truth for this environment is. It serves as good documentation, and if we ever need to lift and shift our setup again, it will all be much more reproducible. These things make adding it to the reconfigure-pipelines pipeline worthwhile to me, more so than the fact that it's changing frequently (because it's not 😅).

I don't think it makes sense for the environment to monitor itself, so yeah, let's just have it monitor hush-house. I know that's still the same cluster, so it's kinda monitoring the same infrastructure that it's using, but at least it's a separate node pool. So if the whole hush-house cluster goes down we'll still lose SLI reporting, but if just the hush-house worker pool goes down, SLIs will continue to report.

xtremerui commented 4 years ago