cloudfoundry / cf-deployment

The canonical open source deployment manifest for Cloud Foundry
Apache License 2.0

Integration of "system-metrics" configuration #1043

Closed (jochenehret closed 1 year ago)

jochenehret commented 1 year ago

Dear Community,

We would like to discuss the future integration of the system-metrics components into cf-deployment: https://github.com/cloudfoundry/system-metrics-release and https://github.com/cloudfoundry/system-metrics-scraper-release

Currently the configuration is defined in an experimental ops-file: https://github.com/cloudfoundry/cf-deployment/blob/main/operations/experimental/add-system-metrics-agent.yml

As the system-metrics components are mature, we could either promote the experimental ops file to a regular ops file (opt-in) or integrate the configuration into cf-deployment.yml and provide a new ops file to disable it (opt-out).
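To make the two options concrete, here is a rough sketch of how each would look from an operator's point of view. It only prints the `bosh deploy` commands so they can be reviewed; it does not run them. The deployment name `cf` is an assumption, and `operations/disable-system-metrics.yml` is a hypothetical name for an opt-out ops file that does not exist yet.

```shell
#!/bin/sh
# Dry-run sketch: builds and prints the deploy commands, does not execute them.

MANIFEST="cf-deployment.yml"

# Opt-in (today): the operator adds the experimental ops file explicitly.
OPT_IN_OPS="operations/experimental/add-system-metrics-agent.yml"

# Opt-out (proposal): system-metrics would be part of cf-deployment.yml and
# the operator removes it with an ops file. This file name is hypothetical.
OPT_OUT_OPS="operations/disable-system-metrics.yml"

# "cf" is an assumed deployment name.
opt_in_cmd="bosh -d cf deploy $MANIFEST -o $OPT_IN_OPS"
opt_out_cmd="bosh -d cf deploy $MANIFEST -o $OPT_OUT_OPS"

echo "opt-in:  $opt_in_cmd"
echo "opt-out: $opt_out_cmd"
```

With opt-in, consumers who do nothing get no system-metrics jobs. With opt-out, consumers who do nothing get them on the next upgrade, so anyone using a different monitoring solution would have to add the disabling ops file.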

To make a good decision, we would like to know from cf-deployment consumers

Please add a short comment to this issue. Thanks for your collaboration!

Jochen.

lodener commented 1 year ago

For better insight into VM health, we use the node-exporter-boshrelease. It provides a wider array of metrics than the system-metrics component. An explicit opt-in to the system-metrics component would therefore have our vote.

chombium commented 1 year ago

Hi,

At SAP, we use a custom, home-grown monitoring solution for system metrics based on Telegraf, so for us the promotion of the system-metrics-release and the system-metrics-scraper-release is irrelevant. Whichever integration method is chosen, we will keep the releases deactivated (or deactivate them) via a custom ops file.

I analyzed the code of both releases and "played a bit" with them, so I can list a few pros and cons of the releases:

chombium commented 1 year ago

> For a better insight into VM health, we make use of the node-exporter-boshrelease.

@lodener I see that the latest version of this BOSH release was published in 2020 and that it uses the Prometheus node-exporter v1.0.1. I'd suggest updating the release, as the latest node-exporter version is 1.5.0 ;)

Benjamintf1 commented 1 year ago

I still think there's a strong benefit to accessibility and consistency within cf with regard to metrics. In that respect, I think there's some benefit to the system-metrics release. I think there are also benefits to being able to edit and maintain the metrics according to how useful they are for cf. That said, it wouldn't be that hard to tie node-exporter into a vision that includes the consistency and accessibility I think are important.

All that said, I'd be OK with just promoting system-metrics and the scraper to a regular ops file rather than an experimental one.