cloudfoundry / cloud_controller_ng

Cloud Foundry Cloud Controller
Apache License 2.0

Fix cc.requests.outstanding.gauge when using puma web server #3841

Closed Samze closed 1 month ago

Samze commented 1 month ago

https://github.com/cloudfoundry/cloud_controller_ng/issues/1312 introduced cc.requests.outstanding.gauge, which holds its counter in memory. With the introduction of Puma there may be multiple worker processes, and each one emits its own value for this metric, so the gauge flips back and forth between the per-process values. This metric is listed as an important KPI for CAPI scaling (see https://docs.cloudfoundry.org/running/managing-cf/scaling-cloud-controller.html#cloud_controller_ng).
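
To illustrate the problem, here is a minimal sketch. The class and variable names are hypothetical and are not the Cloud Controller code; two plain Ruby objects stand in for two Puma worker processes, each holding its own in-memory counter.

```ruby
# Hypothetical per-process counter; NOT the actual Cloud Controller class.
class InMemoryOutstandingCounter
  attr_reader :value

  def initialize
    @value = 0
  end

  def start_request
    @value += 1
  end

  def complete_request
    @value -= 1
  end
end

# Two objects stand in for two Puma workers with separate memory.
worker_a = InMemoryOutstandingCounter.new
worker_b = InMemoryOutstandingCounter.new

3.times { worker_a.start_request }
1.times { worker_b.start_request }

# Each worker reports its own count under the same metric name, so a
# last-write-wins gauge alternates between the per-worker values.
emitted = [worker_a.value, worker_b.value, worker_a.value, worker_b.value]

# The number an operator actually wants is the sum across workers.
true_total = worker_a.value + worker_b.value
```

Neither emitted value matches `true_total`, which is why the gauge is misleading as a scaling signal once more than one process is running.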

When running on Puma, this fix instead uses Redis to hold the gauge.
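
A rough sketch of the idea, not the code in this PR: every worker increments and decrements one shared counter, so whichever worker emits the gauge reports the same total. The key name and class names below are hypothetical, and `FakeStore` is an in-memory stand-in so the example runs without a Redis server. It mirrors the `incr`/`decr` method names of the redis-rb client, where each call is an atomic server-side operation that returns the new integer value.

```ruby
# In-memory stand-in for a Redis client (assumption for illustration only).
class FakeStore
  def initialize
    @data = Hash.new(0)
    @lock = Mutex.new
  end

  # Atomically increment and return the new value.
  def incr(key)
    @lock.synchronize { @data[key] += 1 }
  end

  # Atomically decrement and return the new value.
  def decr(key)
    @lock.synchronize { @data[key] -= 1 }
  end
end

# Hypothetical counter backed by the shared store.
class SharedOutstandingCounter
  KEY = 'outstanding_requests'.freeze # hypothetical key name

  def initialize(store)
    @store = store
  end

  def start_request
    @store.incr(KEY)
  end

  def complete_request
    @store.decr(KEY)
  end
end

store = FakeStore.new
worker_a = SharedOutstandingCounter.new(store)
worker_b = SharedOutstandingCounter.new(store)

after_a = 3.times.map { worker_a.start_request }.last
after_b = worker_b.start_request
after_done = worker_a.complete_request
```

Because both workers share one counter, the value returned to either of them reflects requests outstanding across all processes, and that value can be emitted as the gauge.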

Inspired by https://github.com/cloudfoundry/cloud_controller_ng/commit/4539e596ab6ae64556b170e4387633e9ebd55292

An alternative we considered was to read the Prometheus metric and re-emit it to StatsD. However, we observed performance degradation, presumably because of the number of disk reads the DirectFileStore needs in order to aggregate the metric across processes, so Redis seemed the best approach.

cc @sethboyles / @pivotalgeorge


sethboyles commented 1 month ago

Code looks good, I'll do acceptance tomorrow