Reduce ETSCronFlusher memory consumption

akoutmos / prom_ex

An Elixir Prometheus metrics collection library built on top of Telemetry with accompanying Grafana dashboards

MIT License

Reduce ETSCronFlusher memory consumption #199

Closed IIILSW closed 1 year ago

IIILSW commented 1 year ago

Change description

We observed high memory consumption in the ETSCronFlusher process: a large process heap and a large number of attached reference-counted (refc) binaries.
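For readers who want to reproduce this kind of measurement, here is a minimal sketch of how a process's heap and attached refc binaries can be inspected. `MemoryProbe` and `inspect_process/1` are hypothetical names for illustration and are not part of PromEx. The sketch relies on the `:binary` item of `Process.info/2`, which the Erlang documentation describes as subject to change without notice, so treat the tuple layout as an assumption.

```elixir
# Hypothetical helper for illustration; not part of PromEx.
defmodule MemoryProbe do
  @doc "Returns heap and refc-binary statistics for a live process."
  def inspect_process(pid) when is_pid(pid) do
    case Process.info(pid, [:memory, :total_heap_size, :binary]) do
      nil ->
        {:error, :not_alive}

      info ->
        # Assumed layout of each entry: {binary_id, size_in_bytes, ref_count}.
        # Deduplicate by id in case the same binary is listed more than once.
        binaries = info |> Keyword.fetch!(:binary) |> Enum.uniq_by(&elem(&1, 0))

        {:ok,
         %{
           memory_bytes: Keyword.fetch!(info, :memory),
           total_heap_words: Keyword.fetch!(info, :total_heap_size),
           refc_binary_count: length(binaries),
           refc_binary_bytes: binaries |> Enum.map(&elem(&1, 1)) |> Enum.sum()
         }}
    end
  end
end
```

Calling `MemoryProbe.inspect_process(pid)` before and after a flush would show whether the heap size and the refc binary totals keep growing.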

In our experiments, no longer using the Exporter and hibernating the process after each cleanup reduced memory consumption under load by ~70 MB.
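The hibernation part of this change can be sketched roughly as follows. This is not the code from the PR: `HibernatingFlusher`, the `:flush_fun` option and the `:interval_ms` option are hypothetical names chosen for the example. The only mechanism it demonstrates is returning `:hibernate` from a GenServer callback, which triggers a garbage collection and shrinks the heap until the next message arrives.

```elixir
# Hypothetical periodic flusher for illustration; not the PromEx implementation.
defmodule HibernatingFlusher do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      # :flush_fun and :interval_ms are assumed option names for this sketch.
      flush_fun: Keyword.fetch!(opts, :flush_fun),
      interval_ms: Keyword.get(opts, :interval_ms, 5_000)
    }

    schedule(state.interval_ms)
    {:ok, state}
  end

  @impl true
  def handle_info(:flush, state) do
    state.flush_fun.()
    schedule(state.interval_ms)

    # Hibernating garbage-collects the process and minimizes its heap,
    # which also drops references to binaries that are no longer reachable.
    {:noreply, state, :hibernate}
  end

  defp schedule(interval_ms), do: Process.send_after(self(), :flush, interval_ms)
end
```

Hibernation costs some CPU on every wake-up, so it tends to suit processes that, like a cron-style flusher, sit idle between runs.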

What problem does this solve?

Issue number: #198

Example usage

Additional details and screenshots

For now this uses telemetry_metrics_prometheus_core from my fork, because that library also requires changes.

Checklist

[ ] I have added unit tests to cover my changes.
[x] I have added documentation to cover my changes.
[x] My changes have passed unit tests and have been tested E2E in an example project.

akoutmos commented 1 year ago

Good catch and next level debugging!! Do you think that you can open a PR to the telemetry_metrics_prometheus_core lib with your changes? Otherwise, another option would be to run the actual flushing inside of a Task. Once that Task terminates, its heap memory will be reclaimed. That way you can avoid the telemetry_metrics_prometheus_core PR.

Thoughts?
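A rough sketch of the Task-based option, under the assumption that the flush is a zero-arity function passed in by the caller. `TaskFlusher`, `:flush_fun` and `:interval_ms` are hypothetical names, and this is not necessarily how #200 implements it. The point is that the garbage is created on the Task's heap, which is freed as a whole when the Task exits.

```elixir
# Hypothetical Task-based flusher for illustration; not the PromEx implementation.
defmodule TaskFlusher do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      # :flush_fun and :interval_ms are assumed option names for this sketch.
      flush_fun: Keyword.fetch!(opts, :flush_fun),
      interval_ms: Keyword.get(opts, :interval_ms, 5_000)
    }

    schedule(state.interval_ms)
    {:ok, state}
  end

  @impl true
  def handle_info(:flush, state) do
    flush_fun = state.flush_fun

    # Run the flush in a short-lived process. Return only :ok so that no
    # large result is copied back onto this GenServer's heap.
    task =
      Task.async(fn ->
        flush_fun.()
        :ok
      end)

    :ok = Task.await(task, :infinity)

    schedule(state.interval_ms)
    {:noreply, state}
  end

  defp schedule(interval_ms), do: Process.send_after(self(), :flush, interval_ms)
end
```

Because `Task.async/1` links the Task to the caller, a crash during the flush would also take down the GenServer. Whether that is acceptable depends on how the flusher is supervised.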

akoutmos commented 1 year ago

Can you test out this branch and see if it fixes your problem? https://github.com/akoutmos/prom_ex/pull/200

IIILSW commented 1 year ago

Thank you for your reply! I tested this approach yesterday and it worked fine. Before answering, I was going to put together a more production-like test setup and investigate how the construction of large binaries in TelemetryMetricsPrometheus.Core.Exporter affects memory allocation. I'll test it and come back with results.

Your suggestion seems reasonable, and it is at least a good first step that can be taken soon. I have closed the MR in telemetry_metrics_prometheus_core. Thank you again!

akoutmos commented 1 year ago

Sounds good. I'll close this PR for now and merge in #200 in that case. Thanks!