micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0

System memory metrics #5234

Open fabcmartins opened 3 months ago

fabcmartins commented 3 months ago

The class JvmMemoryMetrics offers memory metrics, but those are all related to heap and non heap values and do not expose the actual memory limits of the machine/container on which the application is running.

It'd be an improvement to offer some additional metrics in this class, such as system.memory.max and system.memory.used.

The metrics are available in the OperatingSystemMXBean classes in the Oracle and IBM JREs, and are already used by Micrometer to expose CPU metrics in the class io.micrometer.core.instrument.binder.system.ProcessorMetrics.

My use case would be to measure my application running inside a container in Kubernetes. With these metrics I would be able to know the percentage of available memory being used by the application.

The io.micrometer.core.instrument.binder.jvm.JvmMemoryMetrics is not enough to do it. For instance, by default Metaspace size is unlimited and Micrometer returns -1 when trying to read it.
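
To illustrate the -1 case, here is a minimal sketch that uses only the JDK's standard `java.lang.management` API, not Micrometer itself. `MemoryUsage.getMax()` is documented to return -1 when the maximum is undefined, which is why a "used / max" ratio cannot be computed for such pools. The class name `MemoryPoolMaxDemo` and the helper `percentOfMax` are hypothetical names chosen for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MemoryPoolMaxDemo {

    /**
     * Returns used/max as a percentage, or NaN when max is undefined
     * (the JDK reports an undefined max as -1).
     */
    static double percentOfMax(long used, long max) {
        if (max <= 0) {
            return Double.NaN;
        }
        return 100.0 * used / max;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // With default JVM settings, Metaspace typically prints max=-1 here.
            System.out.println(pool.getName()
                    + " used=" + usage.getUsed()
                    + " max=" + usage.getMax()
                    + " percent=" + percentOfMax(usage.getUsed(), usage.getMax()));
        }
    }
}
```

Any pool that prints `max=-1` yields `NaN` for the percentage, so a dashboard built only on these values has no denominator for those pools.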

The most important missing metric is getTotalPhysicalMemorySize. Without it, there's no way to build a Grafana panel that shows the percentage of memory being used by an application.
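
As a rough sketch of where these values come from, the snippet below reads total and free physical memory from `com.sun.management.OperatingSystemMXBean` using only the JDK. It is not the proposed Micrometer binder. It assumes a HotSpot/OpenJDK-style runtime (the IBM variant of the bean is not handled), and whether the numbers reflect the container limit or the host depends on the JDK version in use. `SystemMemoryDemo` and `usedPercent` are hypothetical names. Note that this computes system-level usage, not the share used by the application alone.

```java
import java.lang.management.ManagementFactory;

public class SystemMemoryDemo {

    /** Returns the used share of total memory as a percentage, or NaN for unusable inputs. */
    static double usedPercent(long totalBytes, long freeBytes) {
        if (totalBytes <= 0 || freeBytes < 0 || freeBytes > totalBytes) {
            return Double.NaN;
        }
        return 100.0 * (totalBytes - freeBytes) / totalBytes;
    }

    // These getters are deprecated on newer JDKs in favor of getTotalMemorySize()
    // and getFreeMemorySize(); the older names are used here so the sketch also
    // compiles on JDK 8.
    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            com.sun.management.OperatingSystemMXBean sunOs =
                    (com.sun.management.OperatingSystemMXBean) os;
            long total = sunOs.getTotalPhysicalMemorySize();
            long free = sunOs.getFreePhysicalMemorySize();
            System.out.println("total=" + total + " free=" + free
                    + " usedPercent=" + usedPercent(total, free));
        } else {
            System.out.println("com.sun.management.OperatingSystemMXBean is not available");
        }
    }
}
```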

shakuzen commented 3 months ago

Thanks for opening the issue. It's surprising no one else has asked for this yet (as far as I can remember). I suppose it isn't application metrics but rather system metrics, and a solution could be to get that from outside of your application somehow. I think it'd be best to add this in a binder separate from JvmMemoryMetrics, perhaps in a SystemMemoryMetrics class. Though from the JavaDocs of the JMX methods, it sounds like they're trying to move away from the "system" naming. Would you be interested in contributing a pull request for this?

mweirauch commented 3 months ago

I haven't looked into the details of what the JDK-specific implementations return, but for the purpose of retrieving system-level process metrics I came up with https://github.com/mweirauch/micrometer-jvm-extras, which uses procfs (Linux only) to retrieve these metrics (vss, rss, swap).

I started a draft to also read cgroup memory (limit) metrics, but haven't finished the implementation.

Perhaps that is useful.
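
For readers unfamiliar with the procfs approach, here is a minimal, Linux-only sketch of the general idea. It is not the code from micrometer-jvm-extras. It parses the `VmRSS` line of `/proc/self/status` and, as an assumption about the environment, reads a cgroup v2 limit from `/sys/fs/cgroup/memory.max` (cgroup v1 uses a different file and is not covered). `ProcfsMemoryDemo`, `parseStatusBytes`, and `parseCgroupLimit` are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

public class ProcfsMemoryDemo {

    /** Finds a "Key:   123 kB" line in /proc/<pid>/status content and returns the value in bytes. */
    static OptionalLong parseStatusBytes(List<String> lines, String key) {
        String prefix = key + ":";
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                try {
                    // procfs reports these values in units of 1024 bytes
                    return OptionalLong.of(Long.parseLong(parts[0]) * 1024L);
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    /** Parses cgroup v2 memory.max content; the literal "max" means no limit is set. */
    static OptionalLong parseCgroupLimit(String content) {
        String value = content.trim();
        if (value.isEmpty() || value.equals("max")) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(value));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status");
        // Assumed cgroup v2 mount point; adjust for your environment.
        Path limit = Paths.get("/sys/fs/cgroup/memory.max");
        if (Files.isReadable(status)) {
            System.out.println("rss=" + parseStatusBytes(Files.readAllLines(status), "VmRSS"));
        }
        if (Files.isReadable(limit)) {
            String content = new String(Files.readAllBytes(limit), StandardCharsets.UTF_8);
            System.out.println("limit=" + parseCgroupLimit(content));
        }
    }
}
```

Dividing the resident set size by the cgroup limit, when both are present, gives the kind of "percentage of the container's memory used by this process" figure the issue asks about.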

fabcmartins commented 2 months ago

> Would you be interested in contributing a pull request for this?

Sure, I'm writing the code. I'm planning on exposing the 5 metrics available on the following properties:

As you already mentioned, Oracle seems to be moving from the term system to environment, so maybe I should use a different prefix.

lenin-jaganathan commented 2 months ago

> It's surprising no one else has asked for this yet (as far as I can remember). I suppose it isn't application metrics but rather system metrics and a solution could be to get that from outside of your application somehow.

There has been a bit of debate for us about where the responsibility for exposing system-level metrics lies. Should individual processes report these, or should the host (probably some host agent) be responsible for exposing this information? There are discussions in favor of both sides.