micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0
4.44k stars 978 forks source link

Why is the metric `jvm_gc_pause_seconds_count` missing in OpenJDK 11? #2837

Open peterpilgrim opened 2 years ago

peterpilgrim commented 2 years ago

This is probably not a bug more of an investigation.

I asked on Twitter, but realised quickly it was better to ask my questions here

@micrometerio Question for you: JvmGcMetrics.java and OpenJDK 11 + is there a reason, why "jvm.gc.pause" does not get reported anymore? I am trying to track down the actual JMX instrumentation records

For context, I am looking to connect to Prometheus so that Prom QL executes the following:

rate(jvm_gc_pause_seconds_sum{namespace="$namespace", instance="$instance"}[1m])/rate(jvm_gc_pause_seconds_count{namespace="$namespace", instance="$instance"}[1m])  

I don't see the metrics at all jvm_gc_pause_seconds_count generated in the controller / servlet output. So I conclude JvmGcMetrics.java is not picking up the GC JMX instrumentation beans ie. notifications at all.

My current code uses the current form

   new JvmGcMerics().bindTo(registry)

In JDK 11 the G1 collector is favoured (JDK 17 I already know this is the default), old CMS collectors are deprecated. I immediately wonder if this was simple case of the collector changing across the GC implementations / infrastructure and environment.

From reading code, JvmGcMetrics is looking into JMX Beans, can somebody point me to a decent example?

Thanks I know you are all busy guys (and gals).

shakuzen commented 2 years ago

The metric may not be present until the first gc pause happens after binding. Are you sure that a GC pause has happened after the bindTo method was called? You could force one via jconsole or programmatically with System.gc() to check.

peterpilgrim commented 2 years ago

Thanks for that idea. System.gc() worth a punt!

peterpilgrim commented 2 years ago

I put in a scheduled executor like this

        final ScheduledExecutorService executorService = Executors.newSingleThreadScheduledExecutor();
        executorService.scheduleAtFixedRate( ()-> {
            System.out.println("Execute System.gc()");
            System.gc();
        }, 60, 5, TimeUnit.SECONDS);

I don't see any jvm_gc_pause* in the output

peterpilgrim commented 2 years ago

Sorry it would help if I provided evidence:

# HELP jvm_threads_states_threads The current number of threads having NEW state
# TYPE jvm_threads_states_threads gauge
jvm_threads_states_threads{state="runnable",} 42.0
jvm_threads_states_threads{state="blocked",} 0.0
jvm_threads_states_threads{state="waiting",} 53.0
jvm_threads_states_threads{state="timed-waiting",} 48.0
jvm_threads_states_threads{state="new",} 0.0
jvm_threads_states_threads{state="terminated",} 0.0
# HELP jvm_memory_committed_bytes The amount of memory in bytes that is committed for the Java virtual machine to use
# TYPE jvm_memory_committed_bytes gauge
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'",} 5.6295424E7
jvm_memory_committed_bytes{area="heap",id="G1 Survivor Space",} 0.0
jvm_memory_committed_bytes{area="heap",id="G1 Old Gen",} 7.09885952E8
jvm_memory_committed_bytes{area="nonheap",id="Metaspace",} 3.62786816E8
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 2555904.0
jvm_memory_committed_bytes{area="heap",id="G1 Eden Space",} 1.54140672E8
jvm_memory_committed_bytes{area="nonheap",id="Compressed Class Space",} 5.2297728E7
jvm_memory_committed_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.7563648E7
# HELP system_cpu_count The number of processors available to the Java virtual machine
# TYPE system_cpu_count gauge
system_cpu_count 12.0
# HELP process_threads The number of process threads
# TYPE process_threads gauge
process_threads -1.0
# HELP jvm_gc_overhead_percent An approximation of the percent of CPU time used by GC activities over the last lookback period or since monitoring began, whichever is shorter, in the range [0..1]
# TYPE jvm_gc_overhead_percent gauge
jvm_gc_overhead_percent 0.0
# HELP process_memory_vss_bytes  
# TYPE process_memory_vss_bytes gauge
process_memory_vss_bytes -1.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{id="mapped",} 41.0
jvm_buffer_count_buffers{id="direct",} 26.0
# HELP jvm_compilation_time_ms_total The approximate accumulated elapsed time spent in compilation
# TYPE jvm_compilation_time_ms_total counter
jvm_compilation_time_ms_total{compiler="HotSpot 64-Bit Tiered Compilers",} 92605.0
# HELP jvm_threads_daemon_threads The current number of live daemon threads
# TYPE jvm_threads_daemon_threads gauge
jvm_threads_daemon_threads 99.0
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
# TYPE jvm_threads_peak_threads gauge
jvm_threads_peak_threads 170.0
# HELP jvm_memory_usage_after_gc_percent The percentage of long-lived heap pool used after the last GC event, in the range [0..1]
# TYPE jvm_memory_usage_after_gc_percent gauge
jvm_memory_usage_after_gc_percent{area="heap",pool="long-lived",} 0.0
# HELP jvm_gc_max_data_size_bytes Max size of long-lived heap memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes 2.68435456E9
# HELP jvm_threads_live_threads The current number of live threads including both daemon and non-daemon threads
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads 143.0
# HELP process_uptime_seconds The uptime of the Java virtual machine
# TYPE process_uptime_seconds gauge
process_uptime_seconds 391.486
# HELP process_memory_rss_bytes  
# TYPE process_memory_rss_bytes gauge
process_memory_rss_bytes -1.0
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'",} 5.56576E7
jvm_memory_used_bytes{area="heap",id="G1 Survivor Space",} 0.0
jvm_memory_used_bytes{area="heap",id="G1 Old Gen",} 5.04317128E8
jvm_memory_used_bytes{area="nonheap",id="Metaspace",} 3.1865936E8
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 2398592.0
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 0.0
jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space",} 3.9483016E7
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.7547264E7
# HELP disk_free_bytes Usable space for path
# TYPE disk_free_bytes gauge
disk_free_bytes{path="/",} 3.35237275648E11
# HELP disk_total_bytes Total space for path
# TYPE disk_total_bytes gauge
disk_total_bytes{path="/",} 4.99963174912E11
# HELP system_load_average_1m The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
# TYPE system_load_average_1m gauge
system_load_average_1m 3.1240234375
# HELP jvm_gc_live_data_size_bytes Size of long-lived heap memory pool after reclamation
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes 0.0
# HELP jvm_info JVM version info
# TYPE jvm_info gauge
jvm_info{runtime="OpenJDK Runtime Environment",vendor="AdoptOpenJDK",version="11.0.11+9",} 1.0
# HELP process_memory_swap_bytes  
# TYPE process_memory_swap_bytes gauge
process_memory_swap_bytes -1.0
# HELP jvm_buffer_memory_used_bytes An estimate of the memory that the Java virtual machine is using for this buffer pool
# TYPE jvm_buffer_memory_used_bytes gauge
jvm_buffer_memory_used_bytes{id="mapped",} 470265.0
jvm_buffer_memory_used_bytes{id="direct",} 129699.0
# HELP jvm_gc_memory_allocated_bytes_total Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next
# TYPE jvm_gc_memory_allocated_bytes_total counter
jvm_gc_memory_allocated_bytes_total 0.0
# HELP process_start_time_seconds Start time of the process since unix epoch.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.634745803457E9
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
# TYPE jvm_memory_max_bytes gauge
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'profiled nmethods'",} 1.22908672E8
jvm_memory_max_bytes{area="heap",id="G1 Survivor Space",} -1.0
jvm_memory_max_bytes{area="heap",id="G1 Old Gen",} 2.68435456E9
jvm_memory_max_bytes{area="nonheap",id="Metaspace",} -1.0
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 5836800.0
jvm_memory_max_bytes{area="heap",id="G1 Eden Space",} -1.0
jvm_memory_max_bytes{area="nonheap",id="Compressed Class Space",} 1.073741824E9
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 1.22912768E8
# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{id="mapped",} 470265.0
jvm_buffer_total_capacity_bytes{id="direct",} 129698.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total 318.0
# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter
jvm_gc_memory_promoted_bytes_total 0.0
# HELP jvm_classes_loaded_classes The number of classes that are currently loaded in the Java virtual machine
# TYPE jvm_classes_loaded_classes gauge
jvm_classes_loaded_classes 52720.0
shakuzen commented 2 years ago

Thanks for trying that out. Too bad it didn't resolve it. Would you be able to provide a minimal sample with instructions to run that we could use to reproduce the issue? I tried running a sample application on Java 11 and I was able to get the jvm_gc_pause_seconds_count metric in the Prometheus endpoint.

peterpilgrim commented 2 years ago

I can't give you the application. I can tell you our environment

❯ java -version
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)

secret project on  feature/ACME-276 [$] via 🐳 default via ☕ v11.0.11 on ☁️  (eu-west-2) 
❯ uname -a
Darwin AMBGB000222 19.6.0 Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64 x86_64

I used this JVM properties

-Xmx2560m  -Dcom.sun.management.jmxremote=true  -Djava.rmi.server.hostname=127.0.0.1  -Dcom.sun.management.jmxremote.port=6006  -Dcom.sun.management.jmxremote.authenticate=false  -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.local.only=false   -verbose:gc -Xloggc:/tmp/liferaygc1.log -XX:+PrintGCDetails  

So we are using the G1 garbade collector.

I would to have some example JMX bean code that reveals the collector information for the GC. I could then run in it a thread similar to this:


            final ScheduledExecutorService executorService = Executors.newSingleThreadScheduledExecutor();
            executorService.scheduleAtFixedRate( ()-> {
                System.out.printf("%s Execute System.gc()\n", new Date());
                System.gc();
            }, 15, 15, TimeUnit.SECONDS);

I just don't see these metrics reported.

shakuzen commented 2 years ago

No need to share your exact application, but if we can't otherwise figure it out, we will need some code that reproduces the issue, ideally in as minimal of a sample application as possible. Do you have the same issue if you make a new project that is as minimal as possible?

Is there anything in the logs like GC notifications will not be available?

SimSonic commented 2 years ago

Hi! I've created this issue but has been redirected to your project. Should I create separate bug or it seems to be bound with current issue?

shakuzen commented 2 years ago

@SimSonic I don't think it's related to this issue. Please make a separate issue. Though from a quick look, I would say to make sure that Micrometer is returning 0 values where you see the graph going to zero and it isn't failed scrapes from Prometheus combined with the graphs treating null as zero.

SimSonic commented 2 years ago

@shakuzen No, zeros are returned from /actuator/prometheus endpoint.

Ok, I'll create it later.

P.S. https://github.com/micrometer-metrics/micrometer/issues/2872

peterpilgrim commented 2 years ago

I don't see anything like GC Notifications will not be available at all in the logs.

/Users/m_agicuser/.sdkman/candidates/java/11.0.11.hs-adpt/bin/java -Dcatalina.base=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17 -Dcatalina.home=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17 -Dfile.encoding=UTF8 -Djava.io.tmpdir=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/temp -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Dorg.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES=false -Duser.timezone=GMT -Xmx2560m -Dcom.sun.management.jmxremote=true -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.port=6006 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -verbose:gc -Xloggc:/tmp/liferaygc1.log -XX:+PrintGCDetails -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=60538:/Applications/IntelliJ IDEA.app/Contents/bin -classpath /Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/bin/bootstrap.jar:/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/bin/tomcat-juli.jar:/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/bin/commons-daemon.jar org.apache.catalina.startup.Bootstrap start
[0.005s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/tmp/liferaygc1.log instead.
[0.005s][warning][gc] -XX:+PrintGCDetails is deprecated. Will use -Xlog:gc* instead.
[0.014s][info   ][gc] Using G1
[0.666s][info   ][gc] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 25M->6M(256M) 5.214ms
23-Nov-2021 10:43:36.670 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [359] milliseconds
[0.881s][info   ][gc] GC(1) Pause Young (Concurrent Start) (Metadata GC Threshold) 22M->10M(256M) 4.772ms
[0.881s][info   ][gc] GC(2) Concurrent Cycle
[0.887s][info   ][gc] GC(2) Pause Remark 11M->11M(256M) 1.323ms
[0.889s][info   ][gc] GC(2) Pause Cleanup 11M->11M(256M) 0.061ms
[0.890s][info   ][gc] GC(2) Concurrent Cycle 9.138ms
[1.052s][info   ][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 21M->11M(256M) 5.604ms
[1.514s][info   ][gc] GC(4) Pause Young (Normal) (G1 Evacuation Pause) 67M->24M(256M) 6.019ms
Loading jar:file:/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/webapps/ROOT/WEB-INF/lib/portal-impl.jar!/system.properties
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.liferay.petra.reflect.ReflectionUtil (file:/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/lib/ext/com.liferay.petra.reflect.jar) to method java.lang.Object.clone()
WARNING: Please consider reporting this to the maintainers of com.liferay.petra.reflect.ReflectionUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Loading jar:file:/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/webapps/ROOT/WEB-INF/lib/portal-impl.jar!/portal.properties
Loading file:/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/portal-ext.properties
2021-11-23 10:43:38.524 INFO  [main][PortalContextLoaderListener:139] JVM arguments: -Dcatalina.base=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17 -Dcatalina.home=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17 -Dfile.encoding=UTF8 -Djava.io.tmpdir=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/temp -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/Users/m_agicuser/Projects/ACME/ned/ACME-analytics-tool-liferay/bundles/liferay-ce-portal-7.3.0-ga1/tomcat-9.0.17/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Dorg.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES=false -Duser.timezone=GMT -Xmx2560m -Dcom.sun.management.jmxremote=true -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote.port=6006 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -verbose:gc -Xloggc:/tmp/liferaygc1.log -XX:+PrintGCDetails -javaagent:/Applications/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=60538:/Applications/IntelliJ IDEA.app/Contents/bin
[2.779s][info   ][gc] GC(5) Pause Young (Normal) (G1 Evacuation Pause) 170M->29M(256M) 6.218ms
[2.803s][info   ][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) 170M->29M(256M) 5.041ms
[2.820s][info   ][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 177M->29M(256M) 2.662ms
[2.837s][info   ][gc] GC(8) Pause Young (Normal) (G1 Evacuation Pause) 177M->29M(256M) 2.184ms
[2.853s][info   ][gc] GC(9) Pause Young (Normal) (G1 Evacuation Pause) 177M->29M(256M) 2.617ms
[3.105s][info   ][gc] GC(10) Pause Young (Normal) (G1 Evacuation Pause) 179M->43M(256M) 11.004ms
[4.449s][info   ][gc] GC(11) Pause Young (Concurrent Start) (Metadata GC Threshold) 182M->87M(308M) 33.203ms
[4.449s][info   ][gc] GC(12) Concurrent Cycle
[4.473s][info   ][gc] GC(12) Pause Remark 88M->87M(308M) 2.444ms
[4.489s][info   ][gc] GC(12) Pause Cleanup 88M->88M(308M) 0.092ms
[4.490s][info   ][gc] GC(12) Concurrent Cycle 41.566ms
2021-11-23 10:43:40.487 INFO  [main][DialectDetector:159] Using dialect org.hibernate.dialect.PostgreSQLDialect for PostgreSQL 13.3
2021-11-23 10:43:40.536 INFO  [main][DBInitUtil:171] Create tables and populate with default data

Everything looks tickety boo with these JVM arguments to our Liferay Application:

-Xmx2560m  -Dcom.sun.management.jmxremote=true  -Djava.rmi.server.hostname=127.0.0.1  -Dcom.sun.management.jmxremote.port=6006  -Dcom.sun.management.jmxremote.authenticate=false  -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.local.only=false   -verbose:gc -Xloggc:/tmp/liferaygc1.log -XX:+PrintGCDetails  
jonatan-ivanov commented 2 years ago

@peterpilgrim Could you please create a minimal sample app that can repro the issue so that we can troubleshoot this?

peterpilgrim commented 2 years ago

This is a liferay application. It is proprietary, so for us it definitely the wrong approach. What I need a sample JMX example that I can run inside my application? Or point me to the correct JMX resources so I can dig deeper please.

jonatan-ivanov commented 2 years ago

I did not ask for your proprietary application, I asked for a new, minimal sample. A simple and minimal (maven/gradle) project that can reproduce the issue.

This is important and it can be as simple as new JvmGcMerics().bindTo(registry) in a psvm app in a new repo or a failing test in Micrometer or a new Sample in our samples, something that we can debug and narrow down the problem.

shakuzen commented 2 years ago

This is a liferay application.

Could you take a liferay sample application that is openly available, add Micrometer to it, and see if it has the same issue you're seeing in your application? I came across this liferay-blade-samples repo, but I'm not familiar with Liferay. If you can reproduce the issue by adding Micrometer to a sample like one of the ones in that repo and share instructions on how to run it, it will be much easier for us to investigate.

peterpilgrim commented 2 years ago

Ok ok. I will take a look at this. I might be awhile ...

apadigal commented 2 years ago

Any chance, this been resolved? I have the same issue that I can't see the pause count in micrometer Prometheus endpoint.

shakuzen commented 2 years ago

@apadigal if you can provide a sample application that we can run to reproduce the issue, we can probably find the cause, but so far we have not been able to reproduce it.

kuldeepjain commented 2 years ago

I have the same issue, don't see jvm_gc_pause_seconds_count. Don't have a minimal setup to reproduce it though.

jonatan-ivanov commented 2 years ago

Could you please try if the code below produces any outputs for you?

It needs Micrometer 1.9.0 (so we can ensure it is up to date) and it should print out 20+ lines if the metric is missing (so that we can ensure that a minimal sample works on your JVM).

If this sample produces any outputs, please let us know in the comments. (You might want to run it a few times if you see any outputs since System.gc() might not trigger the GC.)

If it does not produce any outputs, the metric is there and everything works as expected. If you have an app though where you can't see the metric, could you please provide us a minimal sample project in a GitHub repo that can reproduce this issue so we can investigate this further and ensure any fix is working properly for your use case?

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        SimpleMeterRegistry registry = new SimpleMeterRegistry();
        new JvmGcMetrics().bindTo(registry);
        System.gc();
        Thread.sleep(3_000);

        if (!registry.getMetersAsString().contains("jvm.gc.pause")) {
            System.out.println(registry.getMetersAsString());

            System.getProperties().entrySet().stream()
                .filter(entry -> String.valueOf(entry.getKey()).contains("java"))
                .forEach(System.out::println);

            List<GarbageCollectorMXBean> gcMXBeans = ManagementFactory.getGarbageCollectorMXBeans();
            for (int i = 0; i < gcMXBeans.size(); i++) {
                GarbageCollectorMXBean gcMXBean = gcMXBeans.get(i);
                System.out.println("gc." + i + ".name=" + gcMXBean.getName());
                System.out.println("gc." + i + ".objectName=" + gcMXBean.getObjectName());
                System.out.println("gc." + i + ".collectionCount=" + gcMXBean.getCollectionCount());
                System.out.println("gc." + i + ".collectionTime=" + gcMXBean.getCollectionTime());
                System.out.println("gc." + i + ".memoryPoolNames=" + Arrays.toString(gcMXBean.getMemoryPoolNames()));
            }
        }
    }
}
peterpilgrim commented 2 years ago

I ran this version on my local MBP with OpenJDK Temurin

 11.0.14      | tem     | local only | 11.0.14-tem

My version of the program is:

package uk.co.xenonique.client.investigation;

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class InstrumentationGCPauseDemo {
    public static void main(String[] args) throws InterruptedException {
        SimpleMeterRegistry registry = new SimpleMeterRegistry();
        new JvmGcMetrics().bindTo(registry);

        System.out.println("======================================================================");
        System.out.println("Suggest running Garbage Collector");
        System.gc();
        System.out.println("sleep ....");
        Thread.sleep(3_000);
        System.out.println("======================================================================");
        System.out.println(registry.getMetersAsString());
        System.out.println("======================================================================");

        if (!registry.getMetersAsString().contains("jvm.gc.pause")) {

            System.out.println("Does not contain jvm.gc.pause");

            System.getProperties().entrySet().stream()
                    .filter(entry -> String.valueOf(entry.getKey()).contains("java"))
                    .forEach(System.out::println);

            List<GarbageCollectorMXBean> gcMXBeans = ManagementFactory.getGarbageCollectorMXBeans();
            for (int i = 0; i < gcMXBeans.size(); i++) {
                GarbageCollectorMXBean gcMXBean = gcMXBeans.get(i);
                System.out.println("gc." + i + ".name=" + gcMXBean.getName());
                System.out.println("gc." + i + ".objectName=" + gcMXBean.getObjectName());
                System.out.println("gc." + i + ".collectionCount=" + gcMXBean.getCollectionCount());
                System.out.println("gc." + i + ".collectionTime=" + gcMXBean.getCollectionTime());
                System.out.println("gc." + i + ".memoryPoolNames=" + Arrays.toString(gcMXBean.getMemoryPoolNames()));
            }
        }
    }
}

BTW I am compiling and building Micrometer v1.9.0

I get this output when I run the litte program InstrumentationGCPauseDemo

> Task :InstrumentationGCPauseDemo.main()
16:55:41.507 [main] DEBUG io.micrometer.core.util.internal.logging.InternalLoggerFactory - Using SLF4J as the default logging framework
======================================================================
Suggest running Garbage Collector
sleep ....
======================================================================
jvm.gc.live.data.size(GAUGE)[]; value=2074184.0 bytes
jvm.gc.max.data.size(GAUGE)[]; value=4.294967296E9 bytes
jvm.gc.memory.allocated(COUNTER)[]; count=7340032.0
jvm.gc.memory.promoted(COUNTER)[]; count=2074184.0
jvm.gc.pause(TIMER)[action='end of major GC', cause='System.gc()']; count=1.0, total_time=0.007 seconds, max=0.007 seconds
======================================================================

So obviously the metric is there. I guess is it pausing stuff. We need a longer running program.

I had another thought: is this a JVM flag issue? https://chriswhocodes.com/

jonatan-ivanov commented 2 years ago

@peterpilgrim Thank you for trying this out. In order to narrow down the difference between you app in prod and your local demo, I have a few ideas:

owaiscs001 commented 9 months ago

@jonatan-ivanov I'm facing the same issue I tried the above program and it produced an output. I'm using Micrometer 1.9.16. Maybe some JVM args are missing. Can you provide some help here, what to do next?

System.out.println(registry.getMetersAsString()); This is empty.

Other parameters are giving below value: i 1, gcMXBean.getName() G1 Old Generation, gcMXBean.getObjectName() java.lang:type=GarbageCollector,name=G1 Old Generation, gcMXBean.getCollectionTime() 12801, Arrays.toString(gcMXBean.getMemoryPoolNames()) [G1 Eden Space, G1 Survivor Space, G1 Old Gen]

shakuzen commented 9 months ago

@owaiscs001 ideally if you could provide a minimal sample application with instructions on how to run it to reproduce the issue, we could likely find the cause. Without that, it's not so easy to say. What JRE are you using? What arguments are you passing? Are you running the application in an application server? If so, which?

owaiscs001 commented 9 months ago

@shakuzen I have a spring boot application running on the Kubernetes cluster. Java(TM) SE Runtime Environment (build 17.0.9+11-LTS-201) java VM: HotSpot Spring boot version: 2.7.17

JVM parameters I pass in the application: -XX:+UseStringDeduplication -XX:+AlwaysPreTouch -Xloggc:/tmp/app_gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/app_heap_dump.bin

Since my application size is large and even needs many things to run a sample program. I was hoping to get something to try, which can help in narrowing the issue.

Any help here would be appreciated.

owaiscs001 commented 9 months ago

Also, I'm able to fetch only jvm_gc_overhead_percent metric. No other jvmgc* metric is present.

github-actions[bot] commented 8 months ago

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.