mgm3746 / garbagecat

Parses Java garbage collection logging and analyzes collectors, safepoint triggers, JVM version, JVM options, and OS information and reports error/warn/info level analysis and recommendations to support JVM tuning and troubleshooting for OpenJDK derivatives: (e.g. Adoptium, Azul, Microsoft, Oracle, Red Hat, etc.).
https://github.com/mgm3746/garbagecat/wiki
Eclipse Public License 1.0
211 stars 50 forks source link

UNIDENTIFIED LOG LINES JDK17 Corretto running G1 #203

Closed voytec0xD closed 7 months ago

voytec0xD commented 9 months ago

There were more that 1000, mostly 'debug', but also 'info' level.

[2024-02-01T11:09:50.737+0000][0.011s][debug][gc ] ConcGCThreads: 3 offset 29 [2024-02-01T11:09:50.737+0000][0.011s][debug][gc ] ParallelGCThreads: 13 [2024-02-01T11:09:50.737+0000][0.011s][debug][gc ] Initialize mark stack with 4096 chunks, maximum 524288 [2024-02-01T11:09:50.738+0000][0.011s][debug][gc,ergo,heap ] Expand the heap. requested expansion amount: 19327352832B expansion amount: 19327352832B [2024-02-01T11:09:50.761+0000][0.034s][debug][gc,heap,region] Activate regions [0, 1152) [2024-02-01T11:09:50.761+0000][0.034s][debug][gc,ihop ] Target occupancy update: old: 0B, new: 19327352832B [2024-02-01T11:09:50.761+0000][0.034s][debug][gc,ergo,refine] Initial Refinement Zones: green: 3328, yellow: 9984, red: 16640, min yellow size: 6656 [2024-02-01T11:09:50.761+0000][0.034s][debug][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (register) [2024-02-01T11:09:50.761+0000][0.034s][debug][gc,task ] G1 Service Thread (Periodic GC Task) (register) [2024-02-01T11:09:50.761+0000][0.034s][info ][gc,init ] Version: 17.0.10+7-LTS (release) [2024-02-01T11:09:50.761+0000][0.035s][info ][gc,metaspace ] CDS archive(s) not mapped [2024-02-01T11:09:50.761+0000][0.035s][info ][gc,metaspace ] Compressed class space mapped at: 0x0000000800000000-0x0000000840000000, reserved size: 1073741824 [2024-02-01T11:09:50.788+0000][0.062s][debug][gc,task,start ] G1 Service Thread (Remembered Set Sampling Task) (run) [2024-02-01T11:09:50.788+0000][0.062s][debug][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (run) 0.049ms (cpu: 0.000ms) [2024-02-01T11:09:50.788+0000][0.062s][debug][gc,task,start ] G1 Service Thread (Periodic GC Task) (run) [2024-02-01T11:09:50.789+0000][0.062s][debug][gc,task ] G1 Service Thread (Periodic GC Task) (run) 0.006ms (cpu: 0.000ms) [2024-02-01T11:09:51.038+0000][0.311s][debug][gc,task,start ] G1 Service Thread (Remembered Set Sampling Task) (run) [2024-02-01T11:09:51.038+0000][0.311s][debug][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (run) 0.047ms (cpu: 0.000ms) [2024-02-01T11:09:51.338+0000][0.611s][debug][gc,task,start ] G1 Service Thread (Remembered Set Sampling Task) (run) [2024-02-01T11:09:51.338+0000][0.611s][debug][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (run) 0.040ms (cpu: 0.000ms)

Run with java -jar ./target/garbagecat-4.0.2-SNAPSHOT.jar -p -j "-Xms18432m -Xmx18432m -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:+ParallelRefProcEnabled -XX:+PreserveFramePointer -XX:+UseG1GC -XX:+UseNUMA -XX:+UseStringDeduplication -XX:-AlwaysPreTouch -XX:CodeCacheExpansionSize=1048576 -XX:CompileThreshold=3000 -XX:ErrorFile=/somefolder/hs_err_pid%p.log -XX:HeapDumpPath=/somefolder -XX:InitialCodeCacheSize=512m -XX:ReservedCodeCacheSize=1024" <FILE>

GC log config: -Xlog:gc*=debug,safepoint,age*:file=/somepath/gclog.log:hostname,utctime,uptime,level,tags:filecount=10,filesize=1G (Note: I had to manually remove the hostname from the logs at that also caused parsing issues)

jdk17_gclog.zip

mgm3746 commented 9 months ago

Hi,

Thank you for reporting this. It will take a while to weed out the debug logging. So far, it looks like it's not affecting the analysis. I expect it will all end up as "throw-away" logging and is not impacting the analysis.

mgm3746 commented 9 months ago

I had to manually remove the hostname from the logs at that also caused parsing issues

What's the use case for including the hostname? Is it not sufficient to just include the hostname in the file name? For example, do you send all your logging to a logging aggregator, and then extract it based on hostname?

voytec0xD commented 9 months ago

Thanks for looking into it. If the results of the analysis are unaffected by those unidentified log lines then it's all good for us. As for the hostname I think we don't need it. It had just been there in the original GC log config and it caused the parsing to fail. Most likely due to introducing a new "[ ] column". So my goal here was to let you know that adding some log decorators can break the parsing.

czw., 8 lut 2024, 22:06 użytkownik Mike Millson @.***> napisał:

I had to manually remove the hostname from the logs at that also caused parsing issues

What's the use case for including the hostname? Is it not sufficient to just include the hostname in the file name? For example, do you send all your logging to a logging aggregator, and then extract it based on hostname?

— Reply to this email directly, view it on GitHub https://github.com/mgm3746/garbagecat/issues/203#issuecomment-1934932477, or unsubscribe https://github.com/notifications/unsubscribe-auth/BF5IUZRG3LH56HWRU4CHQIDYSU44DAVCNFSM6AAAAABC2BOBUOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMZUHEZTENBXG4 . You are receiving this because you authored the thread.Message ID: @.***>

mgm3746 commented 7 months ago

Can you provide clean logging? The logging you attached doesn't look like real-world logging.

1) The uptime is wrong. There is a "6" where there should be an "s". For example "1769.3366" when it should be "1769.336s":

[2024-02-01T12:12:09.096+0000][1769.3366][info ][safepoint ] Safepoint "Cleanup", Time since last: 1000107405 ns, Reaching safepoint: 96408 ns, Cleanup: 10096 ns, At safepoint: 22545 ns, Total: 129049 ns

2) There are thousands of thread dumps:

$ grep -E "ThreadDump" Users/wstolarz/Downloads/20240201_121217359896_i-0323651635d7d2aa1_jdk17_gclog.log | wc -l 1562

3) There are many instances of time warp, where 2 safepoint events overlap:

[2024-02-01T12:06:32.817+0000][1433.056s][info ][safepoint ] Safepoint "Cleanup", Time since last: 1000090093 ns, Reaching safepoint: 102491 ns, Cleanup: 14243 ns, At safepoint: 23817 ns, Total: 140551 ns [2024-02-01T12:06:32.823+0000][1433.056s][info ][safepoint ] Safepoint "ThreadDump", Time since last: 1066038 ns, Reaching safepoint: 48658 ns, Cleanup: 2997 ns, At safepoint: 4358126 ns, Total: 4409781 ns

I'm not sure if this is due to a stressed out JVM causing logging threading issues or mixed logging, but it's not normal for unified logging.

mgm3746 commented 7 months ago

I removed the ThreadDump safepoints and addressed all the debug logging patterns. Please test and let me know if you find any additional unrecognized debug logging. If you still see the TimeWarpException in clean logging, please open a separate issue for that.

mgm3746 commented 7 months ago

If you find any issues with the implementation, please re-open.