Open stevenshim opened 6 months ago
Hi @stevenshim - apologies for the inconvenience caused here and thanks for reporting the error. Could you give a little more information please?
/opt/goods-detail-api/hs_err_pid1.log
and the siginfo line just below it@stevenshim I suspect that you must be running the live heap profiler (DD_PROFILING_DDPROF_LIVEHEAP_ENABLED: true
or -Ddd.profiling.ddprof.liveheap.enabled=true
) - disabling this should be enough until we have a fix.
Hello @stevenshim - the issue was identified and a fix is in progress (https://github.com/DataDog/java-profiler/pull/88)
Thanks again for reporting this!
Is that issue happening only when Ddd.profiling.ddprof.liveheap.enabled=true
? I've got 2 JVM crashes a few hours after upgrading to 1.33.0 that also say The crash happened outside the Java Virtual Machine in native code.
(there's not much more detail though), but we didn't enable this flag.
I was able to replicate and save a crash dump. Maybe it's a different issue.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f5d72c0a43b, pid=1, tid=165464
#
# JRE version: OpenJDK Runtime Environment Temurin-21.0.3+9 (21.0.3+9) (build 21.0.3+9-LTS)
# Java VM: OpenJDK 64-Bit Server VM Temurin-21.0.3+9 (21.0.3+9-LTS, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# C [ld-musl-x86_64.so.1+0x5943b] strlen+0xd
#
# Core dump will be written. Default location: //core.1
#
# JFR recording file will be written. Location: //hs_err_pid1.jfr
#
# If you would like to submit a bug report, please visit:
# https://github.com/adoptium/adoptium-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------- S U M M A R Y ------------
Command Line: -XX:MaxRAMPercentage=70.0 -XX:InitialRAMPercentage=70.0 -XX:-UseCompressedOops -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/coredump -XX:OnError=/app/scripts-deploy/jvm-crash-handler.sh -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=9900 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.rmi.port=9900 -Djava.rmi.server.hostname=172.23.81.106 -javaagent:/app/external/dd-java-agent.jar -Ddd.profiling.enabled=true -Ddd.integration.zio.experimental.enabled=true -XX:+PrintCodeCache -XX:ProfiledCodeHeapSize=512m -XX:NonProfiledCodeHeapSize=1024m -XX:NonNMethodCodeHeapSize=32m -XX:ReservedCodeCacheSize=1568m -XX:+UseZGC -XX:+ZGenerational -Duser.timezone=Asia/Seoul -Dio.netty.allocator.numHeapArenas=0 -Ddd.integration.throwables.enabled=false --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.sql/java.sql=ALL-UNNAMED com.devsisters.ck.CkMain
Host: Intel(R) Xeon(R) Platinum 8488C, 16 cores, 110G, Alpine Linux v3.19
Time: Thu May 9 13:38:28 2024 KST elapsed time: 76054.516656 seconds (0d 21h 7m 34s)
--------------- T H R E A D ---------------
Current thread (0x00007f571bf42040): JavaThread "SIGTERM handler" daemon [_thread_in_native, id=165464, stack(0x00007f56bd317000,0x00007f56bd417a78) (1026K)]
Stack: [0x00007f56bd317000,0x00007f56bd417a78], sp=0x00007f56bd416748, free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [ld-musl-x86_64.so.1+0x5943b] strlen+0xd
C [libjavaProfiler11068430143537826157.so+0x29217] Recording::writeSettings(Buffer*, Arguments&)+0xd7
C [libjavaProfiler11068430143537826157.so+0x35a49] Recording::switchChunk(int)+0xf9
C [libjavaProfiler11068430143537826157.so+0x35b2f] FlightRecorder::dump(char const*, int)+0x6f
C [libjavaProfiler11068430143537826157.so+0x457d0] Profiler::dump(char const*, int)+0x200
C [libjavaProfiler11068430143537826157.so+0x245c7] Java_com_datadoghq_profiler_JavaProfiler_dump0+0x47
J 323292 com.datadoghq.profiler.JavaProfiler.dump0(Ljava/lang/String;)V (0 bytes) @ 0x00007f5d31c718ff [0x00007f5d31c71820+0x00000000000000df]
J 323289 c1 com.datadog.profiling.ddprof.DatadogProfilerRecording.snapshot(Ljava/time/Instant;Ldatadog/trace/api/profiling/ProfilingSnapshot$Kind;)Ldatadog/trace/api/profiling/RecordingData; (44 bytes) @ 0x00007f5d0b178be4 [0x00007f5d0b1788a0+0x0000000000000344]
J 323287 c1 com.datadog.profiling.controller.ddprof.DatadogProfilerOngoingRecording.snapshot(Ljava/time/Instant;Ldatadog/trace/api/profiling/ProfilingSnapshot$Kind;)Ldatadog/trace/api/profiling/RecordingData; (16 bytes) @ 0x00007f5d0a72d8d4 [0x00007f5d0a72d6e0+0x00000000000001f4]
J 260563 c1 com.datadog.profiling.agent.CompositeController$CompositeOngoingRecording$$Lambda+0x00007f58c81d9338.apply(Ljava/lang/Object;)Ljava/lang/Object; (16 bytes) @ 0x00007f5d1469d84c [0x00007f5d1469d5e0+0x000000000000026c]
J 4274 c2 java.util.stream.ReferencePipeline$3$1.accept(Ljava/lang/Object;)V java.base@21.0.3 (23 bytes) @ 0x00007f5d28ec3d18 [0x00007f5d28ec3ca0+0x0000000000000078]
J 47514 c2 java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Ljava/util/function/Consumer;)V java.base@21.0.3 (127 bytes) @ 0x00007f5d2b963e38 [0x00007f5d2b963d40+0x00000000000000f8]
J 32061 c2 java.util.stream.ReferencePipeline.collect(Ljava/util/stream/Collector;)Ljava/lang/Object; java.base@21.0.3 (124 bytes) @ 0x00007f5d2a9254ec [0x00007f5d2a925180+0x000000000000036c]
J 323261 c1 com.datadog.profiling.agent.CompositeController$CompositeOngoingRecording.snapshot(Ljava/time/Instant;Ldatadog/trace/api/profiling/ProfilingSnapshot$Kind;)Ldatadog/trace/api/profiling/RecordingData; (12 bytes) @ 0x00007f5d0a4c4754 [0x00007f5d0a4c4000+0x0000000000000754]
J 323260 c1 com.datadog.profiling.controller.ProfilingSystem$SnapshotRecording.snapshot(Z)V (144 bytes) @ 0x00007f5d07f3b784 [0x00007f5d07f3b540+0x0000000000000244]
j com.datadog.profiling.controller.ProfilingSystem.shutdown(Z)V+29
j com.datadog.profiling.agent.ProfilingAgent.shutdown(Lcom/datadog/profiling/controller/ProfilingSystem;Lcom/datadog/profiling/uploader/ProfileUploader;Z)V+17
j com.datadog.profiling.agent.ProfilingAgent.shutdown(Z)V+7
j java.lang.invoke.LambdaForm$DMH+0x00007f58cc09e800.invokeStatic(Ljava/lang/Object;I)V+10 java.base@21.0.3
j java.lang.invoke.LambdaForm$MH+0x00007f58cc09f800.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+33 java.base@21.0.3
J 45127 c2 java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; java.base@21.0.3 (108 bytes) @ 0x00007f5d2b6454b8 [0x00007f5d2b644f80+0x0000000000000538]
j datadog.trace.bootstrap.Agent.shutdownProfilingAgent(Z)V+64
j datadog.trace.bootstrap.Agent.shutdown(Z)V+15
j datadog.trace.bootstrap.instrumentation.shutdown.ShutdownHelper.shutdownAgent()V+11
j java.lang.Shutdown.runHooks()V+0 java.base@21.0.3
j java.lang.Shutdown.exit(I)V+12 java.base@21.0.3
j java.lang.Terminator$1.handle(Ljdk/internal/misc/Signal;)V+8 java.base@21.0.3
j jdk.internal.misc.Signal$1.run()V+8 java.base@21.0.3
j java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5 java.base@21.0.3
j java.lang.Thread.run()V+19 java.base@21.0.3
v ~StubRoutines::call_stub 0x00007f5d26ccbcbf
V [libjvm.so+0x912765] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x2e5
V [libjvm.so+0x9140ab] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, JavaThread*)+0x1cb
V [libjvm.so+0x9ee64a] thread_entry(JavaThread*, JavaThread*)+0x8a
V [libjvm.so+0x928ccf] JavaThread::thread_main_inner() [clone .part.0]+0xaf
V [libjvm.so+0xf82c98] Thread::call_run()+0xa8
V [libjvm.so+0xd0b23a] thread_native_entry(Thread*)+0x12a
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 323292 com.datadoghq.profiler.JavaProfiler.dump0(Ljava/lang/String;)V (0 bytes) @ 0x00007f5d31c71887 [0x00007f5d31c71820+0x0000000000000067]
J 323289 c1 com.datadog.profiling.ddprof.DatadogProfilerRecording.snapshot(Ljava/time/Instant;Ldatadog/trace/api/profiling/ProfilingSnapshot$Kind;)Ldatadog/trace/api/profiling/RecordingData; (44 bytes) @ 0x00007f5d0b178be4 [0x00007f5d0b1788a0+0x0000000000000344]
J 323287 c1 com.datadog.profiling.controller.ddprof.DatadogProfilerOngoingRecording.snapshot(Ljava/time/Instant;Ldatadog/trace/api/profiling/ProfilingSnapshot$Kind;)Ldatadog/trace/api/profiling/RecordingData; (16 bytes) @ 0x00007f5d0a72d8d4 [0x00007f5d0a72d6e0+0x00000000000001f4]
J 260563 c1 com.datadog.profiling.agent.CompositeController$CompositeOngoingRecording$$Lambda+0x00007f58c81d9338.apply(Ljava/lang/Object;)Ljava/lang/Object; (16 bytes) @ 0x00007f5d1469d84c [0x00007f5d1469d5e0+0x000000000000026c]
J 4274 c2 java.util.stream.ReferencePipeline$3$1.accept(Ljava/lang/Object;)V java.base@21.0.3 (23 bytes) @ 0x00007f5d28ec3d18 [0x00007f5d28ec3ca0+0x0000000000000078]
J 47514 c2 java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Ljava/util/function/Consumer;)V java.base@21.0.3 (127 bytes) @ 0x00007f5d2b963e38 [0x00007f5d2b963d40+0x00000000000000f8]
J 32061 c2 java.util.stream.ReferencePipeline.collect(Ljava/util/stream/Collector;)Ljava/lang/Object; java.base@21.0.3 (124 bytes) @ 0x00007f5d2a9254ec [0x00007f5d2a925180+0x000000000000036c]
J 323261 c1 com.datadog.profiling.agent.CompositeController$CompositeOngoingRecording.snapshot(Ljava/time/Instant;Ldatadog/trace/api/profiling/ProfilingSnapshot$Kind;)Ldatadog/trace/api/profiling/RecordingData; (12 bytes) @ 0x00007f5d0a4c4754 [0x00007f5d0a4c4000+0x0000000000000754]
J 323260 c1 com.datadog.profiling.controller.ProfilingSystem$SnapshotRecording.snapshot(Z)V (144 bytes) @ 0x00007f5d07f3b784 [0x00007f5d07f3b540+0x0000000000000244]
j com.datadog.profiling.controller.ProfilingSystem.shutdown(Z)V+29
j com.datadog.profiling.agent.ProfilingAgent.shutdown(Lcom/datadog/profiling/controller/ProfilingSystem;Lcom/datadog/profiling/uploader/ProfileUploader;Z)V+17
j com.datadog.profiling.agent.ProfilingAgent.shutdown(Z)V+7
j java.lang.invoke.LambdaForm$DMH+0x00007f58cc09e800.invokeStatic(Ljava/lang/Object;I)V+10 java.base@21.0.3
j java.lang.invoke.LambdaForm$MH+0x00007f58cc09f800.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+33 java.base@21.0.3
J 45127 c2 java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; java.base@21.0.3 (108 bytes) @ 0x00007f5d2b6454b8 [0x00007f5d2b644f80+0x0000000000000538]
j datadog.trace.bootstrap.Agent.shutdownProfilingAgent(Z)V+64
j datadog.trace.bootstrap.Agent.shutdown(Z)V+15
j datadog.trace.bootstrap.instrumentation.shutdown.ShutdownHelper.shutdownAgent()V+11
j java.lang.Shutdown.runHooks()V+0 java.base@21.0.3
j java.lang.Shutdown.exit(I)V+12 java.base@21.0.3
j java.lang.Terminator$1.handle(Ljdk/internal/misc/Signal;)V+8 java.base@21.0.3
j jdk.internal.misc.Signal$1.run()V+8 java.base@21.0.3
j java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5 java.base@21.0.3
j java.lang.Thread.run()V+19 java.base@21.0.3
v ~StubRoutines::call_stub 0x00007f5d26ccbcbf
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x00007f58030185e4
Registers:
RAX=0x00007f58030185e4, RBX=0x000000000000218a, RCX=0x000000007265746c, RDX=0x0000000000000006
RSP=0x00007f56bd416748, RBP=0x00007f56bd416790, RSI=0x0000000000000006, RDI=0x00007f58030185e4
R8 =0x0000000000002182, R9 =0x0000000000002171, R10=0x0000000000000009, R11=0x0000000000000246
R12=0x00007f57fcd6a010, R13=0x00007f57fcd6a010, R14=0x0000000000002171, R15=0x00007f58030185e4
RIP=0x00007f5d72c0a43b, EFLAGS=0x0000000000010202, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
TRAPNO=0x000000000000000e
Hi @ghostdogpr that looks like another issue, apologies for the inconvenience and thanks for the backtrace. I recommend downgrading until we've figured out the cause (we'll update you).
Agent Version dd-java-agent:1.33.0
Describe what happened The JVM intermittently crashes and logs the following error.
Additional environment details Java VM: OpenJDK 64-Bit Server VM Corretto-17.0.6.10.1 AWS EKS 1.25
Log
Hello, DataDog teams. I have been facing java process crashing after update dd-agent from 1.32.0 to 1.33.0.
I guess this prevention (https://github.com/DataDog/java-profiler/pull/84) might cause SIGSEGV. I already opened a issue on java-profiler project. (https://github.com/DataDog/java-profiler/issues/87)
I'm always grateful.