grafana / pyroscope-java

pyroscope java integration
Apache License 2.0

SIGBUS on instrumentation on M2 CPUs #133

Closed adrianlyjak closed 6 months ago

adrianlyjak commented 6 months ago

We recently added the pyroscope java agent to the application my company runs. Developers running M2 Macs have reported that the instrumentation crashes their JVM (Java 17) with dumps like this:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0x0000000147d354e4, pid=18543, tid=215043
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.7+7 (17.0.7+7) (build 17.0.7+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (17.0.7+7, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64)
# Problematic frame:
# v  ~StubRoutines::SafeFetch32
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/name/blah/blah/hs_err_pid18543.log
Compiled method (n/a)   54372  822     n 0       java.io.UnixFileSystem::getBooleanAttributes0 (native)
 total in heap  [0x00000001482ffb90,0x00000001482ffef8] = 872
 relocation     [0x00000001482ffce8,0x00000001482ffd10] = 40
 main code      [0x00000001482ffd40,0x00000001482ffef0] = 432
 metadata       [0x00000001482ffef0,0x00000001482ffef8] = 8
[thread 266499 also had an error]
Compiled method (n/a)   54374  822     n 0       java.io.UnixFileSystem::getBooleanAttributes0 (native)
 total in heap  [0x00000001482ffb90,0x00000001482ffef8] = 872
 relocation     [0x00000001482ffce8,0x00000001482ffd10] = 40
 main code      [0x00000001482ffd40,0x00000001482ffef0] = 432
 metadata       [0x00000001482ffef0,0x00000001482ffef8] = 8
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

Interestingly, I'm running an M1 Mac, and it works fine.

Let me know if there's more information I can provide that would be helpful.

korniltsev commented 6 months ago

This looks like a JVM bug. More info here: https://github.com/async-profiler/async-profiler/issues/747

It was fixed both in async-profiler (though the fix does not seem to have been released yet) and in JVM 17.0.9.

I suggest you update the JVM; it may solve the problem.
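
If it helps to confirm which developer machines still need the update, here is a minimal sketch of a runtime check. It assumes, going only by the comment above, that the fix landed in 17.0.9 and it only judges the JDK 17 line; the class and method names (`JvmVersionCheck`, `isAffectedJdk17`) are hypothetical and not part of pyroscope-java or async-profiler.

```java
class JvmVersionCheck {
    // Assumption from this thread: the JVM fix shipped in 17.0.9.
    static final Runtime.Version FIXED_IN = Runtime.Version.parse("17.0.9");

    // True only for JDK 17 builds older than 17.0.9.
    // Other feature releases are not judged here, since the thread does not cover them.
    static boolean isAffectedJdk17(Runtime.Version v) {
        return v.feature() == 17 && v.compareToIgnoreOptional(FIXED_IN) < 0;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        if (isAffectedJdk17(current)) {
            System.err.println("JVM " + current + " is older than " + FIXED_IN
                    + "; consider updating before enabling the profiler agent.");
        } else {
            System.out.println("JVM " + current + " is not a pre-" + FIXED_IN + " JDK 17 build.");
        }
    }
}
```

Running `main` on the affected machines (or calling `isAffectedJdk17(Runtime.version())` from a startup hook) gives a quick yes/no. It does not check the CPU model, so it will also flag machines that never hit the crash.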

adrianlyjak commented 6 months ago

Thanks, I'll try it out.

adrianlyjak commented 6 months ago

Awesome, that fixes the issue. Thanks for determining the cause!