Open cinquin opened 8 years ago
As you state, this looks like a JVM bug rather than an Honest-Profiler bug. It would be valuable to report it on the relevant OpenJDK mailing list, and maybe link to the report here. I'm vaguely familiar with the code in this area, but there are far more qualified people in the OpenJDK team.
@nitsanw Yes I think I'll report it to the HotSpot folks. It would be useful to know if the problem is specific to my OpenJDK build or platform (especially since I don't have the problem reduced to a simple test case, and don't have a very convenient way of trying to reproduce it on an officially-supported OS). Has anyone been using Honest Profiler with OpenJDK 9? How is it working for you?
I have not. I think it's worth trying to reproduce the issue without Honest Profiler running, to see if it's just a pure JDK 9 problem.
> reproduce the issue without honest profiler running
The crashes only occur with Honest Profiler. (And the stack trace in my original post shows that the problem is in a method called by AsyncGetCallTrace, which is itself of course called by Honest Profiler.)
My apologies - I was reading this thread too fast ;) +1 to @nitsanw's suggestion of reporting it on the OpenJDK mailing list.
@cinquin if you can make Solaris Studio work on that platform, you should see the same issue, as you would with other AGCT (AsyncGetCallTrace) profilers.
JMC might have the same issue.
@nitsanw Yes I imagine I would see the same issue. But to get this fixed by the OpenJDK people it would probably help if the problem could be reproduced on an officially-supported OS; I'm not set up to do that right now.
Incidentally, it might be a good idea to add a set of "stress tests" to the tests that are already part of the project. The other types of crashes that I've fixed so far only occur when the timing is just right (or rather, just wrong) and I don't remember them being triggered by the test suite.
I have an issue that is possibly not addressable within Honest Profiler itself and that might be more suitable for the HotSpot development mailing list, but I thought this might be a good place to start and see if anyone can provide insights. I've been getting intermittent SIGSEGV crashes running (under FreeBSD) the BSD port of OpenJDK 9, roughly equivalent to early access build b117, and a version of Honest Profiler with the commit in my pull request #108 (which fixes a different sort of crash). The new crashes do not occur with OpenJDK 8.
An example trace is as follows:
The crash occurs in frame 23, which maps to the following line in openjdk9/hotspot/src/cpu/x86/vm/frame_x86.cpp:
There seems to be a check just above that fp is "safe"; perhaps that test is deficient.
This code seems to have changed between OpenJDK 8 and OpenJDK 9. I don't know enough about the way the JVM deals with frames to understand what the code is doing and how it ends with a return value for fp() that is invalid (if I remember correctly the value for return_addr_offset was 1 according to GDB, and was thus unlikely to be the problem).
Does anyone have insights as to what the problem might be, and how to fix it?
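For readers unfamiliar with this kind of check: a frame-pointer "safety" test of the sort mentioned above usually comes down to alignment and bounds tests against the thread's stack. The following is only a minimal C++ sketch under my own assumptions, not the HotSpot code. The names StackBounds and fp_looks_safe are hypothetical, and it assumes an x86-style stack that grows downward, with the saved frame pointer and the return address stored in the two words starting at fp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one thread's stack (not a HotSpot type).
// The stack grows downward: limit is the lowest address, base is one past the highest.
struct StackBounds {
    std::uintptr_t base;
    std::uintptr_t limit;
};

// Returns true if fp could plausibly be dereferenced to read a saved
// frame pointer and a return address without leaving the thread's stack.
inline bool fp_looks_safe(std::uintptr_t sp, std::uintptr_t fp, const StackBounds& s) {
    constexpr std::uintptr_t word = sizeof(void*);
    assert(s.limit < s.base);                        // precondition on the caller's data

    if (sp < s.limit || sp >= s.base) return false;  // sp itself is off-stack
    if (fp % word != 0) return false;                // misaligned frame pointer
    if (fp < sp) return false;                       // fp must not lie below sp
    if (fp >= s.base) return false;                  // fp is past the top of the stack
    if (s.base - fp < 2 * word) return false;        // no room for saved fp + return address
    return true;
}
```

Passing such a test only shows that the reads stay within the stack. It does not show that the memory at fp actually holds a valid frame, so a stale or repurposed register value can pass the check and still yield a bogus sender frame. That is one way a check like this could turn out to be "deficient" when a sample is taken asynchronously.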