Open AlexeyKhrabrov opened 2 years ago
One simple workaround for this issue is to always delay relocation of remote AOT methods until at least the next invocation of the method. Relocation is already handled this way for JITServer AOT cache methods that use SVM (for a different reason), with negligible impact on performance. PR #15148 implements this fix.
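To illustrate the idea of delaying relocation until the next invocation, here is a toy Python model. It is not OpenJ9 code: `Method`, `receive_remote_aot_body`, `relocate`, and `invoke` are hypothetical names invented for this sketch, and the real logic is implemented in PR #15148.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Method:
    """Hypothetical stand-in for a Java method tracked by the JIT."""
    name: str
    pending_aot_body: Optional[str] = None  # received, not yet relocated
    compiled_body: Optional[str] = None     # relocated and installed


def relocate(body: str) -> str:
    """Placeholder for relocation; the real step patches addresses in the code."""
    return f"relocated:{body}"


def receive_remote_aot_body(method: Method, body: str, delay: bool = True) -> None:
    """Handle an AOT body that arrived from the (hypothetical) remote server."""
    if delay:
        # Workaround behaviour: only stash the body, relocate later.
        method.pending_aot_body = body
    else:
        # Immediate relocation, analogous to disabling delayed relocation.
        method.compiled_body = relocate(body)


def invoke(method: Method) -> str:
    """Invoke the method, relocating a pending body first if there is one."""
    if method.pending_aot_body is not None:
        method.compiled_body = relocate(method.pending_aot_body)
        method.pending_aot_body = None
    return "compiled" if method.compiled_body is not None else "interpreted"
```

In this model, a body received with `delay=True` is not installed until `invoke` is called again, which mirrors the "at least the next invocation" wording above.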
@mpirvu FYI
After more testing, it turns out that the issue is reproducible without disableDelayRelocation as well. The segfault also happens with the Spring PetClinic benchmark, although even less frequently than with AcmeAir. For PetClinic, the affected methods (top of the Java call stack of crashing threads in the javacore) also include sun/nio/ch/Util.getTemporaryDirectBuffer and sun/nio/ch/IOUtil.readIntoNativeBuffer, which are also ByteBuffer-related.
I haven't been able to reproduce the issue with the 0.32.0 release. It has probably been introduced since then.
Using JITServer with the -Xjit:disableDelayRelocationForAOTCompilations option can rarely (less than 1 in 100 runs) lead to segfaults in JIT-compiled code. The issue is reproducible with AcmeAir and (less frequently) DayTrader7. Affected methods (extracted from stack traces in javacore files) include java/nio/DirectByteBuffer.put([BII)Ljava/nio/ByteBuffer; and com/ibm/ws/bytebuffer/internal/WsByteBufferImpl.copyToDirectBuffer()V (which seems to inline the first one); there might be others. The segfault happens more frequently without AOT cache.
-Xjit:disableDelayRelocationForAOTCompilations is an undocumented option that is not commonly used and not covered by the tests. It is enabled by default with JITServer AOT cache because it results in better ramp-up performance.
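Since the crash shows up in fewer than 1 in 100 runs, confirming it (or a fix) takes many repeated runs. A minimal sketch of such a loop follows. It assumes a command that performs one complete benchmark run and exits with a non-zero status when the JVM crashes; that command and the `count_failed_runs` helper are hypothetical and not part of the original report.

```python
import subprocess
from typing import Sequence


def count_failed_runs(cmd: Sequence[str], runs: int) -> int:
    """Run `cmd` `runs` times and return how many runs exited non-zero.

    Assumption: a crashed run is visible as a non-zero (or signal) exit
    status of `cmd`. Output is discarded; a real harness would likely
    also collect the javacore files.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Example (hypothetical script name):
# failures = count_failed_runs(["./run_benchmark_once.sh"], runs=500)
# print(f"{failures} of 500 runs failed")
```

With a failure rate this low, several hundred runs are typically needed before a clean result says much about whether the problem is gone.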