adoptium / adoptium-support


New JDK Process failure in GregorianCalendar #799

Closed bentatham closed 1 year ago

bentatham commented 1 year ago

Please provide a brief summary of the bug

When running test code on JRuby, we are getting a failure that we had never seen before the latest Java update, 17.0.7+7. It does not happen on every build, but it does happen on most (about 75% of the runs so far).

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (macroAssembler_x86.cpp:864), pid=1273, tid=1274
#  fatal error: DEBUG MESSAGE: duplicated predicate failed which is impossible
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.7+7 (17.0.7+7) (build 17.0.7+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (17.0.7+7, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xafec21]  MacroAssembler::debug64(char*, long, long*)+0x41
#
# Core dump will be written. Default location: Core dumps may be processed with "/dev/null" (or dumping to /opt/atlassian/pipelines/agent/build/athena-server/core.1273)
#
# An error report file with more information is saved as:
# /opt/atlassian/pipelines/agent/build/athena-server/hs_err_pid1273.log
Compiled method (c2)  611002 60802       4       java.util.GregorianCalendar::computeTime (976 bytes)
 total in heap  [0x00007f0de1f88110,0x00007f0de1f89ef8] = 7656
 relocation     [0x00007f0de1f88270,0x00007f0de1f88340] = 208
 main code      [0x00007f0de1f88340,0x00007f0de1f892a0] = 3936
 stub code      [0x00007f0de1f892a0,0x00007f0de1f892f0] = 80
 metadata       [0x00007f0de1f892f0,0x00007f0de1f89398] = 168
 scopes data    [0x00007f0de1f89398,0x00007f0de1f89a60] = 1736
 scopes pcs     [0x00007f0de1f89a60,0x00007f0de1f89db0] = 848
 dependencies   [0x00007f0de1f89db0,0x00007f0de1f89dd8] = 40
 handler table  [0x00007f0de1f89dd8,0x00007f0de1f89e50] = 120
 nul chk table  [0x00007f0de1f89e50,0x00007f0de1f89ef8] = 168
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
#
Aborted

Please provide steps to reproduce where possible

It is unclear exactly where in our test code this fails. The tests use JRuby with Sinatra, WEBrick, and our own Ruby code.
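Since the crash log points at the C2-compiled `java.util.GregorianCalendar::computeTime`, one way to narrow things down outside the JRuby stack might be a plain Java loop that calls that method often enough for the JIT to compile it. The following is only a sketch under that assumption: the class name `ComputeTimeStress`, the iteration count, and the date ranges are made up for illustration, and it is not known to trigger the failure.

```java
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical stress harness; not the reproducer from any bug report.
class ComputeTimeStress {
    // Sets calendar fields and reads the time back. getTimeInMillis()
    // recomputes the time from the fields via GregorianCalendar.computeTime().
    static long millisFor(GregorianCalendar cal, int year, int month, int day) {
        cal.clear();
        cal.set(year, month, day);
        return cal.getTimeInMillis();
    }

    // Calls millisFor() repeatedly with varying dates so the hot path
    // becomes a candidate for JIT compilation.
    static long run(int iterations) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += millisFor(cal, 1970 + (i % 100), i % 12, 1 + (i % 28));
        }
        return checksum;
    }

    public static void main(String[] args) {
        // The iteration count is an arbitrary assumption.
        System.out.println("checksum: " + run(500_000));
    }
}
```

If a JDK build aborts while running this, the hs_err file should name the same compiled method; if it completes, that says little either way, since the original failure was intermittent.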

Expected Results

Our code continues running

Actual Results

The JVM is aborted and our test code fails.

What Java Version are you using?

openjdk version "17.0.7" 2023-04-18 OpenJDK Runtime Environment Temurin-17.0.7+7 (build 17.0.7+7) OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (build 17.0.7+7, mixed mode, sharing)

What is your operating system and platform?

Docker on Bitbucket Pipelines.

How did you install Java?

Our docker image is based on maven:3.9.1-eclipse-temurin-17 which is based on eclipse-temurin:17-jdk.

Did it work before?

Yes. On 17.0.6, it worked reliably.

Did you test with the latest update version?

Yes - that is where it fails.

Did you test with other Java versions?

It worked in Java version: 17.0.6, vendor: Eclipse Adoptium

Relevant log output

I have the hs_err output, but it is too long to paste here.

karianna commented 1 year ago

@bentatham Can you attach the hs_err output?

bentatham commented 1 year ago

hs_err_pid1273.log

karianna commented 1 year ago

@bentatham There's a non-zero chance this is a JRuby interaction. Can you upgrade to JRuby 9.4.2.0 and post here again if you get a crash log?

bentatham commented 1 year ago

Sadly, we're stuck on JRuby 9.2 (because of unrelated issues with Warbler). We might have to roll back to Java 11.

What's frustrating is that this issue only cropped up in JDK 17.0.7; it worked fine with JDK 17.0.6.

karianna commented 1 year ago

I'd say for now stick to 17.0.6 until you can move up versions of JRuby. I assume it's not possible to try an upgrade in a test system?

bentatham commented 1 year ago

Thanks for your help. Sadly, it will be complicated. Our CI images use the Maven base images, which have already pre-selected the JDK, but we can downgrade Maven as well.

Because of the issues with Warbler, we can't build the WAR from our JRuby app, so we can't build, let alone upgrade, test systems either. The JDK failures here occur during the unit tests, so we could skip those too, I suppose, but we would rather not, obviously, and we might run into similar JDK issues once deployed anyhow.

Maybe something will magically fix itself in the next JDK release. I'm tempted to try JDK 19 or 20 and see what happens there too.

tibco-jufernan commented 1 year ago

I've also been running into this "duplicated predicate failed which is impossible" issue with GregorianCalendar.computeTime on both 11.0.19+9 and 17.0.7+8. A minimal reproduction can be found here: https://bugs.openjdk.org/browse/JDK-8307683

jerboaa commented 1 year ago

@tibco-jufernan Thanks. According to the bug, the workaround is to disable compilation of the affected method or to use -XX:-UseLoopPredicate.
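For anyone applying either workaround, a small startup check along these lines could confirm that the option actually reached the JVM. This is only a sketch: the class name `WorkaroundCheck` is made up for illustration, and the `-XX:CompileCommand=exclude,...` prefix is an assumption about how the method exclusion would be spelled on the command line, not something quoted from the bug report.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper; inspects the arguments the running JVM reports.
class WorkaroundCheck {
    // Flag named in the comment above.
    static final String LOOP_PREDICATE_OFF = "-XX:-UseLoopPredicate";
    // Assumed spelling for excluding a method from JIT compilation.
    static final String EXCLUDE_PREFIX = "-XX:CompileCommand=exclude,";

    // Returns true if either workaround appears in the given JVM arguments.
    static boolean workaroundActive(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.equals(LOOP_PREDICATE_OFF)) {
                return true;
            }
            // Accept both the dotted and the slash-separated class spelling.
            if (arg.startsWith(EXCLUDE_PREFIX)
                    && arg.contains("GregorianCalendar")
                    && arg.contains("computeTime")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("workaround active: " + workaroundActive(jvmArgs));
    }
}
```

Running it with `java -XX:-UseLoopPredicate WorkaroundCheck.java` should report the workaround as active; how the flag is passed through JRuby or a Maven test run depends on the setup and is not covered here.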

bentatham commented 1 year ago

I'm glad there's a workaround. We updated to 20.0.1+9 and the problem seems to go away there as well, at least so far.
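Because CI images can inherit their JDK from a base image, a guard like the one below could flag the update levels reported as failing in this thread (17.0.7 and 11.0.19) before the tests start. It is a sketch only: the class name `AffectedBuildCheck` is hypothetical, and the version list reflects just what was reported here, not a complete list of affected builds.

```java
// Hypothetical guard; the version list comes only from reports in this thread.
class AffectedBuildCheck {
    // Returns true for the update levels that commenters reported as failing.
    static boolean reportedInThread(Runtime.Version v) {
        boolean jdk17 = v.feature() == 17 && v.interim() == 0 && v.update() == 7;
        boolean jdk11 = v.feature() == 11 && v.interim() == 0 && v.update() == 19;
        return jdk17 || jdk11;
    }

    public static void main(String[] args) {
        Runtime.Version v = Runtime.version();
        System.out.println(v + " reported as failing in this thread: " + reportedInThread(v));
    }
}
```

A build script could fail fast or add the workaround flag when the check returns true.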

karianna commented 1 year ago

https://bugs.openjdk.org/browse/JDK-8308884 for the backports

gdams commented 1 year ago

The changes that broke this were reverted - closing