adoptium / adoptium-support

For end-user problems reported with our binary distributions
Apache License 2.0

SoftReferences not cleared before OOME #1096

Open devinrsmith opened 5 months ago

devinrsmith commented 5 months ago

Please provide a brief summary of the bug

The JVM throws an OutOfMemoryError without first clearing reclaimable SoftReferences if it is concurrently doing computations that involve JNI critical regions (subject to multiple attempts, per -XX:GCLockerRetryAllocationCount). This seems to break one of the core tenets of SoftReference's JavaDoc:

All soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError
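
To make the scenario concrete, here is a minimal sketch of the shape of the problem. It is not the linked reproducer: the class name, method names, and sizes are hypothetical, and it assumes that `java.util.zip.Deflater` operating on heap arrays enters JNI critical regions in HotSpot. Background threads compress data in a loop while the main thread fills a cache of softly reachable arrays and then makes large allocations.

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.zip.Deflater;

// Hypothetical illustration; not the code from the linked repository.
class SoftRefPressureSketch {

    // Build a cache whose entries are only softly reachable.
    public static List<SoftReference<byte[]>> fillSoftCache(int entries, int entrySize) {
        List<SoftReference<byte[]>> cache = new ArrayList<>(entries);
        for (int i = 0; i < entries; i++) {
            cache.add(new SoftReference<>(new byte[entrySize]));
        }
        return cache;
    }

    // Count entries the collector has not cleared yet.
    public static long countLive(List<SoftReference<byte[]>> cache) {
        return cache.stream().filter(ref -> ref.get() != null).count();
    }

    // Compress a heap array and return the compressed size in bytes.
    // Assumption: the native side of Deflater pins the arrays via a JNI critical region.
    public static int deflateOnce(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] out = new byte[input.length + 64];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(out);
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) {
        // Defaults are deliberately small; pass a larger entry count to add pressure.
        int entries = args.length > 0 ? Integer.parseInt(args[0]) : 16;
        int entrySize = 1 << 20; // 1 MiB per cache entry
        AtomicBoolean running = new AtomicBoolean(true);

        // Workers repeatedly enter and leave the (assumed) critical regions.
        for (int t = 0; t < 4; t++) {
            final long seed = t;
            Thread worker = new Thread(() -> {
                byte[] data = new byte[1 << 16];
                new Random(seed).nextBytes(data);
                while (running.get()) {
                    deflateOnce(data);
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        try {
            List<SoftReference<byte[]>> cache = fillSoftCache(entries, entrySize);
            // Per the SoftReference JavaDoc, the cache should be cleared
            // before any of these allocations fails with an OutOfMemoryError.
            for (int i = 0; i < entries; i++) {
                byte[] big = new byte[4 * entrySize];
                big[0] = 1; // touch the array so the allocation is used
            }
            System.out.println("Completed, live soft entries=" + countLive(cache));
        } finally {
            running.set(false);
        }
    }
}
```

With the defaults this should finish quickly on an ordinary heap. Running it with a small `-Xmx` and a larger entry count is one way to put it under pressure; whether it then fails depends on the collector, heap size, and timing.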

Did you test with the latest update version?

Please provide steps to reproduce where possible

https://github.com/devinrsmith/GCLockerTooOftenAllocating

I've tested across the latest 8, 11, 17, and 21 OpenJDKs. Most of the garbage collectors I've tried exhibit this bug; ZGC and Shenandoah appear not to.

Expected Results

The small reproducer program should complete successfully when SoftReferences are cleared.

seed=0, totalOutputSize=96107180
seed=1, totalOutputSize=96106767
seed=3, totalOutputSize=96107220
seed=2, totalOutputSize=96107158
Completed, count=15834

(specific values may differ).

Actual Results

[1.242s][warning][gc,alloc] main: Retried waiting for GCLocker too often allocating 524290 words
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:71)
        at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:391)
        at io.deephaven.example.GCLockerTooOftenAllocating.main(GCLockerTooOftenAllocating.java:45)

What Java Version are you using?

openjdk 21.0.3 2024-04-16 LTS
OpenJDK Runtime Environment Temurin-21.0.3+9 (build 21.0.3+9-LTS)
OpenJDK 64-Bit Server VM Temurin-21.0.3+9 (build 21.0.3+9-LTS, mixed mode, sharing)

What is your operating system and platform?

Fedora 39 x86_64. Have also been able to reproduce on OS X aarch64.

How did you install Java?

Downloaded tar.gz for Adoptium LTS. Also able to reproduce using Azul builds installed via dnf/rpm.

Did it work before?

No response

Did you test with other Java versions?

No response

Relevant log output

No response

karianna commented 5 months ago

@devinrsmith Are you able to post your message to https://mail.openjdk.org/mailman/listinfo/hotspot-gc-use? I'm not sure this is the correct interpretation, so I would like to get the experts to comment there.

devinrsmith commented 5 months ago

Will do shortly, thanks.

devinrsmith commented 5 months ago

https://mail.openjdk.org/pipermail/hotspot-gc-use/2024-May/002938.html for those interested

devinrsmith commented 4 months ago

@karianna It doesn't seem like I've gotten on the radar of anybody from hotspot-gc-use; is there another mailing list that may be more receptive, or other actions I should take to try and elevate this?

karianna commented 4 months ago

@devinrsmith Sorry for the long delay, hotspot-gc-dev is your next best bet!

devinrsmith commented 4 months ago

No need to apologize; I appreciate your help. https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-June/048283.html for those interested.

devinrsmith commented 4 months ago

It's been noted as a special case of https://bugs.openjdk.org/browse/JDK-8192647.

github-actions[bot] commented 1 month ago

We are marking this issue as stale because it has not been updated for a while. This is just a way to keep the support issues queue manageable. It will be closed soon unless the stale label is removed by a committer, or a new comment is made.