Open zheng-kai opened 1 year ago
I am able to reproduce the problem on xLinux with the latest Java 8. Thank you for letting us know.
I understand the failure scenario. The only problem with the JVM itself is the missing validation of the value entered for the -Xgc:tlhIncrementSize option. This option is rarely used, mostly for performance experiments.
Optthruput operates with the object heap as one large Space. Gencon, however, has the heap split into two major Spaces, Tenure and Nursery.
The Thread Local Heap (TLH) increment size is taken from the command line (-Xgc:tlhIncrementSize). There is no validation of the entered value, so when a very large number is provided, the first thread that requests an increment takes most of the object heap. Other threads then struggle to allocate objects and call Garbage Collection often.
TLHs are allocated from the whole heap for Optthruput (there is only one Space) but from Nursery only for Gencon. Please note that for Gencon, even if a wrong TLH increment size practically blocks object allocation in Nursery, there is still the possibility to allocate in Tenure. For Optthruput, however, there is no such option.
The Excessive GC condition is set with the -Xgc:excessiveGCratio=80 option.
So the failure scenario is (with some simplifications):
During GC initialization, one of the Java threads reaches the point where it requests a TLH increment. There is no validation of the entered TLH increment size; it is taken blindly. As a result, the requested TLH is much larger than the maximum TLH size and takes almost all the memory in the heap. This means almost none is left for other threads. And because this is Optthruput, there is no other Space in which to try allocation. So other threads struggle to allocate objects and call the Garbage Collector. GC runs, but there is very little memory to be freed. This repeats again and again until the Excessive GC condition triggers OOM instead of the next GC. This condition is reached before initialization of all GC Worker Threads is complete, so the "Failed to startup the Garbage Collector" message is printed.
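The scenario above can be illustrated with a toy model. This is only a sketch, not OpenJ9 code: the class `TlhStarvationDemo`, its methods, and the heap and increment sizes are all hypothetical, and the model ignores GC entirely. It only shows how an unvalidated increment lets the first requester drain a single-space heap.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single-space heap handing out TLH increments (hypothetical, not OpenJ9 code). */
class TlhStarvationDemo {

    /** Bytes each thread obtains when every thread requests one increment in turn. */
    static List<Long> grant(long heapBytes, long incrementBytes, int threads) {
        List<Long> granted = new ArrayList<>();
        long free = heapBytes;
        for (int i = 0; i < threads; i++) {
            // No validation: the request is honoured up to whatever is still free.
            long got = Math.min(incrementBytes, free);
            free -= got;
            granted.add(got);
        }
        return granted;
    }

    /** Number of threads that received nothing and would have to trigger a GC. */
    static long starved(List<Long> granted) {
        return granted.stream().filter(g -> g == 0L).count();
    }

    public static void main(String[] args) {
        long heap = 64L << 20; // 64 MiB, an arbitrary example size
        List<Long> sane = grant(heap, 128L << 10, 4);       // modest 128 KiB increment
        List<Long> absurd = grant(heap, Long.MAX_VALUE, 4); // unvalidated huge increment
        System.out.println(sane + " starved=" + starved(sane));
        System.out.println(absurd + " starved=" + starved(absurd));
    }
}
```

With the modest increment every thread gets its share; with the huge one the first thread takes the whole heap and the remaining threads get nothing. In a Gencon-like layout those threads could still fall back to another Space, which this single-space model deliberately lacks.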
There is nothing wrong except the missing input validation for a rarely used option. Putting it in the Deep Backlog to be fixed in the future.
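For illustration, the missing check could look roughly like the following. This is a sketch under assumptions, not the actual fix: `TlhOptionValidator`, `validateIncrement`, and the choice to cap the value at a maximum TLH size (rather than reject it) are hypothetical, and the 128 KiB limit is an arbitrary example value.

```java
/** Hypothetical bounds check for a user-supplied TLH increment size (not OpenJ9 code). */
class TlhOptionValidator {

    /**
     * Returns the increment to use: the requested value, capped at maxTlhBytes.
     * Non-positive values are rejected outright.
     */
    static long validateIncrement(long requestedBytes, long maxTlhBytes) {
        if (maxTlhBytes <= 0) {
            throw new IllegalArgumentException("maximum TLH size must be positive: " + maxTlhBytes);
        }
        if (requestedBytes <= 0) {
            throw new IllegalArgumentException("tlhIncrementSize must be positive: " + requestedBytes);
        }
        // Cap instead of taking the value blindly.
        return Math.min(requestedBytes, maxTlhBytes);
    }

    public static void main(String[] args) {
        long maxTlh = 128L << 10; // assumed maximum TLH size, for illustration only
        System.out.println(validateIncrement(4096L, maxTlh));          // within bounds, kept
        System.out.println(validateIncrement(Long.MAX_VALUE, maxTlh)); // too large, capped
    }
}
```

Capping keeps the JVM starting with a usable value; rejecting the option with an error message would be an equally reasonable design.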
Thank you for your explanation, which has given me more understanding of GC.
Java -version output
openjdk version "1.8.0_362"
IBM Semeru Runtime Open Edition (build 1.8.0_362-b09)
Eclipse OpenJ9 VM (build openj9-0.36.0, JRE 1.8.0 Windows 11 amd64-64-Bit Compressed References 20230207_599 (JIT enabled, AOT enabled)
OpenJ9 - e68fb241f
OMR - f491bbf6f
JCL - eebde685ec based on jdk8u362-b09)
Summary of problem
I found a problem by accident. When I use the optthruput GC with the following options, the JVM cannot be created.
But if I switch to another GC, it's OK.
I would like to know why optthruput is the only one that fails. Is this information sufficient to locate the problem? Thank you.