eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

harmony -Xgcpolicy:metronome ASSERTION FAILED ** at RealtimeMarkingScheme.cpp:167: ((false && (_realtimeGC->_workPackets->isAllPacketsEmpty()))) #20017

Open pshipton opened 4 weeks ago

pshipton commented 4 weeks ago

Internal build [AIX] 80 Load_Level_2.harmony.5mins.Mode301 - -Xgcpolicy:metronome -Xnocompressedrefs Note 32-bit aix72p9-10

50x grinder - passed

j> 13:53:15 20240819 13:53:14 Runtime State Reporter IMPORTANT: 251178 tests complete, 446 currently running
j> 13:53:22 17:53:21.920 0x33d1f600    j9mm.107    *   ** ASSERTION FAILED ** at RealtimeMarkingScheme.cpp:167: ((false && (_realtimeGC->_workPackets->isAllPacketsEmpty())))

pshipton commented 4 weeks ago

@dmitripivkine fyi

dmitripivkine commented 4 weeks ago

Unfortunately there are no stored results, so there is nothing to investigate yet.

pshipton commented 4 weeks ago

Ya, it's weird. The job reports the following, but there is nothing there:

Attempting to upload compressed file. Upload appears to have been successful.

You could try more grinder runs, including on the same machine where it failed.

dmitripivkine commented 4 weeks ago

Grinders all passed: 10 jobs on the same machine and 130 jobs across all machines.

dmitripivkine commented 3 weeks ago

Running more grinders. However, I do not expect this problem to reproduce. A regression is very unlikely, since this code has not been touched for years. Memory corruption or a thread synchronization problem on the machine is possible.

dmitripivkine commented 3 weeks ago

Another set of grinders (100 jobs across all machines and 100 jobs on the same machine) has passed. It looks hard to reproduce, so we are going to wait until it fails again in testing.

dmitripivkine commented 3 weeks ago

Looking closely at the code, there is no logical way for GC threads to discover an unmarked object and request a work packet for it (which is what triggers the assertion). However, a new object can be added by a non-GC thread calling the Write Barrier. This code is executed under exclusive access (STW), so there is no valid way for the Write Barrier to be called. This scenario can be investigated: is it possible for a mutator thread without VMAccess, or a compilation thread, to call the Write Barrier? Lost thread synchronization and memory corruption remain potential scenarios as well.
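
The scenario above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyCollector`, `markAndQueue`, `writeBarrier`, `drain`, `invariantHolds` and `simulate` are hypothetical names invented for illustration and are not OpenJ9 or OMR code. It shows how a barrier call arriving after marking has finished, while the world is supposed to be stopped, would leave the work packets non-empty.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy model only: none of these types exist in OpenJ9 or OMR.
struct Object { int id; };

struct ToyCollector {
    std::unordered_set<const Object*> marked;   // objects already marked
    std::vector<const Object*> workPackets;     // objects waiting to be scanned
    bool exclusiveAccess = false;               // true while the world is stopped

    // Mark an object; queue it for scanning only if it was not marked before.
    bool markAndQueue(const Object* obj) {
        if (marked.insert(obj).second) {
            workPackets.push_back(obj);
            return true;
        }
        return false;
    }

    // Stand-in for a write barrier run by a non-GC thread:
    // it hands an object to the collector.
    void writeBarrier(const Object* obj) { markAndQueue(obj); }

    // Stand-in for GC threads scanning everything that is queued.
    void drain() { workPackets.clear(); }

    // Modelled invariant: under exclusive access, after marking has
    // completed, no work may be left in the packets.
    bool invariantHolds() const { return !exclusiveAccess || workPackets.empty(); }
};

// Runs one marking cycle. If strayBarrier is true, a barrier fires after
// marking has finished and exclusive access has been taken.
bool simulate(bool strayBarrier) {
    ToyCollector gc;
    Object a{1};
    Object b{2};

    gc.markAndQueue(&a);   // normal marking work
    gc.drain();            // GC threads finish all queued work
    gc.exclusiveAccess = true;

    if (strayBarrier) {
        gc.writeBarrier(&b);   // should be impossible while the world is stopped
    }
    return gc.invariantHolds();
}
```

In this model `simulate(false)` keeps the invariant and `simulate(true)` breaks it, which mirrors the suspicion that a thread ran the barrier when it should not have been able to. It says nothing about which thread that would be in the real VM.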