eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.
Other
3.28k stars 721 forks source link

JDK8 [zOS S390] 80 Load_Level_2.abbs.5mins.Mode103.5 - j9mm.141 * ** ASSERTION FAILED ** at ./MarkingScheme.hpp:71: ((false)) & j9mm.107 * ** ASSERTION FAILED ** #13905

Open JasonFengJ9 opened 2 years ago

JasonFengJ9 commented 2 years ago

Failure link

From an internal build job_output.php?id=19746935 (fyrec517):

java version "1.8.0_311"
Java(TM) SE Runtime Environment (build 8.0.7.0 - pmz3180sr7-20211011_01(SR7))
IBM J9 VM (build 2.9, JRE 1.8.0 z/OS s390-31-Bit 20211111_16455 (JIT enabled, AOT enabled)
OpenJ9   - 58668d5
OMR      - 3804658
IBM      - 3c151c1)
JCL - 20210930_01 based on Oracle jdk8u311-b11

Optional info

Failure output (captured from console output)

j> 03:31:14 20211111 03:31:13 Runtime State Reporter IMPORTANT: 3373 tests complete, 262 currently running
j> 03:31:16 Object 00000600 not in heap range [1C500000,3C500000)
j> 03:31:16 11:31:15.992 0x3d753500    j9mm.141    *   ** ASSERTION FAILED ** at ./MarkingScheme.hpp:71: ((false))
j> 03:31:16 11:31:16.024 0x464beb00    j9mm.107    *   ** ASSERTION FAILED ** at ../../../gc_glue_java/MarkingDelegate.hpp:121: ((false && ((UDATA)0x99669966 == clazz->eyecatcher)))
j> 03:31:16 JVMDUMP039I Processing dump event "traceassert", detail "" at 2021/11/11 03:31:15 - please wait.
j> 03:31:16 JVMDUMP039I Processing dump event "traceassert", detail "" at 2021/11/11 03:31:16 - please wait.
j> 03:31:16 JVMDUMP032I JVM requested System dump using 'J9BUILD.JVM.TDUMP.J9BUILD6.D211111.T033115' in response to an event

fyi @dmitripivkine

pshipton commented 2 years ago

6x grinders on failing machine - passed 140x grinders - passed

pshipton commented 2 years ago

This is also the internal issue 140133, there is more history of the failure there.

dmitripivkine commented 2 years ago

There is object at 0x21E528D0 size 0x18 (or 0x14) expected j9class to be at 0x49156100

0x21E528D0 :  49156100 00000000 25e6c810 00000000 [ I.a.....%....... ] <---
0x21E528E0 :  000001b5 000001b7 3d5a6800 0000000a [ ........=Zh..... ]

but there is no valid data (corrupted?):

0x49156100 :  00000000 00000000 00000000 00000000 [ ................ ]
0x49156110 :  00000000 00000000 00000000 00000000 [ ................ ]
0x49156120 :  00000000 00000000 3de21e00 00000000 [ ........=....... ]
0x49156130 :  22c54398 00000001 3de22000 00000000 [ ".C.....=. ..... ]
0x49156140 :  00000001 00000001 00000001 00000001 [ ................ ]
0x49156150 :  3de22f00 00000000 22c54268 22c54638 [ =./.....".Bh".F8 ]
0x49156160 :  49156d00 00000000 3d5ae001 00000000 [ I.m.....=Z...... ]
0x49156170 :  00000004 00000000 00000000 00000018 [ ................ ]
0x49156180 :  00000000 00000000 49156000 4883c540 [ ........I.`.H..@ ]
dmitripivkine commented 2 years ago

There is class at !j9class 0x49156000 sun/nio/ch/IOStatus so it is possible header of the object 0x21E528D0 has extra bit added somehow. But in this case totalInstanceSize should be 0x18 or 0x14 but this class has totalInstanceSize = 0x00000004 At least we can tell the class pointer 0x49156100 is invalid

dmitripivkine commented 2 years ago

GC Check also sees problems in near by addresses for classes:

Checking CLASS HEAP...  <gc check (1): from debugger: CLASS HEAP: Class 49156100: class header invalid>
  <gc check (2): from debugger: CLASS HEAP: Class 49156100: corrupt data exception>
  <gc check (3): from debugger: CLASS HEAP: constant slot 49156000(10) -> fd5418: class pointer not in a class segment>
dmitripivkine commented 2 years ago

I believe VM guys should take a look what is going on with this class(es) Test results are downloaded to /team/Dmitri/13905, core location ./results/home/j9build/tmp/bld_16455_zos_390-31/bld_16455/abbs.5mins/19746935/JLTF-jgrinder-abbs-mode103-20211111-0325/J9BUILD.JVM.TDUMP.J9BUILD6.D211111.T033115

JasonFengJ9 commented 2 years ago

This is also the internal issue 140133, there is more history of the failure there.

@dchopra001 does this latest occurrence seem same as the internal workitem 140133?

dmitripivkine commented 2 years ago

Sorry, I was not clear before - the fact GC Check discovering bogus class 0x49156100 means class heap structure (GC Check is walking) assumes class should be there. But it is not there and can not be due existing class at 0x49156000 So looks like there is some confusion in a class heap.

dchopra001 commented 2 years ago

@dchopra001 does this latest occurrence seem same as the internal workitem 140133?

@JasonFengJ9 I can't say for sure but it does look like it. There are similarities such as a low failure rate, the same assertion being hit, both issues reported failures on 31-bit platforms, and the same test. So I think we can focus on just the external one going forward and mark the internal one as a duplicate for now. (The only thing I'm not sure about is how common this triggered VM assertion is as I'm not familiar with this area of the codebase).

JasonFengJ9 commented 2 years ago

So I think we can focus on just the external one going forward and mark the internal one as a duplicate for now.

The internal workitem 140133 has been closed as dup of this issue.

dmitripivkine commented 2 years ago

There is a class heap corruption exposed in !allclasses output:

RAM classes (ram size loader rom replacement name)
Class addr  size        ClassLoader ROM class addr  Array   Class name

0x49156F00  0x000001D8  0x3CD49B18  0x49130790  0   sun/security/pkcs/PKCS7$1
0x49156D00  0x00000BD0  0x3CD49B18  0x4912C900  0   sun/security/pkcs/PKCS7

0x49156100  0x49156000  0x47F59600  0x00000168  0x3CD49C20  0x19949920  0   com/ibm/jtc/svt/streams/TestParallelStreamOperations$$Lambda$288/0x19949920

0x48E23C00  0x000001A8  0x3CD49C20  0x4730B320  0   com/ibm/jtc/svt/streams/TestParallelStreamOperations$$Lambda$287/0x4730b320

So

dmitripivkine commented 2 years ago

there is class we got link to 0x49156100:

> !j9class 0x49156D00
J9Class at 0x49156d00 {
  Fields for J9Class:
    0x0: UDATA eyecatcher = 0x99669966 (2573637990)
    0x4: struct J9ROMClass * romClass = !j9romclass 0x4912C900
    0x8: struct J9Class ** superclasses = !j9x 0x49156C3C
    0xc: UDATA classDepthAndFlags = 0x000E0001 (917505)
    0x10: U32 classDepthWithFlags = 0x00000000 (0)
    0x14: U32 classFlags = 0x00000000 (0)
    0x18: struct J9ClassLoader * classLoader = !j9classloader 0x3CD49B18
    0x1c: j9object_t classObject = !j9object 0x23DA2010 // java/lang/Class
    0x20: volatile UDATA initializeStatus = 0x00000001 (1)
    0x24: struct J9Method * ramMethods = !j9method 0x49156A7C // sun/security/pkcs/PKCS7.<init>(Ljava/io/InputStream;)V
    0x28: UDATA * ramStatics = !j9x 0x49156C58
    0x2c: struct J9Class * arrayClass = !j9class 0x00000000
    0x30: UDATA totalInstanceSize = 0x0000002C (44)
    0x34: struct J9ITable * lastITable = !j9itable 0x1A0FD0B8
    0x38: UDATA * instanceDescription = !j9x 0x000003FD
    0x3c: UDATA * instanceLeafDescription = !j9x 0x00000001
    0x40: UDATA instanceHotFieldDescription = 0x00000000 (0)
    0x44: UDATA selfReferencingField1 = 0x00000000 (0)
    0x48: UDATA selfReferencingField2 = 0x00000000 (0)
    0x4c: struct J9Method * initializerCache = !j9method 0x00000000
    0x50: UDATA romableAotITable = 0x1A3BECE0 (440134880)
    0x54: UDATA packageID = 0x4912C901 (1225967873)
    0x58: struct J9Module * module = !j9module 0x00000000
    0x5c: struct J9Class * subclassTraversalLink = !j9class 0x49156100 // <FAULT> <--------------
    0x60: struct J9Class * subclassTraversalReverseLink = !j9class 0x49156F00 // sun/security/pkcs/PKCS7$1
    0x64: void ** iTable = !j9x 0x00000000
    0x68: UDATA castClassCache = 0x3D5AE001 (1029365761)
    0x6c: void ** jniIDs = !j9x 0x00000000
    0x70: UDATA lockOffset = 0x00000004 (4)
    0x74: U16 reservedCounter = 0x0000 (0)
    0x76: U16 cancelCounter = 0x0000 (0)
    0x78: UDATA newInstanceCount = 0x00000000 (0)
    0x7c: IDATA backfillOffset = 0x0000002C (44)
    0x80: struct J9Class * replacedClass = !j9class 0x00000000
    0x84: UDATA finalizeLinkOffset = 0x00000000 (0)
    0x88: struct J9Class * nextClassInSegment = !j9class 0x49156100 // <FAULT> <--------------
    0x8c: UDATA * ramConstantPool = !j9x 0x491561E0
    0x90: j9object_t * callSites = !j9x 0x00000000
    0x94: j9object_t * methodTypes = !j9x 0x00000000
    0x98: j9object_t * varHandleMethodTypes = !j9x 0x00000000
    0x9c: struct J9VMCustomSpinOptions * customSpinOption = !j9vmcustomspinoptions 0x00000000
    0xa0: struct J9Method ** staticSplitMethodTable = !j9x 0x00000000
    0xa4: struct J9Method ** specialSplitMethodTable = !j9x 0x00000000
    0xa8: struct J9JITExceptionTable * jitMetaDataList = !j9jitexceptiontable 0x00000000
    0xac: struct J9Class * gcLink = !j9class 0x00000000
    0xb0: struct J9Class * hostClass = !j9class 0x49156D00 // sun/security/pkcs/PKCS7
    0xb4: struct J9FlattenedClassCache * flattenedClassCache = !j9flattenedclasscache 0x00000000
    0xb8: struct J9ClassHotFieldsInfo * hotFieldsInfo = !j9classhotfieldsinfo 0x00000000
}
Class name: sun/security/pkcs/PKCS7
To view static fields, use !j9statics 0x49156D00
To view instance shape, use !j9classshape 0x49156D00

Looks like there was some problem at the time of !j9class 0x49156100 creation

pshipton commented 2 years ago

/job_output.php?id=21789704 [zOS S390] 80 Load_Level_2.abbs.5mins.Mode103 fyrec607

j> 03:36:44 Object 3CFBD200 not in heap range [1C500000,3C500000)
j> 03:36:44 11:36:44.051 0x3f577300    j9mm.141    *   ** ASSERTION FAILED ** at ./MarkingScheme.hpp:71: ((false))
dmitripivkine commented 2 years ago

@tajila FYI

core for /job_output.php?id=21789704 is stored in /team/Dmitri/13905/1/results/home/j9build/tmp/bld_17713_zos_390-31/bld_17713/abbs.5mins/21789704/JLTF-jgrinder-abbs-mode103-20211203-0330/J9BUILD.JVM.TDUMP.J9BUILD5.D211203.T033644

Checking CLASS HEAP...  <gc check (1): from debugger: CLASS HEAP: constant slot 3f7a9d00(3f7a9c48) -> 3cfbd200: object slot appears to contain a J9Class pointer>

Class !j9class 0x3CFBD200 ([C) has been discovered instead of object pointer in Constant Pool of !j9class 0x3f7a9d00 (com/ibm/jtc/svt/lambda/support/Animal):

> !j9class 0x3f7a9d00
J9Class at 0x3f7a9d00 {
  Fields for J9Class:
    0x0: UDATA eyecatcher = 0x99669966 (2573637990)
    0x4: struct J9ROMClass * romClass = !j9romclass 0x3F831520
    0x8: struct J9Class ** superclasses = !j9x 0x3F7A9A30
    0xc: UDATA classDepthAndFlags = 0x001E0001 (1966081)
    0x10: U32 classDepthWithFlags = 0x00000000 (0)
    0x14: U32 classFlags = 0x00000000 (0)
    0x18: struct J9ClassLoader * classLoader = !j9classloader 0x3CD49C20
    0x1c: j9object_t classObject = !j9object 0x1C797F48 // java/lang/Class
    0x20: volatile UDATA initializeStatus = 0x00000001 (1)
    0x24: struct J9Method * ramMethods = !j9method 0x3F7A9C94 // <FAULT>
    0x28: UDATA * ramStatics = !j9x 0x00000000
    0x2c: struct J9Class * arrayClass = !j9class 0x00000000
    0x30: UDATA totalInstanceSize = 0x0000000C (12)
    0x34: struct J9ITable * lastITable = !j9itable 0x1A0EC108
    0x38: UDATA * instanceDescription = !j9x 0x00000005
    0x3c: UDATA * instanceLeafDescription = !j9x 0x00000001
    0x40: UDATA instanceHotFieldDescription = 0x00000000 (0)
    0x44: UDATA selfReferencingField1 = 0x00000000 (0)
    0x48: UDATA selfReferencingField2 = 0x00000000 (0)
    0x4c: struct J9Method * initializerCache = !j9method 0x00000000
    0x50: UDATA romableAotITable = 0x1A3ADCE0 (440065248)
    0x54: UDATA packageID = 0x3F831521 (1065555233)
    0x58: struct J9Module * module = !j9module 0x00000000
    0x5c: struct J9Class * subclassTraversalLink = !j9class 0x3F7AA100 // com/ibm/jtc/svt/lambda/support/Person
    0x60: struct J9Class * subclassTraversalReverseLink = !j9class 0x3F7AA600 // [Lcom/ibm/jtc/svt/lambda/support/Person$Title;
    0x64: void ** iTable = !j9x 0x00000000
    0x68: UDATA castClassCache = 0x3D5A5401 (1029329921)
    0x6c: void ** jniIDs = !j9x 0x00000000
    0x70: UDATA lockOffset = 0x00000004 (4)
    0x74: U16 reservedCounter = 0x0000 (0)
    0x76: U16 cancelCounter = 0x0000 (0)
    0x78: UDATA newInstanceCount = 0x00000000 (0)
    0x7c: IDATA backfillOffset = 0x0000000C (12)
    0x80: struct J9Class * replacedClass = !j9class 0x00000000
    0x84: UDATA finalizeLinkOffset = 0x00000000 (0)
    0x88: struct J9Class * nextClassInSegment = !j9class 0x3F7A9B00 // com/ibm/jtc/svt/lambda/TestLambdaJavaInterfaces
    0x8c: UDATA * ramConstantPool = !j9x 0x3F7A9C40
    0x90: j9object_t * callSites = !j9x 0x00000000
    0x94: j9object_t * methodTypes = !j9x 0x00000000
    0x98: j9object_t * varHandleMethodTypes = !j9x 0x00000000
    0x9c: struct J9VMCustomSpinOptions * customSpinOption = !j9vmcustomspinoptions 0x00000000
    0xa0: struct J9Method ** staticSplitMethodTable = !j9x 0x00000000
    0xa4: struct J9Method ** specialSplitMethodTable = !j9x 0x00000000
    0xa8: struct J9JITExceptionTable * jitMetaDataList = !j9jitexceptiontable 0x00000000
    0xac: struct J9Class * gcLink = !j9class 0x00000000
    0xb0: struct J9Class * hostClass = !j9class 0x3F7A9D00 // com/ibm/jtc/svt/lambda/support/Animal
    0xb4: struct J9FlattenedClassCache * flattenedClassCache = !j9flattenedclasscache 0x00000000
    0xb8: struct J9ClassHotFieldsInfo * hotFieldsInfo = !j9classhotfieldsinfo 0x00000000
}
Class name: com/ibm/jtc/svt/lambda/support/Animal
To view static fields, use !j9statics 0x3F7A9D00
To view instance shape, use !j9classshape 0x3F7A9D00

ramConstantPool = !j9x 0x3F7A9C40, problematic value at 0x3F7A9C48

0x3F7A9C40 :  00000009 00000000 3cfbd200 00000009 [ ........<....... ] <---
0x3F7A9C50 :  00300036 00340037 00340039 00300032 [ .0.6.4.7.4.9.0.2 ]
0x3F7A9C60 :  00300000 00000000 3d634e00 00000000 [ .0......=cN..... ]
0x3F7A9C70 :  1fa12b98 0000000c 80000010 00000000 [ ..+............. ]
0x3F7A9C80 :  3cfbd200 00000010 002d0030 00300031 [ <........-.0.0.1 ]
0x3F7A9C90 :  00340037 00350037 00330032 00350032 [ .4.7.5.7.3.2.5.2 ]

from !dumpromclass 0x3F831520:

CP Shape Description:
  . S v i
  C v v v
  C C
pshipton commented 12 months ago

Similar failures from an internal test. [zOS S390] 80 Load_Level_2.abbs.5mins.Mode121 fyrec511 Diagnostics files are attached to the job. Not reproduced in 110 grinders.

j> 13:50:20 20231030 13:50:20 Runtime State Reporter IMPORTANT: 4294 tests complete, 299 currently running
j> 13:50:31 Pointer: 00000001 has is not object aligned (to 8 bytes) 
j> 13:50:31 Pointer: 00760065 has is not object aligned (to 8 bytes) 
j> 13:50:31 20:50:30.975 0x3dae1700    j9mm.141    *   ** ASSERTION FAILED ** at ./MarkingScheme.hpp:70: ((false))
j> 13:50:31 20:50:30.975 0x3e378e00    j9mm.141    *   ** ASSERTION FAILED ** at ./MarkingScheme.hpp:70: ((false))
j> 13:50:31 JVMDUMP039I Processing dump event "traceassert", detail "" at 2023/10/30 13:50:30 - please wait.
j> 13:50:31 JVMDUMP039I Processing dump event "traceassert", detail "" at 2023/10/30 13:50:30 - please wait.
j> 13:50:31 JVMDUMP032I JVM requested System dump using 'J9BUILD.JVM.TDUMP.J9BUILD9.D231030.T135030' in response to an event
dmitripivkine commented 12 months ago

Right, GC Check discovers number of problems with classes memory corruption:

Checking CLASS HEAP...
  <gc check (1): from debugger: CLASS HEAP: superclass slot 4aac6500(4aaa8ed8) -> 0: class pointer is null>
  <gc check (2): from debugger: CLASS HEAP: Class 4aac3900: corrupt data exception>
  <gc check (3): from debugger: CLASS HEAP: constant slot 4aac3300(4aaa8eb8) -> 760065: corrupt data exception>
  <gc check (4): from debugger: CLASS HEAP: static slot 4aac2b00(4aaa8e98) -> 540065: corrupt data exception>
  <gc check (5): from debugger: CLASS HEAP: Class 4aaa8f00: class header invalid>
  <gc check (6): from debugger: CLASS HEAP: Class 4aaa8f00: corrupt data exception>
dmitripivkine commented 12 months ago

I noticed all observed failures occur for single region object heap configuration, GC policies optthruput or optavgpause. I don't think however this one is GC related.