eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Clean-up compressed refs generation in the JIT IL Generator #4453

Open andrewcraik opened 5 years ago

andrewcraik commented 5 years ago

Currently the JIT IL generator generates compressedRefs anchors when running a 64-bit JVM in a configuration that uses compressed references, i.e. one that stores heap pointers in heap objects as 32 bits rather than 64 bits. The anchor nodes were originally needed during IL generation and optimization to support non-zero heap bases, e.g. having to add a constant offset to heap pointers. Support for this mode was removed from most of the system some time ago, and the compressedRefs anchors are the last holdovers from it. These anchors serve no useful purpose: during code generation we delete all of them and check every aloadi and astorei operation to see whether it needs decompression or compression, respectively. This item tracks removing the generation of compressedRefs nodes from the IL generator, since they are not needed and simply waste compile time and memory.
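For readers unfamiliar with compressed references, here is a minimal sketch of what compression and decompression of a heap pointer could look like. It is not OpenJ9 code: the struct, the function names, and the use of a shift are assumptions made for illustration. The non-zero heapBase case models the "constant offset" mode that the anchors were originally meant to support.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical configuration; none of these names exist in OpenJ9.
struct CompressedRefsConfig {
    uint64_t heapBase;  // 0 models the zero-based mode that is still supported
    unsigned shift;     // assumed alignment shift, e.g. 3 for 8-byte alignment
};

// Compress a 64-bit heap address into the 32-bit value stored in an object.
inline uint32_t compressRef(uint64_t address, const CompressedRefsConfig &cfg) {
    if (address == 0)
        return 0;  // null stays null
    return static_cast<uint32_t>((address - cfg.heapBase) >> cfg.shift);
}

// Decompress a 32-bit slot value back into a 64-bit heap address.
inline uint64_t decompressRef(uint32_t slot, const CompressedRefsConfig &cfg) {
    if (slot == 0)
        return 0;  // null stays null
    return (static_cast<uint64_t>(slot) << cfg.shift) + cfg.heapBase;
}
```

With heapBase equal to 0 the add and subtract disappear, which is why a load or store can be compressed or decompressed on its own without extra IL to carry the offset.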

andrewcraik commented 5 years ago

The first part of this item will be to look in the ilgen directory of OpenJ9 and identify the code that generates compressedRefs anchors, primarily Walker.cpp. The anchor generation will need to be removed and, where necessary, treetop anchors added to preserve the order of evaluation. Once the generation has been cleaned up we will need to run functional and performance testing to ensure there are no surprises from the removal. Perf testing could start by running something as simple as DaCapo. Once the initial tests are done, IBM can also help perf test the change before it is merged to help ensure the performance characteristics are as desired.
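To illustrate why a treetop anchor can be needed to preserve evaluation order, here is a toy model. It is not the OMR IL: the Op enum, Node struct, and runTrees function are hypothetical. It assumes only that a node is evaluated once, at its first reference in tree order, and that later references reuse that value.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy IL, for illustration only; these are not OMR's TR::Node or opcodes.
enum class Op { Const, Load, Store, TreeTop };

struct Node {
    Op op = Op::Const;
    int constant = 0;
    std::shared_ptr<Node> child;
    bool evaluated = false;  // a node is evaluated once, at its first reference
    int value = 0;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr makeNode(Op op, int constant = 0, NodePtr child = nullptr) {
    auto n = std::make_shared<Node>();
    n->op = op;
    n->constant = constant;
    n->child = std::move(child);
    return n;
}

// Evaluate a node against a single memory cell standing in for a field.
int evaluate(const NodePtr &n, int &field) {
    if (n->evaluated)
        return n->value;  // later references reuse the first result
    int v = 0;
    switch (n->op) {
        case Op::Const:   v = n->constant; break;
        case Op::Load:    v = field; break;
        case Op::Store:   v = evaluate(n->child, field); field = v; break;
        case Op::TreeTop: v = evaluate(n->child, field); break;
    }
    n->evaluated = true;
    n->value = v;
    return v;
}

// The field starts at 1, is overwritten with 2, and the load is consumed afterwards.
// Returns the value the final consumer sees for the load.
int runTrees(bool anchorLoad) {
    int field = 1;
    NodePtr load = makeNode(Op::Load);
    std::vector<NodePtr> trees;
    if (anchorLoad)
        trees.push_back(makeNode(Op::TreeTop, 0, load));  // pins the load before the store
    trees.push_back(makeNode(Op::Store, 0, makeNode(Op::Const, 2)));
    trees.push_back(makeNode(Op::TreeTop, 0, load));      // later consumer of the load
    int last = 0;
    for (const NodePtr &t : trees)
        last = evaluate(t, field);
    return last;
}
```

In this model runTrees(true) sees the value from before the store, while runTrees(false) sees the stored value. That is the kind of reordering a removed anchor could introduce if nothing else pins the load.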

andrewcraik commented 5 years ago

@vesuppi you mentioned on the OpenJ9 slack that you were interested in this one so I'm mentioning you here

tongzhou80 commented 5 years ago

Hi Andrew, thanks for the mention. Yes, I am looking at it.

andrewcraik commented 5 years ago

FYI @liqunl and @cathyzhyi

tongzhou80 commented 5 years ago

Hi,

Sorry, I am still struggling with this issue. My plan is to make a copy of the genCompressedRefs method, change the copy so that it generates treetop nodes instead of compressedRefs nodes, and then replace every call to genCompressedRefs with a call to my copy. But I've been encountering segfaults with my version of genCompressedRefs. I suspect I have misunderstood the semantics of this method and missed something.

So the original method is:

TR::Node *
TR_J9ByteCodeIlGenerator::genCompressedRefs(TR::Node * address, bool genTT, int32_t isLoad)
   {
   static char *pEnv = feGetEnv("TR_UseTranslateInTrees");

   TR::Node *value = address;
   if (pEnv && (isLoad < 0)) // store                                                                                                                         
      value = address->getSecondChild();
   TR::Node *newAddress = TR::Node::createCompressedRefsAnchor(value);
   //traceMsg(comp(), "compressedRefs anchor %p generated\n", newAddress);                                                                                    

   if (trace())
      traceMsg(comp(), "IlGenerator: Generating compressedRefs anchor [%p] for node [%p]\n", newAddress, address);

   if (!pEnv && genTT)
      {
      genTreeTop(newAddress);
      return NULL;
      }
   else
      {
      return newAddress;
      }
   }

My version is:

// tz+                                                                                                                                                        
// Goal: To resolve https://github.com/eclipse/openj9/issues/4453                                                                                             
TR::Node *
TR_J9ByteCodeIlGenerator::genCompressedRefs1(TR::Node * address, bool genTT, int32_t isLoad)
   {
   static char *pEnv = feGetEnv("TR_UseTranslateInTrees");

   TR::Node *value = address;
   if (pEnv && (isLoad < 0)) // store                                                                                                                         
      value = address->getSecondChild();

   /* not gen compressedRefs in this version */
   // TR::Node *newAddress = TR::Node::createCompressedRefsAnchor(value);                                                                                     
   TR::Node *newAddress = TR::Node::create(TR::treetop, 1, value);

   if (trace())
      traceMsg(comp(), "IlGenerator: Generating treetop anchor [%p] for node [%p]\n", newAddress, address);

   //traceMsg(comp(), "compressedRefs anchor %p generated\n", newAddress);                                                                                    

   if (!pEnv && genTT)
      {
      genTreeTop(newAddress);
      return NULL;
      }
   else
      {
      return newAddress;
      }
   }

Did I miss anything? Thanks! @liqunl @cathyzhyi

cathyzhyi commented 5 years ago

@vesuppi I think the change looks ok. Could you paste the crashing backtrace here? Is the failure easy to reproduce?

tongzhou80 commented 5 years ago

@cathyzhyi

Hi Yi, Thanks a lot for the help! Yes, here's the crash when running pmd:

tong@titanxp-system ~/soot-dacapo> /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/bin/java -Xjit:log=logfile.dmp,traceEscapeAnalysis -jar dacapo-9.12-MR1-bach.jar pmd
===== DaCapo 9.12-MR1 pmd starting =====
#0: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x6c8096) [0x7f0e8ae51096]
#1: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x6d54ed) [0x7f0e8ae5e4ed]
#2: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x1248b9) [0x7f0e8a8ad8b9]
#3: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so(+0x1fd2d) [0x7f0e90609d2d]
#4: /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) [0x7f0e9239e390]
#5: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x29129a) [0x7f0e8aa1a29a]
#6: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x292012) [0x7f0e8aa1b012]
#7: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x299980) [0x7f0e8aa22980]
#8: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x2aadb5) [0x7f0e8aa33db5]
#9: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x2ad175) [0x7f0e8aa36175]
#10: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x5801b5) [0x7f0e8ad091b5]
#11: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x57f901) [0x7f0e8ad08901]
#12: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x57f901) [0x7f0e8ad08901]
#13: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x57f901) [0x7f0e8ad08901]
#14: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x5812c2) [0x7f0e8ad0a2c2]
#15: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x3dbaff) [0x7f0e8ab64aff]
#16: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x131d8e) [0x7f0e8a8bad8e]
#17: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x132a90) [0x7f0e8a8bba90]
#18: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so(+0x20e27) [0x7f0e9060ae27]
#19: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x1346d1) [0x7f0e8a8bd6d1]
#20: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x134b61) [0x7f0e8a8bdb61]
#21: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x135210) [0x7f0e8a8be210]
#22: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x1354ea) [0x7f0e8a8be4ea]
#23: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x13559f) [0x7f0e8a8be59f]
#24: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so(+0x20e27) [0x7f0e9060ae27]
#25: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so(+0x1358e6) [0x7f0e8a8be8e6]
#26: /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9thr29.so(+0xdf53) [0x7f0e90f73f53]
#27: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f0e923946ba]
#28: function clone+0x6d [0x7f0e928b541d]
Unhandled exception
Type=Segmentation error vmState=0x000518ff
J9Generic_Signal_Number=00000004 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007F0E9121DA00 Handler2=00007F0E90609AC0 InaccessibleAddress=0000000000000000
RDI=00007F0E74F30620 RSI=0000000000000000 RAX=0000000000000000 RBX=0000000000000000
RCX=0000000000000000 RDX=0000000000000020 R8=00000000120E0003 R9=00007F0E8C27F9A0
...

Method_being_compiled=java/lang/ClassLoader.getClassLoadingLock(Ljava/lang/String;)Ljava/lang/Object;
Target=2_90_20190320_000000 (Linux 4.15.0-39-generic)
CPU=amd64 (4 logical CPUs) (0x3d8132000 RAM)
----------- Stack Backtrace -----------
(0x00007F0E8AA1A29A [libj9jit29.so+0x29129a])
(0x00007F0E8AA1B012 [libj9jit29.so+0x292012])
(0x00007F0E8AA22980 [libj9jit29.so+0x299980])
(0x00007F0E8AA33DB5 [libj9jit29.so+0x2aadb5])
(0x00007F0E8AA36175 [libj9jit29.so+0x2ad175])
(0x00007F0E8AD091B5 [libj9jit29.so+0x5801b5])
(0x00007F0E8AD08901 [libj9jit29.so+0x57f901])
(0x00007F0E8AD08901 [libj9jit29.so+0x57f901])
(0x00007F0E8AD08901 [libj9jit29.so+0x57f901])
(0x00007F0E8AD0A2C2 [libj9jit29.so+0x5812c2])
(0x00007F0E8AB64AFF [libj9jit29.so+0x3dbaff])
(0x00007F0E8A8BAD8E [libj9jit29.so+0x131d8e])
(0x00007F0E8A8BBA90 [libj9jit29.so+0x132a90])
(0x00007F0E9060AE27 [libj9prt29.so+0x20e27])
(0x00007F0E8A8BD6D1 [libj9jit29.so+0x1346d1])
(0x00007F0E8A8BDB61 [libj9jit29.so+0x134b61])
(0x00007F0E8A8BE210 [libj9jit29.so+0x135210])
(0x00007F0E8A8BE4EA [libj9jit29.so+0x1354ea])
(0x00007F0E8A8BE59F [libj9jit29.so+0x13559f])
(0x00007F0E9060AE27 [libj9prt29.so+0x20e27])
(0x00007F0E8A8BE8E6 [libj9jit29.so+0x1358e6])
(0x00007F0E90F73F53 [libj9thr29.so+0xdf53])
(0x00007F0E923946BA [libpthread.so.0+0x76ba])
clone+0x6d (0x00007F0E928B541D [libc.so.6+0x10741d])

So I am not sure why my build doesn't have the debug symbol info. I followed the standard build instructions on the openj9 website. What I pasted is probably not helpful. But that's a separate issue.

Also, the crash is intermittent: it happens in roughly one of every 3 runs. My JIT command was -Xjit:log=logfile.dmp,traceEscapeAnalysis. Not specifying anything in -Xjit seems to be fine, but I am not positive, since the crash is intermittent.

The reason I turned on traceEscapeAnalysis is that I wanted to dump the IR and check whether the compressedRefs are really removed, and I know that traceEscapeAnalysis can dump the method trees during JIT compilation. I am not sure which JIT stage I should check, but traceEscapeAnalysis seems to work for me.

tongzhou80 commented 5 years ago

@cathyzhyi

Here's a real backtrace that I got with gdb:

#0  __pthread_kill (threadid=<optimized out>, signo=11) at ../sysdeps/unix/sysv/linux/pthread_kill.c:62
#1  0x00007fa2b503f395 in omrdump_create ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so
#2  0x00007fa2b4af1a7d in doSystemDump ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9dmp29.so
#3  0x00007fa2b4aed445 in protectedDumpFunction ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9dmp29.so
#4  0x00007fa2b5022e27 in omrsig_protect ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so
#5  0x00007fa2b4af0daa in runDumpFunction ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9dmp29.so
#6  0x00007fa2b4af0f17 in runDumpAgent ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9dmp29.so
#7  0x00007fa2b4b0673b in triggerDumpAgents ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9dmp29.so
#8  0x00007fa2b5c355de in generateDiagnosticFiles ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9vm29.so
#9  0x00007fa2b5022e27 in omrsig_protect ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so
#10 0x00007fa2b5c35780 in vmSignalHandler ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9vm29.so
#11 0x00007fa2b5021d2d in masterSynchSignalHandler ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so
#12 <signal handler called>
#13 0x00007fa2af48b29a in TR_EscapeAnalysis::getClassName(TR::Node*) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#14 0x00007fa2af48c012 in TR_EscapeAnalysis::createCandidateIfValid(TR::Node*, TR_OpaqueClassBlock*&, bool) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#15 0x00007fa2af493980 in TR_EscapeAnalysis::findCandidates() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#16 0x00007fa2af4a4db5 in TR_EscapeAnalysis::performAnalysisOnce() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#17 0x00007fa2af4a7175 in TR_EscapeAnalysis::perform() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#18 0x00007fa2af77a1b5 in OMR::Optimizer::performOptimization(OptimizationStrategy const*, int, int, int) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#19 0x00007fa2af779901 in OMR::Optimizer::performOptimization(OptimizationStrategy const*, int, int, int) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#20 0x00007fa2af779901 in OMR::Optimizer::performOptimization(OptimizationStrategy const*, int, int, int) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#21 0x00007fa2af779901 in OMR::Optimizer::performOptimization(OptimizationStrategy const*, int, int, int) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#22 0x00007fa2af77b2c2 in OMR::Optimizer::optimize() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#23 0x00007fa2af5d5aff in OMR::Compilation::compile() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#24 0x00007fa2af32bd8e in TR::CompilationInfoPerThreadBase::compile(J9VMThread*, TR::Compilation*, TR_ResolvedMethod*, TR_J9VMBase&, TR_OptimizationPlan*, TR::SegmentAllocator const&) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#25 0x00007fa2af32ca90 in TR::CompilationInfoPerThreadBase::wrappedCompile(J9PortLibrary*, void*) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#26 0x00007fa2b5022e27 in omrsig_protect ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so
#27 0x00007fa2af32e6d1 in TR::CompilationInfoPerThreadBase::compile(J9VMThread*, TR_MethodToBeCompiled*, J9::J9SegmentProvider&) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#28 0x00007fa2af32eb61 in TR::CompilationInfoPerThread::processEntry(TR_MethodToBeCompiled&, J9::J9SegmentProvider&) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#29 0x00007fa2af32f210 in TR::CompilationInfoPerThread::processEntries() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#30 0x00007fa2af32f4ea in TR::CompilationInfoPerThread::run() ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#31 0x00007fa2af32f59f in protectedCompilationThreadProc(J9PortLibrary*, TR::CompilationInfoPerThread*) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#32 0x00007fa2b5022e27 in omrsig_protect ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9prt29.so
#33 0x00007fa2af32f8e6 in compilationThreadProc(void*) ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9jit29.so
#34 0x00007fa2b598bf53 in thread_wrapper ()
   from /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/lib/amd64/compressedrefs/libj9thr29.so
#35 0x00007fa2b6dac6ba in start_thread (arg=0x7fa2ac782700) at pthread_create.c:333
#36 0x00007fa2b72cd41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Okay I guess it's due to the escape analysis then.

cathyzhyi commented 5 years ago

Yeah, the crash is in EA, but it could be that something was messed up earlier. Could you try adding trace options for the method being compiled at the crash, java/lang/ClassLoader.getClassLoadingLock(Ljava/lang/String;)Ljava/lang/Object;, using -Xjit:{java/lang/ClassLoader.getClassLoadingLock(Ljava/lang/String;)Ljava/lang/Object;}(log=logfile.dmp,traceEscapeAnalysis,tracefull,traceilgen)? tracefull prints the trees after each optimization, and traceilgen prints debugging info about IL generation. Could you collect the tracing log and upload it here?

tongzhou80 commented 5 years ago

@cathyzhyi

Hi Yi,

Thanks for the suggestion. The command wouldn't run; bash reported a syntax error near unexpected token `(':

(base) tong@titanxp-system:~/soot-dacapo$ /home/tong/openj9-dev/build/linux-x86_64-normal-server-release/images/j2re-image/bin/java -Xjit:{java/lang/ClassLoader.getClassLoadingLock(Ljava/lang/String;)Ljava/lang/Object;}(log=logfile.dmp,traceEscapeAnalysis,tracefull,traceilgn) -jar dacapo-9.12-MR1-bach.jar pmd
bash: syntax error near unexpected token `('

cathyzhyi commented 5 years ago

@vesuppi Sorry, I forgot: you need single quotes around the JIT options: '-Xjit:{java/lang/ClassLoader.getClassLoadingLock(Ljava/lang/String;)Ljava/lang/Object;}(log=logfile.dmp,traceEscapeAnalysis,tracefull,traceilgen)'

tongzhou80 commented 5 years ago

@cathyzhyi

Hi Yi,

The traceilgen option is not recognized by the VM. After removing traceilgen I got these files:

Hmm, GitHub says: [image]

I uploaded the files to https://gitlab.com/tongzhou/publicfiles/issues/1. Thanks!

cathyzhyi commented 5 years ago

@vesuppi The last few lines of the EA trace show that tracing stopped at node 7F203A0EE7C0:

4 setting local alloc 00007F203A0EDBE0 to true
Found [00007F203A0EE450] new java/lang/ClassLoader$ClassNameBasedLock
4 setting local alloc 00007F203A0EE450 to true
Found [00007F203A0EE7C0] new java/lang/ClassLoader$ClassNameLockRef 

If you search for "found.*new" in EscapeAnalysis.cpp you will find where the last line was printed: https://github.com/eclipse/openj9/blob/cd60b0571f1ec9968a576a174edae7ef0f824cb9/runtime/compiler/optimizer/EscapeAnalysis.cpp#L1433. Searching back in the tracing log for address 7F203A0EE7C0 shows that the crash seems to be caused by the following node:

n129n     treetop                                                                             [0x7f203a0ee810] bci=[-1,106,1018] rc=0 vc=127 vn=139 li=- udi=- nc=1
n128n       new  jitNewObject[#88  helper Method] [flags 0x400 0x0 ] (highWordZero Unsigned X!=0 allocationCanBeRemoved sharedMemory )  [0x7f203a0ee7c0] bci=[-1,106,1018] rc=3 vc=127 vn=32 li=- udi=- nc=1 flg=0x4004
n127n         loadaddr  java/lang/ClassLoader$ClassNameLockRef[#365  Static] [flags 0x18307 0x0 ]  [0x7f203a0ee770] bci=[-1,106,1018] rc=1 vc=127 vn=19 li=- udi=- nc=0

This seems to be unrelated to your change, because the new node has nothing to do with compressedRefs, which should only be needed when loading or storing addresses to heap memory. Maybe something else is wrong here.

Are you building your work on top of OpenJDK version 11 with OpenJ9 and running DaCapo? I will see if I can reproduce this locally.

tongzhou80 commented 5 years ago

@cathyzhyi

Hi Yi,

Thank you for the detailed analysis. It's really cool that you can pinpoint the problematic location from these giant log files.

I think I am using openjdk8 (https://www.eclipse.org/openj9/oj9_build.html#version-8). DaCapo has several different versions and I am using the default download from http://dacapobench.org/. Thanks!

cathyzhyi commented 5 years ago

@vesuppi Hey, the crash with traceEscapeAnalysis is caused by https://github.com/eclipse/openj9/blob/eadc32af8ec8645afd86f5e166f4f94b9eeab843/runtime/compiler/optimizer/EscapeAnalysis.cpp#L1543-L1547, where the argument of getClassName should be just classNode instead of classNode->getSecondChild(). This is fixed by PR https://github.com/eclipse/openj9/pull/5213. The crash should only happen when you use traceEscapeAnalysis. Can you try running without traceEscapeAnalysis, or apply the fix above in your local repository, since https://github.com/eclipse/openj9/pull/5213 is not merged yet? Let me know if there are still any failures.
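As a rough illustration of this kind of failure, here is a toy sketch. ToyNode and classNameOrUnknown are hypothetical and are not OMR's TR::Node API. The sketch assumes that asking a node for a child it does not have yields a null pointer, which would be consistent with the InaccessibleAddress=0000000000000000 in the crash report above, although the thread does not confirm the exact mechanism.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an IL node; NOT OMR's TR::Node.
struct ToyNode {
    std::string name;
    ToyNode *children[2] = {nullptr, nullptr};

    ToyNode *getFirstChild() const { return children[0]; }
    ToyNode *getSecondChild() const { return children[1]; }  // null when absent
};

// Hypothetical helper: returns the node's name, or a placeholder when the
// node is absent, instead of dereferencing a null pointer.
std::string classNameOrUnknown(const ToyNode *classNode) {
    return classNode ? classNode->name : "<unknown>";
}
```

In this model, a node built like the "new" tree in the trace (one loadaddr child with no children of its own) gives a null second child, so a tracing call that dereferences it unconditionally would fault, while passing the class node itself works.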

tongzhou80 commented 5 years ago

@cathyzhyi

Thanks a lot for locating the error! I'll try changing classNode->getSecondChild() to classNode and see if it fixes everything and let you know. Currently I am having some issues building the project after pulling from the openjdk upstream.

cathyzhyi commented 5 years ago

You might want to check the AdoptOpenJDK nightly build and see which sha of each component it uses. I suspect the build might not work if your versions of openjdk, openj9 and omr are too far out of sync.

tongzhou80 commented 5 years ago

@cathyzhyi

Thanks for the suggestion. I synced up both openj9 and openjdk, but it didn't seem to build with my old docker image. I often saw people discuss AdoptOpenJDK in the openj9 slack, but not sure how to use it together with openj9 source. Its website (https://adoptopenjdk.net/releases.html) seems to provide nightly builds for the entire project, do you know how to use AdoptOpenJDK's binary only for the jdk part and build openj9 and omr from source? Thanks!

cathyzhyi commented 5 years ago

@vesuppi The nightly builds are built from https://github.com/ibmruntimes/openj9-openjdk-jdk8, https://github.com/eclipse/openj9/ and https://github.com/eclipse/omr. You can check which shas a nightly build was built with by looking at its java -version output. Take this nightly build for example: https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u-2019-04-03-17-10/OpenJDK8U-jdk_x64_linux_openj9_2019-04-03-17-10.tar.gz. Its java -version output is the following; note the sha after each component:

OpenJDK Runtime Environment (build 1.8.0_212-201904030331-b01)
Eclipse OpenJ9 VM (build master-8e54c90af, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190403_269 (JIT enabled, AOT enabled)
OpenJ9   - 8e54c90af
OMR      - 4c5291ab
JCL      - 2b68f66d68 based on )

The build is built with:
openj9: https://github.com/eclipse/openj9/commit/8e54c90af
omr: https://github.com/eclipse/omr/commit/4c5291ab
JCL version: https://github.com/ibmruntimes/openj9-openjdk-jdk8/commit/2b68f66d68
Notice that the shas of the commits match the shas in the java -version output. Hopefully the build will succeed if you rebase your local repos onto those shas. If you still have problems, please describe the build failure in more detail on the openj9 slack channel.
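If you want to pull the component shas out of the java -version text programmatically, a small sketch could look like the following. The function name parseComponentShas is hypothetical, and the parser assumes only the "Name - sha" line layout shown in the output above.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: collects "Component - sha" pairs from java -version text.
// A line is accepted only if its second whitespace-separated field is "-".
std::map<std::string, std::string> parseComponentShas(const std::string &versionOutput) {
    std::map<std::string, std::string> shas;
    std::istringstream lines(versionOutput);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name, dash, sha;
        if ((fields >> name >> dash >> sha) && dash == "-")
            shas[name] = sha;  // trailing text such as "based on )" is ignored
    }
    return shas;
}
```

Fed the output above, this would map OpenJ9, OMR and JCL to the shas to check out in the openj9, omr and openj9-openjdk-jdk8 repositories respectively.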

tongzhou80 commented 5 years ago

@cathyzhyi

Great, that worked for me. Thanks a lot! I reset my repos to match OpenJDK8U-jdk_x64_linux_openj9_2019-04-03-17-10.tar.gz, but my previous edits were lost in the process. Let me add them back in, change classNode->getSecondChild() to classNode, and let you know. Thanks!

tongzhou80 commented 5 years ago

@cathyzhyi

It seems like changing classNode->getSecondChild() to classNode has fixed the problem! However, when I tried to check if the IR still contains compressedRefs, I can still find some:

logfile.dmp.2578.56959.20190406.092239.2578:n4881n    compressedRefs                                                                      [0x7f2e6c2f2560] bci=[28,55,1713] rc=0 vc=497 vn=574 li=- udi=- nc=2
logfile.dmp.2578.56959.20190406.092239.2578-n4847n      awrtbari  <unsafe shadow sym>[#606  Shadow] [flags 0x80000607 0x100 ] (sharedMemory )  [0x7f2e6c2f1ac0] bci=[28,55,1713] rc=1 vc=497 vn=80 li=- udi=- nc=3 flg=0x20
logfile.dmp.2578.56959.20190406.092239.2578-n4879n        aladd (X>=0 sharedMemory )                                                      [0x7f2e6c2f24c0] bci=[28,51,1713] rc=1 vc=497 vn=72 li=- udi=- nc=2 flg=0x100
logfile.dmp.2578.56959.20190406.092239.2578-n4844n          ==>aload
--
logfile.dmp.2578.56959.20190406.092239.2578:n4888n    compressedRefs                                                                      [0x7f2e6c2f2790] bci=[28,68,1715] rc=0 vc=497 vn=582 li=- udi=- nc=2
logfile.dmp.2578.56959.20190406.092239.2578-n4854n      awrtbari  <unsafe shadow sym>[#606  Shadow] [flags 0x80000607 0x100 ] (sharedMemory )  [0x7f2e6c2f1cf0] bci=[28,68,1715] rc=1 vc=497 vn=81 li=- udi=- nc=3 flg=0x20
logfile.dmp.2578.56959.20190406.092239.2578-n4886n        aladd (X>=0 sharedMemory )                                                      [0x7f2e6c2f26f0] bci=[28,64,1715] rc=1 vc=497 vn=74 li=- udi=- nc=2 flg=0x100
logfile.dmp.2578.56959.20190406.092239.2578-n4851n          ==>aload
--
logfile.dmp.2578.56959.20190406.092239.2578:n2602n    compressedRefs                                                                      [0x7f2e6c265d10] bci=[19,3,291] rc=0 vc=360 vn=662 li=- udi=- nc=2
logfile.dmp.2578.56959.20190406.092239.2578-n2568n      awrtbari  <unsafe shadow sym>[#651  Shadow] [flags 0x82000607 0x100 ] (sharedMemory )  [0x7f2e6c265270] bci=[19,3,291] rc=1 vc=360 vn=181 li=- udi=- nc=3 flg=0x20
logfile.dmp.2578.56959.20190406.092239.2578-n2569n        aladd (X>=0 sharedMemory )                                                      [0x7f2e6c2652c0] bci=[19,3,291] rc=1 vc=360 vn=193 li=- udi=- nc=2 flg=0x100
logfile.dmp.2578.56959.20190406.092239.2578-n2570n          aload  <temp slot 10>[#648  Auto] [flags 0x7 0x0 ] (X!=0 sharedMemory )       [0x7f2e6c265310] bci=[19,3,291] rc=1 vc=360 vn=176 li=285 udi=267 nc=0 flg=0x4
--
logfile.dmp.2578.56959.20190406.092239.2578:n2602n    compressedRefs                                                                      [0x7f2e6c265d10] bci=[19,3,291] rc=0 vc=360 vn=662 li=- udi=- nc=2
logfile.dmp.2578.56959.20190406.092239.2578-n2568n      awrtbari  <unsafe shadow sym>[#651  Shadow] [flags 0x82000607 0x100 ] (sharedMemory )  [0x7f2e6c265270] bci=[19,3,291] rc=1 vc=360 vn=181 li=- udi=- nc=3 flg=0x20
logfile.dmp.2578.56959.20190406.092239.2578-n2569n        aladd (X>=0 sharedMemory )                                                      [0x7f2e6c2652c0] bci=[19,3,291] rc=1 vc=360 vn=193 li=- udi=- nc=2 flg=0x100
logfile.dmp.2578.56959.20190406.092239.2578-n2570n          aload  <temp slot 10>[#648  Auto] [flags 0x7 0x0 ] (X!=0 sharedMemory )       [0x7f2e6c265310] bci=[19,3,291] rc=1 vc=360 vn=176 li=285 udi=267 nc=0 flg=0x4
--
logfile.dmp.2578.56959.20190406.092239.2578:n2602n    compressedRefs                                                                      [0x7f2e6c265d10] bci=[19,3,291] rc=0 vc=360 vn=662 li=- udi=- nc=2
logfile.dmp.2578.56959.20190406.092239.2578-n2568n      awrtbari  <unsafe shadow sym>[#651  Shadow] [flags 0x82000607 0x100 ] (sharedMemory )  [0x7f2e6c265270] bci=[19,3,291] rc=1 vc=360 vn=181 li=- udi=- nc=3 flg=0x20
logfile.dmp.2578.56959.20190406.092239.2578-n2569n        aladd (X>=0 sharedMemory )                                                      [0x7f2e6c2652c0] bci=[19,3,291] rc=1 vc=360 vn=193 li=- udi=- nc=2 flg=0x100
logfile.dmp.2578.56959.20190406.092239.2578-n2570n          aload  <temp slot 10>[#648  Auto] [flags 0x7 0x0 ] (X!=0 sharedMemory )       [0x7f2e6c265310] bci=[19,3,291] rc=1 vc=360 vn=176 li=285 udi=267 nc=0 flg=0x4 

It seems that only awrtbari is still attached to a compressedRefs node. Is that normal after disabling genCompressedRefs?

liqunl commented 5 years ago

UnsafeFastPath.cpp generates compressedRefs anchors as well. I think you can turn them into treetop nodes.

cathyzhyi commented 5 years ago

And this one as well https://github.com/eclipse/openj9/blob/4fe6371d509f90835bc33f577ab67f17595d4a62/runtime/compiler/optimizer/InlinerTempForJ9.cpp#L456.

tongzhou80 commented 5 years ago

@cathyzhyi @liqunl

Thanks for pointing it out! So basically all uses of genCompressedRefs in all files?

liqunl commented 5 years ago

Yes

tongzhou80 commented 5 years ago

Great thanks!

tongzhou80 commented 5 years ago

@liqunl

Hi Liqun, I just saw your pull request "Do not generate compressedrefs and awrtbari for static field". Since that pull request has been merged, why are some awrtbari nodes still hanging under compressedRefs? Sorry if I misunderstood something.

liqunl commented 5 years ago

That change only stops generating compressedRefs for static fields. Static fields are not compressed in openj9. Instance field loads/stores will still be anchored with compressedRefs in compressedRefs mode.

tongzhou80 commented 5 years ago

Sorry I've been on a short vacation and travelling during the past month and got held up by a number of things. I am now getting back to fixing this issue. Very sorry for the delay!

jdmpapin commented 2 years ago

"during code generation we delete all of these anchors and check all of the aloadi and astorei operations to see if they need decompression / compression respectively"

I thought this was the case, but I found that while we do check all of the loads and stores, we don't first delete the anchors. On my back burner, I've been working on actually deleting them to make the statement true and bring us into the starting state assumed in this issue, but unfortunately this deletion hasn't been going as smoothly as I had hoped. Rather, it has been exposing cases in which the anchor reinsertion is incomplete or broken.