eclipse-openj9 / openj9

Eclipse OpenJ9: A Java Virtual Machine for OpenJDK that's optimized for small footprint, fast start-up, and high throughput. Builds on Eclipse OMR (https://github.com/eclipse/omr) and combines with the Extensions for OpenJDK for OpenJ9 repo.

Undeterministic Segfaults in JIT #17419

Open connglli opened 1 year ago

connglli commented 1 year ago

This is a crash, so I didn't put it into PR17404. I wanted to track it separately.

Java version

The same version as in PR17404.

openjdk version "11.0.20-internal" 2023-07-18
OpenJDK Runtime Environment (build 11.0.20-internal+0-adhoc..openj9-openjdk-jdk11)
Eclipse OpenJ9 VM (build master-8aa8676, JRE 11 Linux amd64-64-Bit Compressed References 20230512_000000 (JIT enabled, AOT enabled)
OpenJ9   - 8aa8676
OMR      - 779c51b
JCL      - ee54452 based on jdk-11.0.20+2)

Javac version

javac 11.0.20-internal

Code and summary of the problem

A JIT bug, not deterministic.

See tests and diagnostic files in issue17419.tar.gz.

Also, the test (Test.java) is a bit long and cannot be deterministically reduced. If multiple issues are found, let's open new issues for them. Thanks!

Sample segfaults

Segfaults in Simplifier:

Unhandled exception
Type=Segmentation error vmState=0x000507ff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007FB24EE2FA00 Handler2=00007FB24E62C1F0 InaccessibleAddress=0000000000000038
RDI=00007FB22C7CDF10 RSI=00007FB22C818510 RAX=00007FB22C7CC7E0 RBX=00007FB22C7CC5A0
RCX=00007FB22C818518 RDX=0000000000000000 R8=00007FB22C7CDCE8 R9=00007FB22C7AA5A0
R10=0000000000000000 R11=00007FB22C5F6140 R12=00007FB22C7CC5A0 R13=00007FB22C7C8B80
R14=0000000000000004 R15=0000000000000000
RIP=00007FB24CF48CAE GS=0000 FS=0000 RSP=00007FB22C5C85E0
EFlags=0000000000010202 CS=0033 RBP=00007FB22C7CDC10 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000038
xmm0 00007fb22c7cef80 (f: 746385280.000000, d: 6.936841e-310)
xmm1 00007fb22c7cf360 (f: 746386304.000000, d: 6.936841e-310)
xmm2 00007fb22c7cf360 (f: 746386304.000000, d: 6.936841e-310)
xmm3 00007fb22c7cef80 (f: 746385280.000000, d: 6.936841e-310)
xmm4 00007fb22c7cdc10 (f: 746380288.000000, d: 6.936841e-310)
xmm5 00007fb22c7ccf50 (f: 746377024.000000, d: 6.936841e-310)
xmm6 00007fb22c7ccc10 (f: 746376192.000000, d: 6.936841e-310)
xmm7 00007fb22c7cc7e0 (f: 746375168.000000, d: 6.936841e-310)
xmm8 6f6e67695f525400 (f: 1599230976.000000, d: 5.762047e+228)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/home/simon/JVMs/openj9/openj9-openjdk-jdk11/build/linux-x86_64-normal-server-release/images/jdk/lib/default/libj9jit29.so
Module_base_address=00007FB24C735000

Method_being_compiled=Test.vMeth(I)V
Target=2_90_20230512_000000 (Linux 5.15.0-71-generic)
CPU=amd64 (8 logical CPUs) (0x7cda7a000 RAM)
----------- Stack Backtrace -----------
_ZN18TR_RegionStructure23cleanupAfterEdgeRemovalEPN2TR7CFGNodeE+0xae (0x00007FB24CF48CAE [libj9jit29.so+0x813cae])
_ZN18TR_RegionStructure20removeExternalEdgeToEP12TR_Structurei+0x202 (0x00007FB24CF49752 [libj9jit29.so+0x814752])
_ZN18TR_RegionStructure10removeEdgeEP12TR_StructureS1_+0xc3 (0x00007FB24CF493A3 [libj9jit29.so+0x8143a3])
_ZN3OMR3CFG10removeEdgeEPN2TR7CFGEdgeE+0x86c (0x00007FB24CCE7EEC [libj9jit29.so+0x5b2eec])
_ZN3OMR3CFG23removeUnreachableBlocksEv+0x2d4 (0x00007FB24CCE75C4 [libj9jit29.so+0x5b25c4])
_ZN3OMR10Simplifier7performEv+0xfc (0x00007FB24CEEFDCC [libj9jit29.so+0x7badcc])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0x7b7 (0x00007FB24CE9FE27 [libj9jit29.so+0x76ae27])
_ZN3OMR9Optimizer8optimizeEv+0x1db (0x00007FB24CEA16FB [libj9jit29.so+0x76c6fb])
_ZN3OMR11Compilation7compileEv+0x925 (0x00007FB24CC958D5 [libj9jit29.so+0x5608d5])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4dd (0x00007FB24C88046D [libj9jit29.so+0x14b46d])
_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x314 (0x00007FB24C8814D4 [libj9jit29.so+0x14c4d4])
omrsig_protect+0x1e3 (0x00007FB24E62CF53 [libj9prt29.so+0x29f53])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x309 (0x00007FB24C87EBE9 [libj9jit29.so+0x149be9])
_ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x1c0 (0x00007FB24C87F230 [libj9jit29.so+0x14a230])
_ZN2TR24CompilationInfoPerThread14processEntriesEv+0x3b3 (0x00007FB24C87DD33 [libj9jit29.so+0x148d33])
_ZN2TR24CompilationInfoPerThread3runEv+0x42 (0x00007FB24C87E232 [libj9jit29.so+0x149232])
_Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x00007FB24C87E2E2 [libj9jit29.so+0x1492e2])
omrsig_protect+0x1e3 (0x00007FB24E62CF53 [libj9prt29.so+0x29f53])
_Z21compilationThreadProcPv+0x1cf (0x00007FB24C87E71F [libj9jit29.so+0x14971f])
thread_wrapper+0x162 (0x00007FB24EBDE322 [libj9thr29.so+0xe322])
start_thread+0xd9 (0x00007FB24F907609 [libpthread.so.0+0x8609])
clone+0x43 (0x00007FB24FA63133 [libc.so.6+0x11f133])
---------------------------------------

Segfaults in Simplifier (different from the last one):

Unhandled exception
Type=Segmentation error vmState=0x000507ff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007F0AD5F69A00 Handler2=00007F0AD57661F0 InaccessibleAddress=0000000000000038
RDI=00007F0AAD36F358 RSI=00007F0AAD36F1E0 RAX=00007F0AAD371178 RBX=00007F0AAD36EEA0
RCX=0000000000000008 RDX=00007F0AAD371198 R8=00007F0AAD36E5C8 R9=00007F0AAD35C9C0
R10=00000000FFFFFFFB R11=00007F0AAD1FC140 R12=00007F0AAD36CE80 R13=0000000000000000
R14=00007F0AAD371150 R15=00007F0AAD1FA030
RIP=00007F0ACF98CE03 GS=0000 FS=0000 RSP=00007F0AB46DD550
EFlags=0000000000010246 CS=0033 RBP=00007F0AAD36E990 ERR=0000000000000006
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000038
xmm0 00007f0aad36d420 (f: 2906051584.000000, d: 6.901298e-310)
xmm1 00007f0aad36d420 (f: 2906051584.000000, d: 6.901298e-310)
xmm2 00007f0aad36e4a0 (f: 2906055936.000000, d: 6.901298e-310)
xmm3 00007f0aad36d830 (f: 2906052608.000000, d: 6.901298e-310)
xmm4 00007f0aad36e4a0 (f: 2906055936.000000, d: 6.901298e-310)
xmm5 00007f0aad36d830 (f: 2906052608.000000, d: 6.901298e-310)
xmm6 00007f0aad36d4f0 (f: 2906051840.000000, d: 6.901298e-310)
xmm7 00007f0aad36d280 (f: 2906051072.000000, d: 6.901298e-310)
xmm8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 6f6f1065c2f2b1cf (f: 3270685184.000000, d: 5.887146e+228)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/home/simon/JVMs/openj9/jdk11.8aa8676/lib/default/libj9jit29.so
Module_base_address=00007F0ACF17F000

Method_being_compiled=Test.vMeth(I)V
Target=2_90_20230512_000000 (Linux 5.15.0-71-generic)
CPU=amd64 (8 logical CPUs) (0x7cda7a000 RAM)
----------- Stack Backtrace -----------
_ZN18TR_RegionStructure11replacePartEP12TR_StructureS1_+0xe3 (0x00007F0ACF98CE03 [libj9jit29.so+0x80de03])
_ZN18TR_RegionStructure23cleanupAfterNodeRemovalEv+0x98 (0x00007F0ACF98E418 [libj9jit29.so+0x80f418])
_ZN18TR_RegionStructure23cleanupAfterEdgeRemovalEPN2TR7CFGNodeE+0x141 (0x00007F0ACF992D41 [libj9jit29.so+0x813d41])
_ZN18TR_RegionStructure20removeExternalEdgeToEP12TR_Structurei+0x202 (0x00007F0ACF993752 [libj9jit29.so+0x814752])
_ZN18TR_RegionStructure10removeEdgeEP12TR_StructureS1_+0xc3 (0x00007F0ACF9933A3 [libj9jit29.so+0x8143a3])
_ZN3OMR3CFG10removeEdgeEPN2TR7CFGEdgeE+0x86c (0x00007F0ACF731EEC [libj9jit29.so+0x5b2eec])
_ZN3OMR3CFG23removeUnreachableBlocksEv+0x2d4 (0x00007F0ACF7315C4 [libj9jit29.so+0x5b25c4])
_ZN3OMR10Simplifier7performEv+0xfc (0x00007F0ACF939DCC [libj9jit29.so+0x7badcc])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0x7b7 (0x00007F0ACF8E9E27 [libj9jit29.so+0x76ae27])
_ZN3OMR9Optimizer8optimizeEv+0x1db (0x00007F0ACF8EB6FB [libj9jit29.so+0x76c6fb])
_ZN3OMR11Compilation7compileEv+0x925 (0x00007F0ACF6DF8D5 [libj9jit29.so+0x5608d5])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4dd (0x00007F0ACF2CA46D [libj9jit29.so+0x14b46d])
_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x314 (0x00007F0ACF2CB4D4 [libj9jit29.so+0x14c4d4])
omrsig_protect+0x1e3 (0x00007F0AD5766F53 [libj9prt29.so+0x29f53])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x309 (0x00007F0ACF2C8BE9 [libj9jit29.so+0x149be9])
_ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x1c0 (0x00007F0ACF2C9230 [libj9jit29.so+0x14a230])
_ZN2TR24CompilationInfoPerThread14processEntriesEv+0x3b3 (0x00007F0ACF2C7D33 [libj9jit29.so+0x148d33])
_ZN2TR24CompilationInfoPerThread3runEv+0x42 (0x00007F0ACF2C8232 [libj9jit29.so+0x149232])
_Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x00007F0ACF2C82E2 [libj9jit29.so+0x1492e2])
omrsig_protect+0x1e3 (0x00007F0AD5766F53 [libj9prt29.so+0x29f53])
_Z21compilationThreadProcPv+0x1cf (0x00007F0ACF2C871F [libj9jit29.so+0x14971f])
thread_wrapper+0x162 (0x00007F0AD5D18322 [libj9thr29.so+0xe322])
start_thread+0xd9 (0x00007F0AD6A41609 [libpthread.so.0+0x8609])
clone+0x43 (0x00007F0AD6B9D133 [libc.so.6+0x11f133])
---------------------------------------

Segfaults in Liveness Analysis:

Unhandled exception
Type=Segmentation error vmState=0x000522ff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007FB83DE67A00 Handler2=00007FB83D6641F0 InaccessibleAddress=0000000000000049
RDI=00007FB81C5095A0 RSI=0000000000000000 RAX=00007FB817192B70 RBX=00007FB81C5095A0
RCX=00007FB81C5085A0 RDX=0000000000000000 R8=0800000000000000 R9=0000000000000004
R10=0000000000000040 R11=0800000000000000 R12=00007FB81C5083F0 R13=00007FB81C508220
R14=00007FB817194880 R15=00007FB817192930
RIP=00007FB837745CE4 GS=0000 FS=0000 RSP=00007FB81C508190
EFlags=0000000000010246 CS=0033 RBP=0000000000000000 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000049
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm4 00007fb8171933b0 (f: 387527616.000000, d: 6.938097e-310)
xmm5 00007fb817193070 (f: 387526784.000000, d: 6.938097e-310)
xmm6 00007fb817192ed0 (f: 387526336.000000, d: 6.938097e-310)
xmm7 00007fb817192b70 (f: 387525504.000000, d: 6.938097e-310)
xmm8 00007fb8171aabb0 (f: 387623872.000000, d: 6.938097e-310)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/home/simon/JVMs/openj9/openj9-openjdk-jdk11/build/linux-x86_64-normal-server-release/images/jdk/lib/default/libj9jit29.so
Module_base_address=00007FB83717F000

Method_being_compiled=Test.vMeth(I)V
Target=2_90_20230512_000000 (Linux 5.15.0-71-generic)
CPU=amd64 (8 logical CPUs) (0x7cda7a000 RAM)
----------- Stack Backtrace -----------
_ZN21TR_BasicDFSetAnalysisIP12TR_BitVectorE15getAnalysisInfoEP12TR_Structure+0x4 (0x00007FB837745CE4 [libj9jit29.so+0x5c6ce4])
_ZN24TR_BackwardDFSetAnalysisIP12TR_BitVectorE22analyzeRegionStructureEP18TR_RegionStructureb+0x66a (0x00007FB83773CF5A [libj9jit29.so+0x5bdf5a])
_ZN24TR_BackwardDFSetAnalysisIP12TR_BitVectorE31analyzeNodeIfSuccessorsAnalyzedEP18TR_RegionStructureRS0_S5_+0x45a (0x00007FB83773A8DA [libj9jit29.so+0x5bb8da])
_ZN24TR_BackwardDFSetAnalysisIP12TR_BitVectorE22analyzeRegionStructureEP18TR_RegionStructureb+0x87d (0x00007FB83773D16D [libj9jit29.so+0x5be16d])
_ZN21TR_BasicDFSetAnalysisIP12TR_BitVectorE15performAnalysisEP12TR_Structureb+0x7c (0x00007FB83775209C [libj9jit29.so+0x5d309c])
_ZN11TR_Liveness7performEP12TR_Structure+0xc5 (0x00007FB8378280E5 [libj9jit29.so+0x6a90e5])
_ZN26TR_GlobalRegisterAllocator7performEv+0x12cc (0x00007FB8377B9A8C [libj9jit29.so+0x63aa8c])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0x7b7 (0x00007FB8378E9E27 [libj9jit29.so+0x76ae27])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii+0xd79 (0x00007FB8378EA3E9 [libj9jit29.so+0x76b3e9])
_ZN3OMR9Optimizer8optimizeEv+0x1db (0x00007FB8378EB6FB [libj9jit29.so+0x76c6fb])
_ZN3OMR11Compilation7compileEv+0x925 (0x00007FB8376DF8D5 [libj9jit29.so+0x5608d5])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4dd (0x00007FB8372CA46D [libj9jit29.so+0x14b46d])
_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x314 (0x00007FB8372CB4D4 [libj9jit29.so+0x14c4d4])
omrsig_protect+0x1e3 (0x00007FB83D664F53 [libj9prt29.so+0x29f53])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x309 (0x00007FB8372C8BE9 [libj9jit29.so+0x149be9])
_ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x1c0 (0x00007FB8372C9230 [libj9jit29.so+0x14a230])
_ZN2TR24CompilationInfoPerThread14processEntriesEv+0x3b3 (0x00007FB8372C7D33 [libj9jit29.so+0x148d33])
_ZN2TR24CompilationInfoPerThread3runEv+0x42 (0x00007FB8372C8232 [libj9jit29.so+0x149232])
_Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x82 (0x00007FB8372C82E2 [libj9jit29.so+0x1492e2])
omrsig_protect+0x1e3 (0x00007FB83D664F53 [libj9prt29.so+0x29f53])
_Z21compilationThreadProcPv+0x1cf (0x00007FB8372C871F [libj9jit29.so+0x14971f])
thread_wrapper+0x162 (0x00007FB83DC16322 [libj9thr29.so+0xe322])
start_thread+0xd9 (0x00007FB83E93F609 [libpthread.so.0+0x8609])
clone+0x43 (0x00007FB83EA9B133 [libc.so.6+0x11f133])
---------------------------------------

0xdaryl commented 1 year ago

@jmesyou : can you investigate this intermittent crash in the optimizer please? The user attached a standalone test case and some failure artifacts to the issue.

0xdaryl commented 1 year ago

Moving this out to 0.41 to accommodate resource schedules.

jmesyou commented 1 year ago

Unable to reproduce this failure yet

connglli commented 1 year ago

@jmesyou Perhaps check the log file I've put into the links.

jmesyou commented 1 year ago

@connglli I'm able to reproduce the error on the same commits you reported:

openjdk version "11.0.20-internal" 2023-07-18
OpenJDK Runtime Environment (build 11.0.20-internal+0-adhoc..openj9-openjdk-jdk11)
Eclipse OpenJ9 VM (build master-8aa8676, JRE 11 Linux amd64-64-Bit Compressed References 20230512_000000 (JIT enabled, AOT enabled)
OpenJ9   - 8aa8676
OMR      - 779c51b
JCL      - ee54452 based on jdk-11.0.20+2)

So far, though, I'm still unable to reproduce the exceptions on HEAD. Perhaps some of these failures were fixed; I will investigate further 🤔

connglli commented 1 year ago

Thanks @jmesyou. If you cannot reproduce on HEAD, I suppose this bug was fixed by some commit in between? Anyway, I'll check again when I'm available (sorry, I'm on vacation, so I'm slow to respond). If I cannot reproduce the crashes either, I'll consider closing this issue and deeming it fixed.

connglli commented 1 year ago

If we cannot reproduce it on HEAD, perhaps we can use git bisect to find the exact commit that fixed this issue (even though that is a little bit time-consuming).
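
A bisect for the fixing commit could be driven by a small script passed to `git bisect run`. The sketch below is only an illustration: the launch command, the number of attempts, and treating any non-zero exit status as a crash are assumptions, not details from this issue. Because the search is for the commit that fixed the crash, it assumes the bisect was started with swapped terms (`git bisect start --term-old=broken --term-new=fixed`), so exit status 0 means "still crashes" and 1 means "no crash seen". Rebuilding the JDK at each step is not shown.

```python
import subprocess
import sys


def crash_seen(cmd, attempts):
    """Run cmd up to `attempts` times; return True as soon as one run fails.

    Assumption: any non-zero exit status counts as a crash.
    """
    for _ in range(attempts):
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            return True
    return False


def bisect_status(crashed):
    """Map the outcome to a `git bisect run` exit status.

    With --term-old=broken and --term-new=fixed, status 0 marks the
    commit as old (still broken) and status 1 marks it as new (fixed).
    """
    return 0 if crashed else 1


def main(argv):
    # argv: the attempt count followed by the command to launch,
    # for example: 50 java Test   (both values are placeholders)
    attempts = int(argv[0])
    return bisect_status(crash_seen(argv[1:], attempts))


if __name__ == "__main__" and len(sys.argv) > 2:
    sys.exit(main(sys.argv[1:]))
```

Since the crash is intermittent, a clean result after N runs is only weak evidence that a commit is fixed. A larger N makes each bisect step slower but the verdict more trustworthy.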

jmesyou commented 11 months ago

Hi @connglli, I'm going to close the issue since it's stale. Efforts to reproduce it have not been successful on my end. If you find that the issue persists, please feel free to reopen this issue.

connglli commented 11 months ago

Sure, that's okay; perhaps it's already fixed. I'll reopen it if I can reproduce it again.

hzongaro commented 11 months ago

@jmesyou, I was able to reproduce this failure with source as of a week ago, so I'm going to reopen this one.

Some of these issues can be very difficult to reproduce - one person sees repeated failures, and another cannot reproduce it.

hzongaro commented 11 months ago

I've uploaded a jitdump that was produced during one of the crashes that I saw. I believe the problem occurs after Tree Simplifier eliminates a switch for which it is able to determine the value.

After that optimization, I think there are problems with the structure for the method, although the CFG looks correct to me. In particular, 4 appears as both a block and an acyclic region within region 0, and the acyclic region version of 4 is seen as having no successor.

<structure>
      0 [0x7f4ef7204960] Acyclic region
         Subgraph: (* = exit edge)
               (0x7f4ef7204a30:0x7f4ef7202280)0 --> 2(0x7f4ef7204ac0)
               (0x7f4ef7204ac0:0x7f4ef72021c0)2 --> 36(0x7f4ef7204c70) 29(0x7f4ef7204ba0)
               (0x7f4ef7204ba0:0x7f4ef7201b00)29 --> 1(0x7f4ef7204d60)
               (0x7f4ef7204d60:0x7f4ef7202220)1 -->
               (0x7f4ef7204c70:0x7f4ef7201a40)36 --> 4(0x7f4ef7204e70)
               (0x7f4ef7204e70:0x7f4ef7202f20)4 -->
               (0x7f4ef72051b0:0x7f4ef7201980)4 --> 1(0x7f4ef7204d60)
         0 [0x7f4ef7202280] Block
         2 [0x7f4ef72021c0] Block
         29 [0x7f4ef7201b00] Block
         1 [0x7f4ef7202220] Block
         36 [0x7f4ef7201a40] Block
         4 [0x7f4ef7202f20] Acyclic region
            Subgraph: (* = exit edge)
         4 [0x7f4ef7201980] Block
</structure>

I've seen the crash occur in various places, but in each case I see a structure like this after one of the passes of Tree Simplifier, so I don't think a core file would be of much use.
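
The duplicated node can also be spotted mechanically. The following is a minimal sketch that scans the subgraph lines of the dump above for node numbers that appear more than once, and for nodes that no edge inside the region points at. Reading each line as `(node address:structure address)number --> successors` is inferred from this one dump, not from JIT documentation, and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Subgraph lines copied from the <structure> dump above.
SUBGRAPH = """\
(0x7f4ef7204a30:0x7f4ef7202280)0 --> 2(0x7f4ef7204ac0)
(0x7f4ef7204ac0:0x7f4ef72021c0)2 --> 36(0x7f4ef7204c70) 29(0x7f4ef7204ba0)
(0x7f4ef7204ba0:0x7f4ef7201b00)29 --> 1(0x7f4ef7204d60)
(0x7f4ef7204d60:0x7f4ef7202220)1 -->
(0x7f4ef7204c70:0x7f4ef7201a40)36 --> 4(0x7f4ef7204e70)
(0x7f4ef7204e70:0x7f4ef7202f20)4 -->
(0x7f4ef72051b0:0x7f4ef7201980)4 --> 1(0x7f4ef7204d60)
"""

LINE = re.compile(r"\((0x[0-9a-f]+):(0x[0-9a-f]+)\)(\d+) -->(.*)")
SUCC = re.compile(r"(\d+)\((0x[0-9a-f]+)\)")


def parse(text):
    """Return (number, node_addr, struct_addr, successors) per subgraph line."""
    nodes = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            node_addr, struct_addr, number, rest = m.groups()
            succs = [(int(n), a) for n, a in SUCC.findall(rest)]
            nodes.append((int(number), node_addr, struct_addr, succs))
    return nodes


def duplicate_numbers(nodes):
    """Map each node number that occurs more than once to its node addresses."""
    by_number = defaultdict(list)
    for number, node_addr, _struct, _succs in nodes:
        by_number[number].append(node_addr)
    return {n: addrs for n, addrs in by_number.items() if len(addrs) > 1}


def unreferenced(nodes):
    """Nodes that no edge in the subgraph targets (the entry is expected here)."""
    targets = {addr for _n, _a, _s, succs in nodes for _num, addr in succs}
    return [(n, a) for n, a, _s, _succs in nodes if a not in targets]


nodes = parse(SUBGRAPH)
print(duplicate_numbers(nodes))
print(unreferenced(nodes))
```

On this dump the check reports node 4 twice. It also shows that the edge from 36 targets the acyclic-region copy of 4, which has no successors, while the block copy of 4 has no incoming edge within region 0.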

hzongaro commented 8 months ago

@jmesyou, you might want to try looking at how the structure changes as the various dead cases in the TR::table are eliminated. If I'm understanding correctly, I think we end up with something where the Improper Region that contains the table eventually has its entry node being followed only by an exit from the region - the one case that the table will execute. At some point after that in Tree Simplifier, the entry node for the region is merged with that exit node, and that results in the entry node no longer being part of the region.

I don't know for certain whether the fact that the entry is no longer part of the region is what ends up causing trouble later, or how that situation is ordinarily handled in updating structures, but it might be something to look at.

hzongaro commented 5 months ago

With more recent builds, I found I needed to run with the environment variable TR_EnableExpensiveOptsAtWarm=1 set.

It seems that running lastLoopVersioner sets the stage for the problem to be exposed, but since pull request #18682 was merged, that optimization will not usually be run at warm. Setting the TR_EnableExpensiveOptsAtWarm environment variable forces lastLoopVersioner to be run at the warm optLevel.
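
For anyone scripting the reproduction, the variable has to be present in the environment of the launched JVM. A minimal sketch, assuming the test is started from Python; the helper names are hypothetical and the command passed to `run_test` (for example `["java", "Test"]`) is a placeholder, since the exact invocation is not given in this thread:

```python
import os
import subprocess

ENV_VAR = "TR_EnableExpensiveOptsAtWarm"  # name taken from the comment above


def expensive_opts_env(base_env=None):
    """Return a copy of the environment with the variable set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = "1"
    return env


def run_test(cmd):
    """Launch cmd with the variable set and return its exit status."""
    return subprocess.run(cmd, env=expensive_opts_env()).returncode
```

Copying the environment instead of mutating `os.environ` keeps the setting scoped to the child process, so other commands run from the same script are unaffected.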