llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[SCEV / LoopUnroller] crash #83296

Open JonPsson1 opened 7 months ago

JonPsson1 commented 7 months ago

clang -O3 -march=z15 crash_loopunroll.i -o a.out -w -mllvm -unroll-allow-remainder=false -mllvm -unroll-count=4

253 0x000002aa209d718a llvm::ScalarEvolution::getBackedgeTakenCount

crash_loopunroll.i.tar.gz

@serguei-katkov @modiking @nikic

JOE1994 commented 7 months ago

I'm unable to reproduce the crash on Compiler Explorer.

Which version of clang are you using?

NOTE: On Compiler Explorer, I omitted the -march=z15 option, because including it results in the following compiler error:

error: unknown target CPU 'z15'
note: valid target CPU values are: nocona, core2, penryn, bonnell, atom, silvermont, slm, goldmont, goldmont-plus, tremont, nehalem, corei7, westmere, sandybridge, corei7-avx, ivybridge, core-avx-i, haswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cooperlake, cannonlake, icelake-client, rocketlake, icelake-server, tigerlake, sapphirerapids, alderlake, raptorlake, meteorlake, sierraforest, grandridge, graniterapids, graniterapids-d, emeraldrapids, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, znver3, znver4, x86-64, x86-64-v2, x86-64-v3, x86-64-v4

JonPsson1 commented 7 months ago

Sorry, it looks like you also need to specify the target, for example: -target s390x-linux-gnu -march=z15
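
For convenience, here is a small sketch that assembles the full reproduction command from the flags in the original report plus the target triple mentioned above. The helper name `repro_command` is hypothetical, and the flag ordering is an assumption; only the flags themselves come from this thread.

```python
import shlex


def repro_command(source="crash_loopunroll.i"):
    """Build the clang invocation from the report, with an explicit target.

    All flags are copied from this thread; nothing is executed here.
    """
    return [
        "clang",
        "-target", "s390x-linux-gnu",  # target triple suggested in the reply
        "-march=z15",
        "-O3",
        source,
        "-o", "a.out",
        "-w",
        "-mllvm", "-unroll-allow-remainder=false",
        "-mllvm", "-unroll-count=4",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(repro_command()))
```

Running the printed command still requires a clang build with the SystemZ backend enabled and the attached crash_loopunroll.i.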

fhahn commented 7 months ago

I think it times out on latest trunk on Compiler Explorer.

@JonPsson1 could you share the full stack trace? Is the issue a stack overflow (which would depend on the system's stack size limit)? Building the file generates a huge amount of output (it first unrolls the inner loop 4x, then the parent loop 4x, and then the parent loop again 4x), but it doesn't crash on ARM64 macOS.
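
To make the two points above concrete, here is a rough sketch (not taken from LLVM). It assumes each unroll pass duplicates everything nested inside the loop it unrolls, so the copies of the innermost body multiply; that compounding is an assumption about this test case, not something confirmed in the thread. It also reads the process stack limit through Python's Unix-only `resource` module. The helper names are hypothetical.

```python
import resource


def unrolled_copies(unroll_counts):
    """Copies of the innermost body, assuming each pass multiplies them."""
    copies = 1
    for count in unroll_counts:
        copies *= count
    return copies


def stack_limit():
    """Return the (soft, hard) stack size limits of this process in bytes."""
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    # Inner loop 4x, parent loop 4x, parent loop 4x again.
    print("innermost body copies:", unrolled_copies([4, 4, 4]))  # 64

    soft, _hard = stack_limit()
    if soft == resource.RLIM_INFINITY:
        print("soft stack limit: unlimited")
    else:
        print("soft stack limit:", soft, "bytes")
```

If the soft limit differs between the crashing machine and one where the build succeeds, that would be consistent with a stack overflow, though the full stack trace is still needed to confirm it.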