JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Tests fail on Julia 1.9.1 with StackOverflowError iff coverage is enabled #50408

Open devmotion opened 1 year ago

devmotion commented 1 year ago

Tests of the master branch of Distances.jl fail with a StackOverflowError on Julia 1.9.1 (on Ubuntu, macOS, and Windows) if coverage is enabled, and pass without coverage analysis. Tests on Julia 1.0 and nightly succeed even with coverage enabled.

These failures showed up on GitHub (see, e.g., https://github.com/JuliaStats/Distances.jl/pull/250 and these logs on Ubuntu), and I was only able to reproduce them locally after enabling coverage analysis. On GitHub, too, the tests pass once coverage is disabled, as shown by this commit and the accompanying logs.

Locally these test failures can be reproduced with Julia 1.9.1 by running

julia --startup-file=no -e 'using Pkg; Pkg.activate(; temp=true); Pkg.add("Distances"); Pkg.test("Distances"; coverage=true)'

The tests can be run successfully with

julia --startup-file=no -e 'using Pkg; Pkg.activate(; temp=true); Pkg.add("Distances"); Pkg.test("Distances"; coverage=false)'
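The two commands above differ only in the coverage flag, so they can be wrapped in a small shell helper for side-by-side comparison. This is a minimal sketch, not part of the original report: the function names `test_expr` and `run_case` are hypothetical, and it assumes `julia` on the PATH is version 1.9.1 and that the machine has network access for `Pkg.add`.

```shell
#!/bin/sh
# Hypothetical helper: print the Julia expression from this report
# for a given coverage setting ("true" or "false").
test_expr() {
    printf 'using Pkg; Pkg.activate(; temp=true); Pkg.add("Distances"); Pkg.test("Distances"; coverage=%s)' "$1"
}

# Hypothetical helper: run one case and report the result based on
# the exit status of the julia process (Pkg.test throws on test failure,
# which makes `julia -e` exit with a nonzero status).
run_case() {
    if julia --startup-file=no -e "$(test_expr "$1")"; then
        echo "coverage=$1: tests passed"
    else
        echo "coverage=$1: tests failed"
    fi
}

# Usage (assumes Julia 1.9.1 is the `julia` on PATH):
#   run_case true
#   run_case false
```

If the behaviour described here holds, `run_case true` should report a failure on Julia 1.9.1 and `run_case false` should report success.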

vtjnash commented 1 year ago

From the error log there, it looks like something specified an incorrect alignment for an assembly instruction that used stack memory. It is rather unusual for that to cause failures on all three systems, though. This seems likely to be a codegen bug.

individual metrics: Error During Test at /home/runner/work/Distances.jl/Distances.jl/test/test_dists.jl:246
  Test threw exception
  Expression: spannorm_dist(x, y) == maximum(xc - vec(yc)) - minimum(xc - vec(yc))
  StackOverflowError:
  Stacktrace:
   [1] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.1/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:478 [inlined]
   [2] macro expansion
     @ ~/work/Distances.jl/Distances.jl/test/test_dists.jl:246 [inlined]
   [3] macro expansion
     @ /opt/hostedtoolcache/julia/1.9.1/x64/share/julia/stdlib/v1.9/Test/src/Test.jl:1498 [inlined]
   [4] top-level scope
     @ ~/work/Distances.jl/Distances.jl/test/test_dists.jl:177

vtjnash commented 1 year ago

This seems to be an LLVM bug. Here is the pre-isel IR, which also crashes llc 15.0.7jl: SpanNormDist-peephole.ll.txt

./usr/tools/llc -o /dev/null SpanNormDist-peephole.ll.txt

Bugpoint then did a good job of cutting this down: bugpoint-reduced-simplified.ll.txt and bugpoint-reduced-simplified.mir.txt

LLVM (http://llvm.org/):
  LLVM version 15.0.7jl
  Optimized build with assertions.
  Default target: x86_64-linux-gnu
  Host CPU: znver3
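
To check whether a given llc build is affected, the invocation above can be wrapped so that the exit status is reported explicitly. This is only a sketch under assumptions: `check_llc` is a hypothetical helper name, and the paths in the usage comment are the ones mentioned in this comment, so adjust them to wherever the llc binary and the attached IR file actually live.

```shell
#!/bin/sh
# Hypothetical helper: run an llc binary on an IR file, discard the
# generated output, and report whether the process exited cleanly.
# $1 = path to llc, $2 = path to the IR file.
check_llc() {
    if "$1" -o /dev/null "$2" 2>/dev/null; then
        echo "no crash"
    else
        # Inside the else branch, $? is still the status of the llc call.
        echo "crashed or failed (exit status $?)"
    fi
}

# Usage (paths as given in this comment):
#   check_llc ./usr/tools/llc SpanNormDist-peephole.ll.txt
#   check_llc ./usr/tools/llc bugpoint-reduced-simplified.ll.txt
```

A nonzero status alone does not distinguish a crash from an ordinary error such as a missing file, so the stderr output (suppressed here) is worth inspecting when a failure is reported.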

andreasnoack commented 6 months ago

Was this one reported upstream?