MasonProtter / Bumper.jl

Bring Your Own Stack
MIT License

Massive slowdown when running with `--check-bounds=no` #39

Closed (sgaure closed this 2 weeks ago)

sgaure commented 2 months ago

When running Julia with `--check-bounds=no`, Bumper becomes much slower and starts allocating. This should be noted in the docs.

The MWE is the example from the docs:

using Bumper
using BenchmarkTools
using StrideArrays

function f(x)
    # Set up a scope where memory may be allocated, and does not escape:
    @no_escape begin
        # Allocate a `PtrArray` (see StrideArraysCore.jl) using memory from the default buffer.
        y = @alloc(eltype(x), length(x))
        # Now do some stuff with that vector:
        y .= x .+ 1
        sum(y) # It's okay for the sum of y to escape the block, but references to y itself must not do so!
    end
end

@benchmark f(x) setup=(x = rand(1:10, 30))

Starting Julia with `--check-bounds=auto`, I get this output:

BenchmarkTools.Trial: 10000 samples with 997 evaluations.
 Range (min … max):  19.837 ns … 41.080 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     19.998 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   20.250 ns ±  1.138 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▇▆▅▄▃▂▁  ▁▁▁   ▁                                          ▂
  ████████████████████▇▇▆█▆▆▅▅▅▆▆▆█▇▆▇▆▆▅▄▅▅▃▅▅▄▅▄▂▂▃▃▄▃▄▅▃▃▄ █
  19.8 ns      Histogram: log(frequency) by time      24.3 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

With `--check-bounds=no` it is several thousand times slower (about 150 μs per call instead of about 20 ns), and it allocates:

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  147.137 μs …  4.958 ms  ┊ GC (min … max): 0.00% … 95.54%
 Time  (median):     152.287 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   156.173 μs ± 87.330 μs  ┊ GC (mean ± σ):  1.91% ±  3.47%

         ▁▁▂▅▆█▆▄▄▂▂▂▂▁▂▁                                       
  ▂▁▃▄▆▇████████████████████▇▆▆▆▆▅▅▅▄▄▄▄▄▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂ ▄
  147 μs          Histogram: frequency by time          166 μs <

 Memory estimate: 49.56 KiB, allocs estimate: 1050.
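
For anyone who wants to compare the two settings without restarting Julia by hand, here is a minimal sketch that runs a snippet in a fresh Julia process under a chosen `--check-bounds` setting. The helper name `run_with_check_bounds` is made up for illustration and is not part of Bumper. `Base.JLOptions()` is an internal API, so the printed values may differ between Julia versions.

```julia
# Hypothetical helper (not part of Bumper): run `code` in a fresh Julia
# process started with the given --check-bounds setting and return its stdout.
function run_with_check_bounds(code::AbstractString, flag::AbstractString)
    flag in ("yes", "no", "auto") || throw(ArgumentError("unknown flag: $flag"))
    # Use the same Julia binary that is running this script.
    julia = joinpath(Sys.BINDIR, Base.julia_exename())
    cmd = `$julia --startup-file=no --check-bounds=$flag -e $code`
    return read(cmd, String)
end

# Print the raw option value each child process sees.
for flag in ("auto", "no")
    out = run_with_check_bounds("print(Base.JLOptions().check_bounds)", flag)
    println("--check-bounds=", flag, " => check_bounds option = ", out)
end
```

The MWE above could be passed as `code` in the same way, assuming Bumper, BenchmarkTools and StrideArrays are installed in the environment the child process loads (a `--project` flag may need to be added to the command for that).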

Julia Version 1.12.0-DEV.606
Commit 6f569c7ba0* (2024-05-27 08:27 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen Threadripper PRO 5945WX 12-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-17.0.6 (ORCJIT, znver3)
Threads: 24 default, 0 interactive, 24 GC (on 24 virtual cores)
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_EDITOR = emacs -nw

sgaure commented 2 months ago

Ah, it seems this is really an issue with Static.jl, which StrideArrays uses. It's an ongoing problem tracked upstream, with a fix now slated for Julia 1.12.

https://github.com/JuliaLang/julia/issues/50985
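
On affected setups, a script could at least warn when it is started with bounds checking forced off. This is a rough sketch, not something Bumper provides: `bounds_check_mode` is a hypothetical helper, `Base.JLOptions()` is internal, and the 0/1/2 encoding below is an assumption based on current Julia sources.

```julia
# Hypothetical helper: report how bounds checking was configured at startup.
# Assumed encoding of the internal option: 0 = auto (default), 1 = yes, 2 = no.
function bounds_check_mode()
    opt = Base.JLOptions().check_bounds
    return opt == 0 ? :auto : opt == 1 ? :yes : opt == 2 ? :no : :unknown
end

if bounds_check_mode() === :no
    @warn "Julia was started with --check-bounds=no; code using Bumper with StrideArrays may be much slower and may allocate."
end
```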

MasonProtter commented 2 weeks ago

Fixed in #42