Slowdown with views in 0.12.119 #428

ffreyer opened 2 years ago

ffreyer commented 2 years ago

I have a reimplementation of LinearAlgebra.qr in my codebase which uses @turbo. It contains:

using LinearAlgebra, LoopVectorization, BenchmarkTools

function reflectorApply!(x::AbstractVector{<: Real}, τ::Real, A::StridedMatrix{<: Real})
    m, n = size(A)
    @inbounds for j = 1:n
        # dot
        vAj = A[1, j]
        @turbo for i = 2:m
            vAj += conj(x[i]) * A[i, j]
        end

        vAj = conj(τ)*vAj

        # ger
        A[1, j] -= vAj
        @turbo for i = 2:m
            A[i, j] -= x[i]*vAj
        end
    end
    return A
end

which is called with two views into the same array:

input = rand(64, 64)
n = 64
j = 17
τ = 1.7
x = LinearAlgebra.view(input, j:n, j)
y = LinearAlgebra.view(input, j:n, j+1:n)
@benchmark reflectorApply!($x, $τ, $y)

After a package update I noticed that things were running more slowly, and I identified this function as the cause. Trying different versions of LoopVectorization showed that v0.12.119 made it much slower:

# median times via @benchmark, after julia restart
# no turbo:  2.9µs
# 0.12.124: 48µs
# 0.12.120: 46µs
# 0.12.119: 46µs
# 0.12.118:  0.72µs
# 0.12.115:  0.74µs
# 0.12.110:  0.72µs
# 0.12.100:  0.71µs

Rewriting this to use no views restores the performance from before 0.12.119.
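
The thread does not show the view-free rewrite. One way it could look, as a rough sketch only, is to pass the parent matrix and the column index j and do the index arithmetic by hand. The name reflector_apply_noview! and its signature are hypothetical, and the @turbo annotations are left out so the sketch runs with Base alone:

```julia
# Hypothetical view-free variant (not the code from the issue).
# Equivalent to reflectorApply!(view(A, j:n, j), τ, view(A, j:n, j+1:n))
# for real-valued A, where conj is a no-op.
function reflector_apply_noview!(A::Matrix{<:Real}, τ::Real, j::Int)
    nrows, ncols = size(A)
    1 <= j <= min(nrows, ncols) || throw(ArgumentError("j out of range"))
    @inbounds for k in (j + 1):ncols
        # dot: the reflector lives in column j, rows j:nrows
        vAk = A[j, k]
        for i in (j + 1):nrows
            vAk += A[i, j] * A[i, k]
        end

        vAk = τ * vAk

        # ger: rank-one update of column k
        A[j, k] -= vAk
        for i in (j + 1):nrows
            A[i, k] -= A[i, j] * vAk
        end
    end
    return A
end
```

Called as reflector_apply_noview!(input, τ, j), this updates columns j+1:n of input in place, like the view-based call.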

chriselrod commented 2 years ago

Hmm, I cannot reproduce...

julia> @benchmark reflectorApply!($x, $τ, $y)
BenchmarkTools.Trial: 10000 samples with 171 evaluations.
 Range (min … max):  630.281 ns … 855.865 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     635.731 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   637.356 ns ±   5.920 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▂▃▃▃▄▄▅▇█████████▇▇▅▄▄▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▃▃▃▃▃▃▃▃▃▃▃▃▂ ▃
  630 ns           Histogram: frequency by time          654 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

(lvdev) pkg> st -m LoopVectorization
Status `~/Documents/progwork/julia/env/lvdev/Manifest.toml`
  [bdcacae8] LoopVectorization v0.12.125 `~/.julia/dev/LoopVectorization`

julia> versioninfo()
Julia Version 1.9.0-DEV.1130
Commit c80316e125* (2022-08-15 13:05 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: 36 × Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.5 (ORCJIT, skylake-avx512)
  Threads: 18 on 36 virtual cores

chriselrod commented 2 years ago

Mind showing me the output of

@code_typed reflectorApply!(x, τ, y)

and

@code_native debuginfo=:none syntax=:intel reflectorApply!(x, τ, y)

chriselrod commented 2 years ago

There's basically only one change between 0.12.118 and 0.12.119: https://github.com/JuliaSIMD/LoopVectorization.jl/compare/v0.12.118...v0.12.119 so if the change were in LoopVectorization.jl itself instead of a dependency, there is only one place I have to look.

However, I cannot reproduce a performance problem with either AVX512 or AVX2.

Additionally, that change should be irrelevant here. So it's more likely that the problem is in a dependency rather than in LV itself, and that switching versions changed the dependencies.

ffreyer commented 2 years ago


Julia Version 1.7.0
Commit 3bf9d17731 (2021-11-30 12:12 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

I put code_typed and native in a gist since they're quite long: https://gist.github.com/ffreyer/d8427ec1cf4215f69efed8a30f5ec04d

I checked this on the login node of a cluster now as well and I don't get a slow down there. So maybe it's specific to julia 1.7 or my cpu?

julia> versioninfo()
Julia Version 1.7.1
Commit ac5cc99908* (2021-12-22 19:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake-avx512)

# with LoopVectorization 0.12.125
julia> @benchmark reflectorApply!($x, $τ, $y)
BenchmarkTools.Trial: 10000 samples with 165 evaluations.
 Range (min … max):  651.345 ns …  2.410 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     692.697 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   703.326 ns ± 64.135 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅█  ▁▂▁ ▁▄▆   ▁    ▃▅▆  ▁▂   ▁                       ▃       ▁
  ██▃████▆████▆▇██▆▇████▆▇██▇▇███▇▇▇▇▇▆▅▆▄▆▆▅▄▄▂▄▃▂▃▂▃▄█▅▂▂▅▆▄ █
  651 ns        Histogram: log(frequency) by time       914 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

chriselrod commented 2 years ago

From your code:

        movabs  rax, offset StrideIndex
        call    rax
        mov     r13, qword ptr [r13]
        mov     rax, qword ptr [rsp + 96]
        add     rax, r13
        mov     qword ptr [rsp + 104], rax
        mov     rdi, r12
        mov     rsi, qword ptr [rsp + 32]
        movabs  rax, offset StrideIndex
        call    rax


│            invoke LayoutPointers.StrideIndex(x::SubArray{Float64, 1, Matrix{Float64}, Tuple{UnitRange{Int64}, Int64}, true})::ArrayInterface.StrideIndex{1, (1,), 1, Tuple{Static.StaticInt{1}}, Tuple{Static.StaticInt{1}}}
│     %136 = Base.getfield(A, :parent)::Matrix{Float64}
│     %137 = $(Expr(:foreigncall, :(:jl_array_ptr), Ptr{Float64}, svec(Any), 0, :(:ccall), :(%136)))::Ptr{Float64}
│     %138 = Base.getfield(A, :parent)::Matrix{Float64}
│     %139 = Base.getfield(A, :indices)::Tuple{UnitRange{Int64}, UnitRange{Int64}}
│     %140 = LayoutPointers.getfield(%139, 1, false)::UnitRange{Int64}
│     %141 = Base.getfield(%140, :start)::Int64
│     %142 = Base.sub_int(%141, 1)::Int64
│     %143 = Core.getfield(%139, 2)::UnitRange{Int64}
│     %144 = Base.getfield(%143, :start)::Int64
│     %145 = Base.sub_int(%144, 1)::Int64
│     %146 = Base.arraysize(%138, 1)::Int64
│            Base.arraysize(%138, 2)::Int64
│     %148 = Base.mul_int(1, %146)::Int64
│     %149 = Base.mul_int(%142, 1)::Int64
│     %150 = Base.mul_int(%145, %148)::Int64
│     %151 = Base.add_int(%149, %150)::Int64
│     %152 = Base.mul_int(8, %151)::Int64
│     %153 = Core.bitcast(Core.UInt, %137)::UInt64
│     %154 = Base.bitcast(UInt64, %152)::UInt64
│     %155 = Base.add_ptr(%153, %154)::UInt64
│     %156 = Core.bitcast(Ptr{Float64}, %155)::Ptr{Float64}
│     %157 = invoke LayoutPointers.StrideIndex(A::SubArray{Float64, 2, Matrix{Float64}, Tuple{UnitRange{Int64}, UnitRange{Int64}}, false})::ArrayInterface.StrideIndex{2, (1, 2), 1, Tuple{Static.StaticInt{1}, Int64}, Tuple{Static.StaticInt{1}, Static.StaticInt{1}}}

That is bad.

chriselrod commented 2 years ago

What do you get for ] st -m ArrayInterface?

ffreyer commented 2 years ago

(@v1.7) pkg> st -m ArrayInterface
      Status `~/.julia/environments/v1.7/Manifest.toml`
  [4fba245c] ArrayInterface v6.0.22

ffreyer commented 2 years ago

And no slowdown in 1.8.0

chriselrod commented 2 years ago

I'm guessing it is because StrideIndex is not inlining

ffreyer commented 2 years ago

I also tried

(@v1.7) pkg> activate .
  Activating new project at `~/Documents/julia source`
(julia source) pkg> add LoopVectorization BenchmarkTools

and I still get a slowdown in 1.7. The package versions match, except that some packages have version numbers in 1.8 but none in 1.7 (ArgTools, Downloads, ...).

ffreyer commented 2 years ago

With https://github.com/JuliaArrays/ArrayInterface.jl/pull/343:


I'm fine with switching to the view-less version/upgrading Julia too

chriselrod commented 2 years ago

Could you share the code typed? That looks really bad, but I don't know what functions are being called.

ffreyer commented 2 years ago

I updated the gist

chriselrod commented 2 years ago

@Tokazama do you want to make sure ArrayInterfaceCore.is_splat_index is elided?

chriselrod commented 2 years ago

I can confirm that I see the regression on 1.7.3:

julia> @benchmark reflectorApply!($x, $τ, $y)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   77.337 μs … 142.382 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     127.234 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   127.364 μs ±   2.679 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▂▂▁▂▁▂▂▂▂▂▂▂▂▁▂▂▂▁▂▂▂▂▂▂▂▂▂▁▂▂▂▂▁▁▂▂▂▂▂▂▁▂▂▂▁▁▂▂▁▂▁▂▂▂▄██▄▄▃ ▂
  77.3 μs          Histogram: frequency by time          131 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> versioninfo()
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-redhat-linux)
  CPU: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz

chriselrod commented 2 years ago

35 ──││││││ %87  = ArrayInterfaceCore.is_splat_index::Core.Const(ArrayInterfaceCore.is_splat_index)
│    ││││││┌ @ /home/chriselrod/.julia/dev/ArrayInterface/lib/ArrayInterfaceCore/src/ArrayInterfaceCore.jl:39 within `map_tuple_type`
│    │││││││ %88  = %new(ArrayInterfaceCore.var"#1#2"{typeof(ArrayInterfaceCore.is_splat_index), DataType}, %87, Tuple{UnitRange{Int64}, Int64})::Core.Const(ArrayInterfaceCore.var"#1#2"{typeof(ArrayInterfaceCore.is_splat_index), DataType}(ArrayInterfaceCore.is_splat_index, Tuple{UnitRange{Int64}, Int64}))
│    │││││││┌ @ ntuple.jl:49 within `ntuple`
│    ││││││││        invoke %88(1::Int64)
│    ││││││││        invoke %88(2::Int64)


   %89 = invoke #1(::Int64)::Core.Const(false)
   %90 = invoke #1(::Int64)::Core.Const(false)
   %96 = invoke #1(::Int64)::Core.Const(false)
   %97 = invoke #1(::Int64)::Core.Const(false)
   %101 = invoke #1(::Int64)::Core.Const(false)
   %102 = invoke #1(::Int64)::Core.Const(false)
   %132 = invoke #1(::Int64)::Core.Const(false)
   %133 = invoke #1(::Int64)::Core.Const(false)
   %139 = invoke #1(::Int64)::Core.Const(false)
   %140 = invoke #1(::Int64)::Core.Const(false)
   %144 = invoke #1(::Int64)::Core.Const(false)
   %145 = invoke #1(::Int64)::Core.Const(false)
   %378 = invoke #1(::Int64)::Core.Const(false)
   %379 = invoke #1(::Int64)::Core.Const(false)
   %385 = invoke #1(::Int64)::Core.Const(false)
   %386 = invoke #1(::Int64)::Core.Const(false)
   %390 = invoke #1(::Int64)::Core.Const(false)
   %391 = invoke #1(::Int64)::Core.Const(false)
   %431 = invoke #1(::Int64)::Core.Const(false)
   %432 = invoke #1(::Int64)::Core.Const(false)
   %438 = invoke #1(::Int64)::Core.Const(false)
   %439 = invoke #1(::Int64)::Core.Const(false)
   %443 = invoke #1(::Int64)::Core.Const(false)
   %444 = invoke #1(::Int64)::Core.Const(false)

Look at all these useless function calls known to return false.
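
Scanning a long @code_typed dump for leftover calls by eye is tedious. The check can be automated, as in the minimal sketch below. The helper remaining_invokes and the toy functions callee, caller and inlined_caller are hypothetical names made up for illustration. It assumes that non-inlined calls appear as Expr(:invoke, ...) statements in the optimized IR returned by code_typed:

```julia
# Hypothetical helper: collect the statements of the optimized typed IR
# that are still `invoke`s, i.e. calls that were not inlined away.
is_invoke(stmt) = stmt isa Expr && (stmt.head === :invoke ||
    (stmt.head === :(=) && is_invoke(stmt.args[2])))

function remaining_invokes(f, argtypes)
    found = Any[]
    for (ci, _) in code_typed(f, argtypes; optimize = true)
        for stmt in ci.code
            is_invoke(stmt) && push!(found, stmt)
        end
    end
    return found
end

# Toy functions for demonstration only.
@noinline callee(x) = x * 2 + 1
caller(x) = callee(x) + 1          # keeps a call to `callee`
inlined_caller(x) = x * 2 + 2      # only intrinsics remain

println(length(remaining_invokes(caller, (Int,))))
println(length(remaining_invokes(inlined_caller, (Int,))))
```

Applied to the function from this issue, the call would be remaining_invokes(reflectorApply!, typeof.((x, τ, y))).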

chriselrod commented 2 years ago

@ffreyer Try the ArrayInterfaceCore version from that PR, with the map_tuple_type that specializes. It fixed the regression for me.

ffreyer commented 2 years ago

With that I get

julia> @benchmark reflectorApply!($x, $τj, $y)
BenchmarkTools.Trial: 10000 samples with 71 evaluations.
 Range (min … max):  847.521 ns …   3.534 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     921.204 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   972.610 ns ± 190.748 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   █ ▇▃▂▁                                                        
  ▃█▇████▆▃▂▁▁▁▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  848 ns           Histogram: frequency by time         1.84 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

which is much better but still a bit slower than it used to be

chriselrod commented 2 years ago

Yeah. The @code_typed still shows a bunch of crap coming from @nospecializes

   35 ── %87  = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││╻╷╷╷╷╷╷╷      bytestrideindex
   │     %88  = (isa)(%87, Type{UnitRange{Int64}})::Bool                                                                                │││┃││││││       StrideIndex
   └────        goto #37 if not %88                                                                                                     ││││┃│││││        stride_rank
   36 ──        goto #40                                                                                                                │││││┃││││         to_parent_dims
   37 ── %91  = (isa)(%87, Type{Int64})::Bool                                                                                           ││││││┃│││          IndicesInfo
   └────        goto #39 if not %91                                                                                                     │││││││┃││           map_tuple_type
   38 ──        goto #40                                                                                                                ││││││││┃│            ntuple
   39 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       │││││││││┃             #36
   └────        unreachable                                                                                                             ││││││││││
   40 ┄─        goto #41                                                                                                                ││││││││││
   41 ── %97  = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 2)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││││
   │     %98  = (isa)(%97, Type{UnitRange{Int64}})::Bool                                                                                ││││││││││
   └────        goto #43 if not %98                                                                                                     ││││││││││
   42 ──        goto #46                                                                                                                ││││││││││
   43 ── %101 = (isa)(%97, Type{Int64})::Bool                                                                                           ││││││││││
   └────        goto #45 if not %101                                                                                                    ││││││││││
   44 ──        goto #46                                                                                                                ││││││││││
   45 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   46 ┄─        goto #47                                                                                                                ││││││││││
   47 ──        goto #48                                                                                                                │││││││││
   48 ──        goto #49                                                                                                                ││││││││
   49 ── %109 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││╻╷            ntuple
   │     %110 = (isa)(%109, Type{UnitRange{Int64}})::Bool                                                                               │││││││││┃             #36
   └────        goto #51 if not %110                                                                                                    ││││││││││
   50 ──        goto #54                                                                                                                ││││││││││
   51 ── %113 = (isa)(%109, Type{Int64})::Bool                                                                                          ││││││││││
   └────        goto #53 if not %113                                                                                                    ││││││││││
   52 ──        goto #54                                                                                                                ││││││││││
   53 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   54 ┄─        goto #55                                                                                                                ││││││││││
   55 ── %119 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 2)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││││
   │     %120 = (isa)(%119, Type{UnitRange{Int64}})::Bool                                                                               ││││││││││
   └────        goto #57 if not %120                                                                                                    ││││││││││
   56 ──        goto #60                                                                                                                ││││││││││
   57 ── %123 = (isa)(%119, Type{Int64})::Bool                                                                                          ││││││││││
   └────        goto #59 if not %123                                                                                                    ││││││││││
   58 ──        goto #60                                                                                                                ││││││││││
   59 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   60 ┄─        goto #61                                                                                                                ││││││││││
   61 ──        goto #62                                                                                                                │││││││││
   62 ──        goto #63                                                                                                                ││││││││
   63 ──        goto #64                                                                                                                │││││││
   64 ──        goto #65                                                                                                                ││││││
   65 ──        goto #66                                                                                                                │││││
   66 ──        nothing::Nothing                                                                                                        │
   67 ──        nothing::Nothing                                                                                                        │
   68 ── %136 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             │││││╻╷╷╷╷         from_parent_dims
   │     %137 = (isa)(%136, Type{UnitRange{Int64}})::Bool                                                                               ││││││┃│││          IndicesInfo
   └────        goto #70 if not %137                                                                                                    │││││││┃││           map_tuple_type
   69 ──        goto #73                                                                                                                ││││││││┃│            ntuple
   70 ── %140 = (isa)(%136, Type{Int64})::Bool                                                                                          │││││││││┃             #36
   └────        goto #72 if not %140                                                                                                    ││││││││││
   71 ──        goto #73                                                                                                                ││││││││││
   72 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   73 ┄─        goto #74                                                                                                                ││││││││││
   74 ── %146 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 2)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││││
   │     %147 = (isa)(%146, Type{UnitRange{Int64}})::Bool                                                                               ││││││││││
   └────        goto #76 if not %147                                                                                                    ││││││││││
   75 ──        goto #79                                                                                                                ││││││││││
   76 ── %150 = (isa)(%146, Type{Int64})::Bool                                                                                          ││││││││││
   └────        goto #78 if not %150                                                                                                    ││││││││││
   77 ──        goto #79                                                                                                                ││││││││││
   78 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   79 ┄─        goto #80                                                                                                                ││││││││││
   80 ──        goto #81                                                                                                                │││││││││
   81 ──        goto #82                                                                                                                ││││││││
   82 ── %158 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││╻╷            ntuple
   │     %159 = (isa)(%158, Type{UnitRange{Int64}})::Bool                                                                               │││││││││┃             #36
   └────        goto #84 if not %159                                                                                                    ││││││││││
   83 ──        goto #87                                                                                                                ││││││││││
   84 ── %162 = (isa)(%158, Type{Int64})::Bool                                                                                          ││││││││││
   └────        goto #86 if not %162                                                                                                    ││││││││││
   85 ──        goto #87                                                                                                                ││││││││││
   86 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   87 ┄─        goto #88                                                                                                                ││││││││││
   88 ── %168 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 2)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││││
   │     %169 = (isa)(%168, Type{UnitRange{Int64}})::Bool                                                                               ││││││││││
   └────        goto #90 if not %169                                                                                                    ││││││││││
   89 ──        goto #93                                                                                                                ││││││││││
   90 ── %172 = (isa)(%168, Type{Int64})::Bool                                                                                          ││││││││││
   └────        goto #92 if not %172                                                                                                    ││││││││││
   91 ──        goto #93                                                                                                                ││││││││││
   92 ──        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       ││││││││││
   └────        unreachable                                                                                                             ││││││││││
   93 ┄─        goto #94                                                                                                                ││││││││││
   94 ──        goto #95                                                                                                                │││││││││
   95 ──        goto #96                                                                                                                ││││││││
   96 ──        goto #97                                                                                                                │││││││
   97 ──        goto #98                                                                                                                ││││││
   98 ──        goto #99                                                                                                                │││││
   99 ── %183 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             │││││╻╷╷╷╷╷        strides
   │     %184 = (isa)(%183, Type{UnitRange{Int64}})::Bool                                                                               ││││││┃││││         to_parent_dims
   └────        goto #101 if not %184                                                                                                   │││││││┃│││          IndicesInfo
   100 ─        goto #104                                                                                                               ││││││││┃││           map_tuple_type
   101 ─ %187 = (isa)(%183, Type{Int64})::Bool                                                                                          │││││││││┃│            ntuple
   └────        goto #103 if not %187                                                                                                   ││││││││││┃             #36
   102 ─        goto #104                                                                                                               │││││││││││
   103 ─        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       │││││││││││
   └────        unreachable                                                                                                             │││││││││││
   104 ┄        goto #105                                                                                                               │││││││││││
   105 ─ %193 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 2)::Union{Type{UnitRange{Int64}}, Type{Int64}}             │││││││││││
   │     %194 = (isa)(%193, Type{UnitRange{Int64}})::Bool                                                                               │││││││││││
   └────        goto #107 if not %194                                                                                                   │││││││││││
   106 ─        goto #110                                                                                                               │││││││││││
   107 ─ %197 = (isa)(%193, Type{Int64})::Bool                                                                                          │││││││││││
   └────        goto #109 if not %197                                                                                                   │││││││││││
   108 ─        goto #110                                                                                                               │││││││││││
   109 ─        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       │││││││││││
   └────        unreachable                                                                                                             │││││││││││
   110 ┄        goto #111                                                                                                               │││││││││││
   111 ─        goto #112                                                                                                               ││││││││││
   112 ─        goto #113                                                                                                               │││││││││
   113 ─ %205 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             │││││││││╻╷            ntuple
   │     %206 = (isa)(%205, Type{UnitRange{Int64}})::Bool                                                                               ││││││││││┃             #36
   └────        goto #115 if not %206                                                                                                   │││││││││││
   114 ─        goto #118                                                                                                               │││││││││││
   115 ─ %209 = (isa)(%205, Type{Int64})::Bool                                                                                          │││││││││││
   └────        goto #117 if not %209                                                                                                   │││││││││││
   116 ─        goto #118                                                                                                               │││││││││││
   117 ─        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       │││││││││││
   └────        unreachable                                                                                                             │││││││││││
   118 ┄        goto #119                                                                                                               │││││││││││
   119 ─ %215 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 2)::Union{Type{UnitRange{Int64}}, Type{Int64}}             │││││││││││
   │     %216 = (isa)(%215, Type{UnitRange{Int64}})::Bool                                                                               │││││││││││
   └────        goto #121 if not %216                                                                                                   │││││││││││
   120 ─        goto #124                                                                                                               │││││││││││
   121 ─ %219 = (isa)(%215, Type{Int64})::Bool                                                                                          │││││││││││
   └────        goto #123 if not %219                                                                                                   │││││││││││
   122 ─        goto #124                                                                                                               │││││││││││
   123 ─        Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}                                       │││││││││││
   └────        unreachable                                                                                                             │││││││││││
   124 ┄        goto #125                                                                                                               │││││││││││
   125 ─        goto #126                                                                                                               ││││││││││

and I suspect this makes it into the generated code, but I'm not sure. There aren't any calls.

Tokazama commented 2 years ago

We might need to revert the changes to flatten_tuples too

Tokazama commented 2 years ago

This is probably one of those instances where I should see what comes out of this when we use @assume_effects so we can still cut down on needless code generation.

chriselrod commented 2 years ago

FWIW, master seems fine. I haven't tried 1.8. But Julia 1.7 is not.

So we could have different versions for >= 1.8 vs not.
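
A version split like this is usually written with @static if VERSION. The sketch below shows only the gating pattern. The function first_field_type is a hypothetical stand-in, not an ArrayInterfaceCore function, and which variant suits which Julia version is the assumption discussed in this thread rather than something the sketch verifies:

```julia
# Hypothetical example of selecting a method definition by Julia version.
@static if VERSION >= v"1.8"
    # Assumed fine on newer compilers: leave the type argument
    # unspecialized to cut down on generated code.
    first_field_type(@nospecialize(T::Type)) = fieldtype(T, 1)
else
    # On older compilers, force specialization on the type argument.
    first_field_type(::Type{T}) where {T} = fieldtype(T, 1)
end

println(first_field_type(Tuple{UnitRange{Int},Int}))
```

Because @static evaluates the condition at macro-expansion time, only the selected branch is ever lowered and compiled.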

Tokazama commented 2 years ago

> FWIW, master seems fine. I haven't tried 1.8. But Julia 1.7 is not.
>
> So we could have different versions for >= 1.8 vs not.

I haven't tested this particular issue, but I know from experience that pre-1.8 inference requires a lot more explicitly defined types.

chriselrod commented 2 years ago

This isn't even an inference failure. The compiler just isn't doing its job

   82 ── %158 = ArrayInterfaceCore.fieldtype(Tuple{UnitRange{Int64}, Int64}, 1)::Union{Type{UnitRange{Int64}}, Type{Int64}}             ││││││││╻╷            ntuple
   │     %159 = (isa)(%158, Type{UnitRange{Int64}})::Bool 

ffreyer commented 2 years ago

(1) With the new changes from the pr:

julia> @benchmark reflectorApply!($x, $τj, $y)
BenchmarkTools.Trial: 10000 samples with 156 evaluations.
 Range (min … max):  662.442 ns …  1.881 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     739.295 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   754.946 ns ± 87.526 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    █      ▁                                                    
  ▄▁█▅▄█▄██████▆▅▅▄▃▃▃▃▃▃▂▂▂▂▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  662 ns          Histogram: frequency by time         1.14 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

vs (2) LoopVectorization@v0.12.18:

@benchmark reflectorApply!($x, $τ, $y)
BenchmarkTools.Trial: 10000 samples with 151 evaluations.
 Range (min … max):  677.364 ns …   1.560 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     709.536 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   742.538 ns ± 103.227 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   █ ▆▇▂▂ ▁  ▁▂▂▁▁▁▁                                            ▁
  ▅█▆███████▆█████████▇▇███▇███▇▇▇▇▇▇▆█▆▇▇▇█▇▇▆▅▅▆▇▆▇▇▇▇▆▄▄▄▅▅▆ █
  677 ns        Histogram: log(frequency) by time       1.17 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

Across multiple benchmarks (1) usually has lower min times, but higher median, mean and max times than (2).

chriselrod commented 2 years ago

If you want to fix things, just run @code_typed and look for any Julia function calls (like ArrayInterfaceCore.fieldtype above). If there are any, find out how to get rid of them, probably by encouraging the compiler to specialize: delete @nospecialize and replace T::Type with ::Type{T} where {T} in function signatures.
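
To make the signature change concrete, here is a minimal sketch of the two styles side by side. The names is_integer_index, map_fields_loose and map_fields_tight are hypothetical and only loosely modelled on the map_tuple_type pattern seen in the traces above. They are not the ArrayInterfaceCore definitions:

```julia
# Hypothetical predicate standing in for something like `is_splat_index`.
is_integer_index(::Type{T}) where {T} = T <: Integer

# Loose signature: the compiler is told not to specialize on `T`,
# so field types may have to be looked up at run time.
map_fields_loose(f, @nospecialize(T::Type)) =
    ntuple(i -> f(fieldtype(T, i)), fieldcount(T))

# Tight signature: `f` and `T` are both type parameters, so a separate
# method instance is compiled per tuple type.
map_fields_tight(f::F, ::Type{T}) where {F,T} =
    ntuple(i -> f(fieldtype(T, i)), Val(fieldcount(T)))

IdxT = Tuple{UnitRange{Int},Int}
println(map_fields_loose(is_integer_index, IdxT))
println(map_fields_tight(is_integer_index, IdxT))
```

Both variants return the same tuple. The difference is only in what the compiler can resolve ahead of time, which can be compared by running @code_typed (from InteractiveUtils) on each call.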