JuliaSIMD / LoopVectorization.jl

Macro(s) for vectorizing loops.
MIT License
741 stars 68 forks source link

Reduction not found #528

Open chengxo opened 10 months ago

chengxo commented 10 months ago

Hi, here is a MWE of an unexpected error:

using LoopVectorization

function test(a, b)
    #s = 0.0
    @turbo for i in axes(a,1)
        ai = 0.0
        for j in axes(b,2)
            bij = b[i, j]
            ai += bij^2
        end
        ai = exp(ai)
        a[i] = ai
    end
    return nothing
end

a = rand(256)
b = rand(256, 256)
test(a, b)

LoadError: LoadError: "Reduction not found." Stacktrace: [1] reduction_zero @ $HOME.julia\packages\LoopVectorization\7gWfp\src\modeling\costs.jl:649 [inlined] [2] add_reduction_update_parent!(vparents::Vector{LoopVectorization.Operation}, deps::Vector{Symbol}, reduceddeps::Vector{Symbol}, ls::LoopVectorization.LoopSet, parent::LoopVectorization.Operation, instr::LoopVectorization.Instruction, reduction_ind::Int64, elementbytes::Int64) @ LoopVectorization $HOME.julia\packages\LoopVectorization\7gWfp\src\parse\add_compute.jl:255

If I use a[i] = exp(ai) instead of the last two lines, it works. It also works if the last two lines are ai = exp(ai/2); a[i] = ai. I feel the issue is related to your explanation in #498, but honestly I don't fully understand that. My general feeling there is that the error it gives is unnecessary and can be ignored if it can run... Is that correct?

chengxo commented 10 months ago

Here is a variant born out of this MWE with a different goal to produce another unexpected error.

function test(a, b)
    ai = 0.0
    @tturbo for i in axes(b,2)
        ai = 0.0
        for j in axes(b,1)
            bij = b[j, i]
            ai += bij^2 
        end
        aii = a[i]
        aii *= exp(-ai)
        a[i] *= aii
    end
    return nothing
end

a = rand(256)
b = rand(256, 256)
test(a, b)

gives LoadError: AssertionError: More than one parent is a reduction over the vectorized variable. Stacktrace: [1] storeinstr_preprend(op::LoopVectorization.Operation, vloopsym::Symbol) @ LoopVectorization $HOME.julia\packages\LoopVectorization\7gWfp\src\codegen\lower_store.jl:19

For this example, both dimensions of b equal. So at face the following should work (just switching i and j)

  function test(a, b)
    ai = 0.0
    @tturbo for j in axes(b,2)
        ai = 0.0
        for i in axes(b,1)
            bij = b[j, i]
            #cij = c[i, j]
            ai += bij^2 
            #ai -= cij
        end
        aii = a[i]
        aii *= exp(-ai)
        a[i] *= aii
    end
    return nothing
end

a = rand(256)
b = rand(256, 256)
test(a, b)

But it gives the error LoadError: "Reduction not found." Stacktrace: [1] reduce_to_onevecunroll @ $HOME .julia\packages\LoopVectorization\7gWfp\src\modeling\costs.jl:581 [inlined] [2] reduce_to_onevecunroll @$HOME .julia\packages\LoopVectorization\7gWfp\src\modeling\operations.jl:570 [inlined] [3] reduce_to_onevecunroll @ $HOME .julia\packages\LoopVectorization\7gWfp\src\modeling\operations.jl:572 [inlined] [4] parent_op_name!(q::Expr, ls::LoopVectorization.LoopSet, parents_op::Vector{LoopVectorization.Operation}, n::Int64, modsuffix::Int64, suffix::Int64, parents_u₁syms::Vector{Bool}, parents_u₂syms::Vector{Bool}, u₁::Int64, u₂max::Int64, u₂unrolledsym::Bool, op::LoopVectorization.Operation, tiledouterreduction::Int64) @ LoopVectorization $HOME .julia\packages\LoopVectorization\7gWfp\src\codegen\lower_compute.jl:416

chengxo commented 10 months ago

A more general question is, I feel that sometimes writing code in a mathematically equivalent way will makes a working code no longer works if I use @turbo. This makes me feel worried whether my working code actually has some underlying bug / is not doing what it is supposed to do. In general, do you think that the errors LV give are because of subtleties of making everything SIMD (like LV now cannot transform codes if they are written in an unnecessarily complicated way), or because of essential problem in the original code?