stev47 / StaticKernels.jl

Julia-native non-allocating kernel operations on arrays

Type Instability Using `sum()` as the Kernel Function #11

Open RoyiAvital opened 1 year ago

RoyiAvital commented 1 year ago

In the following code there is a type instability in the `map()` call:

using StaticKernels;
numRows = 400;
numCols = 400;
mI      = rand(numRows, numCols);

kernelRadius = 5;
kernelLen    = (2 * kernelRadius) + 1;
kernelNumPx  = Float64(kernelLen * kernelLen);

mK = Kernel{(-kernelRadius:kernelRadius, -kernelRadius:kernelRadius)}(@inline mW -> (sum(mW) / kernelNumPx)); # (2r+1)×(2r+1) mean filter

typeof(mI)
typeof(map(mK, mI))
typeof(sum(mI) / kernelNumPx)

The output is given by:

julia> typeof(mI)
Matrix{Float64} (alias for Array{Float64, 2})

julia> typeof(map(mK, mI))
Matrix{Any} (alias for Array{Any, 2})

julia> typeof(sum(mI) / kernelNumPx)
Float64
stev47 commented 1 year ago

Right, type inference through `Base.promote_op` sometimes gives up early.

  1. Quick fix for you: annotate your window function's return type: `mW -> (sum(mW) / kernelNumPx)::Float64` (a sketch follows this list).
  2. What we could do in StaticKernels: actually execute the window function once to get some type. This can be inaccurate if the function may return different types at different locations (e.g. at the boundary) and a bad idea if the window function has other side effects, so I'm not sure about that.
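
A sketch of the quick fix from item 1 applied to the example above (`mKstable` is a hypothetical name; everything else reuses the variables defined earlier):

mKstable = Kernel{(-kernelRadius:kernelRadius, -kernelRadius:kernelRadius)}(@inline mW -> (sum(mW) / kernelNumPx)::Float64);
typeof(map(mKstable, mI)) # expected to infer as Matrix{Float64}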

Other suggestions welcome.

RoyiAvital commented 1 year ago

I think if I use the in-place version it won't have such an issue. So probably, performance-wise, I should just pre-allocate the output array and that's it.
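
A minimal sketch of that approach, assuming StaticKernels' in-place `map!(k, b, a)` and that, with no boundary extension specified, the output only covers the interior:

mO = similar(mI, size(mI) .- 2*kernelRadius); # pre-allocated output, eltype fixed to Float64 (assumes interior-only size)
map!(mK, mO, mI); # in-place application into mO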

RoyiAvital commented 1 year ago

By the way, the annotation you suggested, `mW -> (sum(mW) / kernelNumPx)::Float64`, is not something that helps `Base.promote_op`, right? It just adds a final step to the function, namely a conversion to `Float64`?

So if we have `mO = map(mK, mI);`, it is basically equivalent to `mO = Float64.(map(mK, mI));`, right?

stev47 commented 1 year ago

I think if I use the in-place version it won't have such an issue.

Not in your output container type, but the executed code might still be type-unstable before implicitly converting to the container's output type.

So if we have `mO = map(mK, mI);`, it is basically equivalent to `mO = Float64.(map(mK, mI));`, right?

No, the type annotation on your window function helps the compiler to generate fast code, while converting the output with Float64.(...) happens after the slow code has already run.
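
A sketch of how one could check this (assumes BenchmarkTools is installed; `mKslow` and `mKfast` are hypothetical names):

using BenchmarkTools
mKslow = Kernel{(-kernelRadius:kernelRadius, -kernelRadius:kernelRadius)}(@inline mW -> sum(mW) / kernelNumPx);
mKfast = Kernel{(-kernelRadius:kernelRadius, -kernelRadius:kernelRadius)}(@inline mW -> (sum(mW) / kernelNumPx)::Float64);
@btime Float64.(map($mKslow, $mI)); # conversion happens after the type-unstable code has already run
@btime map($mKfast, $mI); # the annotation lets the compiler generate type-stable code throughout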

RoyiAvital commented 1 year ago

No, the type annotation on your window function helps the compiler to generate fast code, while converting the output with Float64.(...) happens after the slow code has already run.

I was under the impression a return-type annotation only means the last step of the function will be a `convert()` if the type doesn't match. How come it is different in this case?

See the comment by Steven G. Johnson:

[screenshot of the referenced comment]

stev47 commented 1 year ago

It is exactly as you say: return-type declarations are a `convert` before returning the value. There is no validation, but they can still help type inference (provided you don't have a buggy `convert` implementation); try it out yourself on your example.

Also, in this case the type annotation is not on the return type but on the last value computed before returning, so it acts as a type assertion rather than a conversion. See here:

julia> convert(Float64, 1)
1.0

julia> (x -> x::Float64)(1)
ERROR: TypeError: in typeassert, expected Float64, got a value of type Int64
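
In contrast, a return-type declaration converts instead of asserting (`h` is a hypothetical example):

julia> h(x)::Float64 = x; # return-type declaration: inserts a convert before returning

julia> h(1)
1.0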

Nevertheless, a return-type annotation is sufficient as well in your example from above:

function g(mW)::Float64
    sum(mW) / kernelNumPx
end
mK = Kernel{(-kernelRadius:kernelRadius, -kernelRadius:kernelRadius)}(g);
typeof(map(mK, mI)) # Matrix{Float64} (alias for Array{Float64, 2})
RoyiAvital commented 1 year ago

So in the case above, your way of doing `mW -> (sum(mW) / kernelNumPx)::Float64` won't use any `convert()` but makes inference better, hence performance will be better?

That's great!

I just love this package.

stev47 commented 1 year ago

So in the case above, your way of doing `mW -> (sum(mW) / kernelNumPx)::Float64` won't use any `convert()` but makes inference better, hence performance will be better?

Yes, it is called a "type assertion" and is a standard Julia feature.
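
For illustration, a minimal type assertion at the REPL (hypothetical example): it tells the compiler the type of an otherwise `Any`-typed value and throws if the value does not match, without converting anything.

julia> Any[1.0][1]::Float64 # narrows an Any-typed value for the compiler
1.0

julia> Any[1][1]::Float64
ERROR: TypeError: in typeassert, expected Float64, got a value of type Int64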