Open LilithHafner opened 2 years ago
https://github.com/JuliaLang/julia/issues/31442 was mainly about the float case, though. (LLVM vectorizes `extrema` for integers well.)
Some of the performance difference comes from the default `pairwise_blocksize(f, op) = 1024`, which could be `typemax(Int)` for `minimum`/`maximum`/`extrema`, since the result is the same regardless of block size.
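To illustrate why the block size should not affect the result for `min`/`max`, here is a minimal sketch of a blocked pairwise reduction. `pairwise_reduce` is a hypothetical helper written for this comment, not Base's `mapreduce_impl`; it only mimics the idea of splitting ranges longer than the block size in half.

```julia
# Hypothetical helper (not Base's implementation): reduce v[lo:hi] with `op`,
# recursively halving any range that spans `blksize` or more index steps.
function pairwise_reduce(op, v::AbstractVector, lo::Int, hi::Int, blksize::Int)
    if hi - lo < blksize
        # Short block: plain sequential loop.
        acc = v[lo]
        for i in lo+1:hi
            acc = op(acc, v[i])
        end
        return acc
    else
        # Long range: split in half and combine the two partial results.
        mid = (lo + hi) >>> 1
        left = pairwise_reduce(op, v, lo, mid, blksize)
        right = pairwise_reduce(op, v, mid + 1, hi, blksize)
        return op(left, right)
    end
end

x = rand(Int, 4096)
blocked = pairwise_reduce(min, x, firstindex(x), lastindex(x), 1024)
flat = pairwise_reduce(min, x, firstindex(x), lastindex(x), typemax(Int))
@assert blocked == flat == minimum(x)
```

With `blksize = typemax(Int)` the recursion never triggers and the whole array goes through the sequential loop. For an associative, exact operation such as integer `min` or `max`, both paths give the same answer.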
The rest of the difference is confusing. If I follow the `mapreduce_impl` style:
function __extrema(v)
    mn = mx = first(v)
    @inbounds if length(v) > 1
        mn = min(mn, v[2])
        mx = max(mx, v[2])
        for i in firstindex(v)+2:lastindex(v)
            vi = v[i]
            vi < mn && (mn = vi)
            mx < vi && (mx = vi)
        end
    end
    mn, mx
end
Then we have
julia> x = rand(Int, 4096);
julia> @btime _extrema($x);
607.910 ns (0 allocations: 0 bytes)
julia> @btime __extrema($x);
682.237 ns (0 allocations: 0 bytes)
julia> @btime extrema($x);
724.627 ns (0 allocations: 0 bytes)
julia> Base.pairwise_blocksize(::Base.ExtremaMap, ::typeof(Base._extrema_rf)) = typemax(Int)
julia> @btime extrema($x);
688.000 ns (0 allocations: 0 bytes) # quite close to `__extrema` !!!
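Before comparing timings, it may be worth checking that the hand-written loop agrees with `extrema`. The sketch below copies the loop under the hypothetical name `loop_extrema` so it stays self-contained. It assumes a 1-based vector, since the loop reads `v[2]` directly.

```julia
# Copy of the loop from this comment under a hypothetical name.
# Assumes 1-based indexing because of the hard-coded `v[2]`.
function loop_extrema(v)
    mn = mx = first(v)
    @inbounds if length(v) > 1
        mn = min(mn, v[2])
        mx = max(mx, v[2])
        for i in firstindex(v)+2:lastindex(v)
            vi = v[i]
            vi < mn && (mn = vi)
            mx < vi && (mx = vi)
        end
    end
    mn, mx
end

x = rand(Int, 4096)
@assert loop_extrema(x) == extrema(x)   # same (min, max) pair as Base
@assert loop_extrema([7]) == (7, 7)     # single-element input
```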
Related: #34790
This implementation seems to be slightly faster than `extrema`:

_Originally posted by @LilithHafner in https://github.com/JuliaLang/julia/pull/44230#discussion_r825507738_