Open mikmoore opened 1 year ago
I was shortsighted in limiting this discussion to tuples. Unsurprisingly, it also applies to `Array` with nonstandard reduction functions. Taken from this Discourse thread:
```julia
julia> using BenchmarkTools  # provides @btime

julia> x = rand(512);

julia> @btime reduce((x,y) -> x<y ? x : y, $x, init=Inf)
  395.203 ns (0 allocations: 0 bytes)
0.0004658392690449764

julia> @btime reduce((x,y) -> @fastmath(x < y) ? x : y, $x, init=Inf)
  33.046 ns (0 allocations: 0 bytes)
0.0004658392690449764
```
`@fastmath` removes extra checks, so the comparison isn't perfectly fair, but we should allow it anyway.
The speedups in both examples are purely due to re-association. There aren't any extra checks involved, and the core functionality is implemented using native x86-64 instructions (`vaddsd`/`vaddpd` for `+`, `vminsd`/`vminpd` for `x<y ? x : y`). The only differences are SIMD and (in the `Vector` case) unrolling in the re-associated versions.
But yes, all the extra transformations that `@fastmath` could enable in other situations (e.g., used with `min`) are why this should be standalone functionality in `reduce`.
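For concreteness, a standalone re-associating reduce could be sketched roughly as follows (the name `reassociate_reduce` and this recursive pairwise split are purely illustrative, not an existing or proposed Base API):

```julia
# Hypothetical sketch: a reduce that is explicitly allowed to
# re-associate, via a pairwise (divide-and-conquer) split.
# Note: `init` is folded into both halves, which is fine for
# identity elements (0.0 for +, Inf for min) but not in general.
function reassociate_reduce(op, x; init)
    n = length(x)
    n == 0 && return init
    n == 1 && return op(init, x[1])
    mid = n ÷ 2
    left  = reassociate_reduce(op, view(x, 1:mid); init)
    right = reassociate_reduce(op, view(x, mid+1:n); init)
    return op(left, right)
end
```

A real implementation would stop recursing at a block size and run a straight (vectorizable) loop within blocks, but the point is the same: the caller opts in to an unspecified association order.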
`reduce` misses the opportunity to re-associate operations when given `NTuple` arguments. This has performance impacts on notable functions such as `sum(::NTuple)`. One can use `@fastmath` to permit re-association in some cases, but I'm never a fan of suggesting `@fastmath` to people.

You can see the lack of vectorization by comparing `code_native(sum, Tuple{NTuple{16,Float64}}; debuginfo=:none)` to the SIMDing version `code_native(x -> @fastmath(reduce(+, x)), Tuple{NTuple{16,Float64}}; debuginfo=:none)`.

Methods like `sum(::Vector)` already re-associate (with performance and accuracy benefits).
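The accuracy benefit is easy to check (a quick demonstration of mine, not from the thread): with many small values added to one large value, a strict left-to-right fold loses low-order bits at every step, while Base's re-associated `sum` keeps most of them:

```julia
# Strict left fold vs. Base's re-associated (pairwise) sum.
x = [1e16; fill(1.0, 10_000)]
serial   = foldl(+, x)       # strict left-to-right association
pairwise = sum(x)            # Base is free to re-associate
exact    = 1e16 + 10_000     # exactly representable reference
```

Here `serial` absorbs almost none of the ten thousand `1.0`s, while `pairwise` lands much closer to the exact result.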