JuliaData / IndexedTables.jl

Flexible tables with ordered indices
https://juliadb.org
MIT License
120 stars 38 forks source link

Replace `Base.permute!!` with `permute!` #299

Closed LilithHafner closed 9 months ago

LilithHafner commented 1 year ago

In Julia 1.9.0, permute! is typically faster that the internal method Base.permute!!, thanks to https://github.com/JuliaLang/julia/pull/44941.

For indexed tables in particular, using the internal method Base.permute!! seems to give significant performance improvements for small arrays in exchange for moderate regressions for huge arrays when compared to permute! in microbenchmarks, though when doing slightly larger benchmarks using these functions as they are used in IndexedTables.jl, the improvements for Base.permute!! disappear.

using Random, BenchmarkTools, IndexedTables

small = table(rand(10), rand(10));
x = rand(10_000);
y = [randstring() for _ in 1:10_000];
big = table(x, y, rand(ComplexF64, 10_000), x, x, y);
x = rand(1_000_000);
y = [randstring() for _ in 1:1_000_000];
z = rand(ComplexF64, 1_000_000);
huge = table(x, y, x, z, z, x, z, x, y, y, z, x, y);

for it in (small, big, huge)
    perm0 = randperm(length(it))
    perm = similar(perm0)
    @btime Base.permute!!(IndexedTables.refs(rows($it)), copyto!($perm, $perm0))
    @btime permute!(IndexedTables.refs(rows($it)), copyto!($perm, $perm0))
end
# 19.642 ns (0 allocations: 0 bytes)
# 60.783 ns (2 allocations: 288 bytes)
# 88.167 μs (0 allocations: 0 bytes)
# 92.000 μs (12 allocations: 547.16 KiB)
# 146.891 ms (0 allocations: 0 bytes)
# 106.844 ms (26 allocations: 129.70 MiB)

for it in (small, big, huge)
    @btime Base.permute!!(IndexedTables.refs(rows($it)), sortperm(rows($it))) setup=(shuffle!(rows($it))) evals=1;
    @btime permute!(IndexedTables.refs(rows($it)), sortperm(rows($it))) setup=(shuffle!(rows($it))) evals=1;
end
# 166.000 ns (2 allocations: 208 bytes)
# 167.000 ns (4 allocations: 496 bytes)
# 365.708 μs (5 allocations: 156.41 KiB)
# 368.417 μs (17 allocations: 703.56 KiB)
# 188.198 ms (5 allocations: 15.26 MiB)
# 150.980 ms (31 allocations: 144.96 MiB)

On balance, I would say permute! and Base.permute!! are comparable. In other words, I don't think using Base.permute!! provides any benefit, much less enough to warrant using an internal method of Base when a public method is available.

This PR changes the use of an internal method with a complex implementation to a public method with a simple implementation, which is a win for maintainability and compile time.

LilithHafner commented 1 year ago

Bump