Closed by piever 5 years ago
Performance is fixed now:
using IndexedTables, SparseArrays, BenchmarkTools, Random
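Random.seed!(666)
N = 10_000  # placeholder: N was not defined in the original snippet, so the timings below may correspond to a different value
idx = Columns(p=rand(1:100, N), q=rand(1:100, N))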
t = NDSparse(idx, rand(N))
t2 = NDSparse(Columns(q=rand(1:100, N)), rand(N))
@btime broadcast(+, $t, $t2)
# master: 296.955 μs (200 allocations: 941.16 KiB)
# PR:     289.171 μs (741 allocations: 898.08 KiB)
Random.seed!(666)
S = sprand(1000, 1000, 0.1)
v = rand(1000)
nd = convert(NDSparse, S)
ndv = convert(NDSparse, v)
@btime broadcast(*, $nd, $ndv)
# master: 6.247 ms (113 allocations: 6.01 MiB)
# PR:     6.421 ms (6119 allocations: 7.01 MiB)
I'm leaving the file with the benchmarks I was using for join and broadcast in the benchmarks folder so we can go back to it if we want to optimize further.
This simplifies broadcast a bit and avoids relying on type inference for the result. As usual with this translation, I need to look at performance a bit before merging.
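To illustrate what "avoids relying on type inference for the result" can mean in general, here is a minimal sketch in plain Julia. It is not the code from this PR: `map_widen` and `fill_widen!` are hypothetical helpers written for this example. The idea is to take the output element type from the values actually computed, and to widen the output vector with `typejoin` when a value of a different type shows up, instead of asking the compiler for the return type of `f` up front.

```julia
# Hypothetical helpers, not part of IndexedTables.
# Apply `f` to paired values and build the output vector from the types of
# the computed values, without querying inference for the return type of `f`.
function map_widen(f, xs::AbstractVector, ys::AbstractVector)
    n = min(length(xs), length(ys))
    # With no values there is no element type to observe.
    n == 0 && return Union{}[]
    first_val = f(first(xs), first(ys))
    out = Vector{typeof(first_val)}(undef, n)
    out[1] = first_val
    return fill_widen!(out, f, collect(xs), collect(ys), 2)
end

function fill_widen!(out::Vector{T}, f, xs, ys, start) where {T}
    for i in start:length(out)
        val = f(xs[i], ys[i])
        if val isa T
            out[i] = val
        else
            # The new value does not fit: move to a vector with a wider
            # element type and continue from the next index.
            S = typejoin(T, typeof(val))
            wider = Vector{S}(undef, length(out))
            copyto!(wider, 1, out, 1, i - 1)
            wider[i] = val
            return fill_widen!(wider, f, xs, ys, i + 1)
        end
    end
    return out
end

stable = map_widen(+, [1, 2], [3, 4])
mixed = map_widen((a, b) -> a > 1 ? a + b : a / b, [1, 2], [2, 3])
```

In the type-stable case the loop never widens, so the result has a concrete element type; in the mixed case the element type falls back to a common supertype. The trade-off is a possible reallocation when widening happens, which is one reason allocation counts can differ from an inference-based approach.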