mcabbott / AxisKeys.jl


Draft: Faster `wrapdims` over tables #119

Closed: rofinn closed this 2 years ago

rofinn commented 2 years ago

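Rewrites `populate!` to operate column-wise over tables. This also requires calling `findall` manually with the axis keys and the columns. With these changes, the `wrapdims` call benchmarked below is ~5x faster (median time drops from 29.8 ms to 6.2 ms) and allocates roughly a quarter of the memory (19.19 MiB down to 5.15 MiB), with the allocation count falling from 561688 to 4697.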
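For readers unfamiliar with the idea, here is a minimal sketch of what a column-wise fill could look like. It is not the code in this PR: `populate_columnwise!` and its arguments are hypothetical names, the table is assumed to be a plain `NamedTuple` of equal-length vectors, and Base's `indexin` stands in for the `findall`-based lookup mentioned above. It also assumes every key in the table is present in the axis keys (`indexin` returns `nothing` otherwise, which this sketch does not handle).

```julia
# Hypothetical sketch: `populate_columnwise!` and its arguments are illustrative,
# not the AxisKeys.jl internals.
function populate_columnwise!(A, axkeys, table, value, dims)
    # One pass per key column: find where each entry sits along its axis.
    inds = map((ks, d) -> indexin(getproperty(table, d), ks), axkeys, dims)
    vals = getproperty(table, value)
    # Single loop over rows, reading only the precomputed integer indices.
    for r in eachindex(vals)
        A[map(i -> i[r], inds)...] = vals[r]
    end
    return A
end

# Assumed toy data: two key columns and one value column.
table = (time = [1, 1, 2, 2], loc = [:a, :b, :a, :b], value = [1.0, 2.0, 3.0, 4.0])
axkeys = (unique(table.time), unique(table.loc))
A = zeros(length.(axkeys)...)
populate_columnwise!(A, axkeys, table, :value, (:time, :loc))
```

The point of the sketch is that key lookups happen once per column rather than once per row, so the row loop itself only touches integer indices.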

Old:

julia> @benchmark wrapdims(df, :value, :time, :loc, :id)
BenchmarkTools.Trial: 163 samples with 1 evaluation.
 Range (min … max):  29.176 ms … 32.762 ms  ┊ GC (min … max): 0.00% … 8.68%
 Time  (median):     29.836 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   30.757 ms ±  1.356 ms  ┊ GC (mean ± σ):  3.87% ± 4.21%

    ▁▇▄▄█▃▆                                      ▇▁▃ ▄   ▁
  ▄▇██████████▁▆▁▁▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▄▃██████▇▆█▇▄▄ ▃
  29.2 ms         Histogram: frequency by time        32.7 ms <

 Memory estimate: 19.19 MiB, allocs estimate: 561688.

New:

julia> @benchmark wrapdims(df, :value, :time, :loc, :id)
BenchmarkTools.Trial: 790 samples with 1 evaluation.
 Range (min … max):  6.067 ms …   7.721 ms  ┊ GC (min … max): 0.00% … 17.65%
 Time  (median):     6.171 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.326 ms ± 403.137 μs  ┊ GC (mean ± σ):  2.25% ±  5.30%

  ▁▃▆█▇▅▃▁   ▁                                       ▁▂
  █████████▇▇█▆▁▇▄▁▄▁▁▄▁▁▄▄▁▁▁▅▁▁▁▄▄▁▁▄▄▁▁▁▁▁▁▁▁▁▁▄▆█████▇▇▆▆ ▇
  6.07 ms      Histogram: log(frequency) by time      7.53 ms <

 Memory estimate: 5.15 MiB, allocs estimate: 4697.

rofinn commented 2 years ago

Nightly failure seems to be pre-existing.

rofinn commented 2 years ago

Hmm, it seems like this is only faster in some cases so far. I wonder if the performance of this function is somehow related to the relative sizes of the dimensions?

rofinn commented 2 years ago

Closing in favour of #126 which reduces the allocations by another order of magnitude and is faster in all cases I tried.