willow-ahrens closed 4 months ago
Performance before:

ID                               time            GC time memory          allocations
–––––––––––––––––––––––––––––––– ––––––––––––––– ––––––– ––––––––––––––– –––––––––––
["einsum_spmv_baremetal"]        4.333 ns (5%)
["einsum_spmv_call_overhead"]    18.958 μs (5%)          17.16 KiB (1%)  321
["einsum_spmv_compile_overhead"] 245.632 ms (5%)         102.51 MiB (1%) 2417295
["permutedims(Dense(Dense()))"]  20.908 s (5%)   1.052 s 21.96 GiB (1%)  10854
["permutedims(Dense(Sparse()))"] 207.333 ms (5%)         380.13 MiB (1%) 21747

Performance after:

ID                               time            GC time memory          allocations
–––––––––––––––––––––––––––––––– ––––––––––––––– ––––––– ––––––––––––––– –––––––––––
["einsum_spmv_baremetal"]        4.333 ns (5%)
["einsum_spmv_call_overhead"]    21.583 μs (5%)          17.56 KiB (1%)  329
["einsum_spmv_compile_overhead"] 246.651 ms (5%)         102.90 MiB (1%) 2429762
["permutedims(Dense(Dense()))"]  272.023 ms (5%)         762.94 MiB (1%) 115
["permutedims(Dense(Sparse()))"] 119.272 ms (5%)         191.27 MiB (1%) 255
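
Fixes #584. Fixes #525.

Improves the performance of permutedims by (hopefully) getting the intermediate right. In the benchmarks above, permutedims(Dense(Dense())) drops from 20.908 s to 272.023 ms (21.96 GiB to 762.94 MiB allocated), and permutedims(Dense(Sparse())) drops from 207.333 ms to 119.272 ms (380.13 MiB to 191.27 MiB). The einsum_spmv_baremetal and einsum_spmv_compile_overhead timings are essentially unchanged; einsum_spmv_call_overhead rises slightly, from 18.958 μs to 21.583 μs.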
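
For anyone who wants the before/after ratios rather than the raw numbers, here is a small sketch that computes them. The times are copied by hand from the two tables above and converted to seconds. The `speedups` helper is a hypothetical name used only for this illustration and is not part of the package or its benchmark suite.

```python
# Times copied by hand from the "before" and "after" tables, in seconds.
# einsum_spmv_baremetal is omitted because it is identical in both runs.
before = {
    "einsum_spmv_call_overhead": 18.958e-6,
    "einsum_spmv_compile_overhead": 245.632e-3,
    "permutedims(Dense(Dense()))": 20.908,
    "permutedims(Dense(Sparse()))": 207.333e-3,
}
after = {
    "einsum_spmv_call_overhead": 21.583e-6,
    "einsum_spmv_compile_overhead": 246.651e-3,
    "permutedims(Dense(Dense()))": 272.023e-3,
    "permutedims(Dense(Sparse()))": 119.272e-3,
}


def speedups(before, after):
    """Return before/after time ratios; values above 1 mean the PR is faster."""
    return {name: before[name] / after[name] for name in before}


if __name__ == "__main__":
    for name, ratio in speedups(before, after).items():
        print(f"{name}: {ratio:.2f}x")
```

A ratio below 1 means the benchmark got slower. Each timing carries the 5% tolerance shown in the tables, so ratios close to 1 should be read as noise rather than as a real change.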