willow-ahrens / Finch.jl

Sparse tensors in Julia and more! Data-structure-driven array programming language.
http://willowahrens.io/Finch.jl/
MIT License
159 stars, 15 forks

Wma/permute dims perf #594

Closed willow-ahrens closed 3 months ago

willow-ahrens commented 3 months ago

Fixes #584. Fixes #525. Improves the performance of permutedims by (hopefully) choosing the right intermediate.

willow-ahrens commented 3 months ago

performance before:

                                ID            time GC time          memory allocations
  –––––––––––––––––––––––––––––––– ––––––––––––––– ––––––– ––––––––––––––– –––––––––––
         ["einsum_spmv_baremetal"]   4.333 ns (5%)                                    
     ["einsum_spmv_call_overhead"]  18.958 μs (5%)          17.16 KiB (1%)         321
  ["einsum_spmv_compile_overhead"] 245.632 ms (5%)         102.51 MiB (1%)     2417295
   ["permutedims(Dense(Dense()))"]   20.908 s (5%) 1.052 s  21.96 GiB (1%)       10854
  ["permutedims(Dense(Sparse()))"] 207.333 ms (5%)         380.13 MiB (1%)       21747

performance after:

                                ID            time GC time          memory allocations
  –––––––––––––––––––––––––––––––– ––––––––––––––– ––––––– ––––––––––––––– –––––––––––
         ["einsum_spmv_baremetal"]   4.333 ns (5%)                                    
     ["einsum_spmv_call_overhead"]  21.583 μs (5%)          17.56 KiB (1%)         329
  ["einsum_spmv_compile_overhead"] 246.651 ms (5%)         102.90 MiB (1%)     2429762
   ["permutedims(Dense(Dense()))"] 272.023 ms (5%)         762.94 MiB (1%)         115
  ["permutedims(Dense(Sparse()))"] 119.272 ms (5%)         191.27 MiB (1%)         255
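
To make the two tables easier to compare, here is a minimal sketch that computes before/after ratios for the two permutedims benchmarks. The numbers are copied from the tables above (times converted to seconds, memory to MiB). The names `before`, `after`, and `ratio` are illustrative and not part of Finch or the benchmark suite.

```julia
# Timings and allocations copied from the tables above.
# time is in seconds, memory is in MiB, allocs is a raw count.
before = Dict(
    "permutedims(Dense(Dense()))"  => (time = 20.908,     memory = 21.96 * 1024, allocs = 10854),
    "permutedims(Dense(Sparse()))" => (time = 207.333e-3, memory = 380.13,       allocs = 21747),
)
after = Dict(
    "permutedims(Dense(Dense()))"  => (time = 272.023e-3, memory = 762.94, allocs = 115),
    "permutedims(Dense(Sparse()))" => (time = 119.272e-3, memory = 191.27, allocs = 255),
)

# Improvement factor, rounded to one decimal place (hypothetical helper).
ratio(b, a) = round(b / a; digits = 1)

for id in sort(collect(keys(before)))
    b, a = before[id], after[id]
    println(id, ": ",
            ratio(b.time, a.time), "x faster, ",
            ratio(b.memory, a.memory), "x less memory, ",
            ratio(b.allocs, a.allocs), "x fewer allocations")
end
```

By these numbers, the Dense(Dense()) case is roughly 77x faster and the Dense(Sparse()) case roughly 1.7x faster. The einsum benchmarks are essentially unchanged apart from a small increase in call overhead (18.958 μs to 21.583 μs).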