SciML / ExponentialUtilities.jl

Fast and differentiable implementations of matrix exponentials, Krylov exponential matrix-vector multiplications ("expmv"), KIOPS, ExpoKit functions, and more. All your exponential needs in SciML form.
https://docs.sciml.ai/ExponentialUtilities/stable/
Other
93 stars 29 forks source link

`phi!` allocates #126

Open nathanaelbosch opened 1 year ago

nathanaelbosch commented 1 year ago

The docstring of phi! states: https://github.com/SciML/ExponentialUtilities.jl/blob/0cd77385923dc16ee9d781595a7529411f947f1c/src/phi.jl#L115-L119 Yet, it allocates:

using ExponentialUtilities, BenchmarkTools

d, q = 4, 3
L = rand(d, d)
out = ExponentialUtilities.phi(L, q);
caches = (zeros(d), zeros(d, q + 1), zeros(d + q, d + q))
expmethod = ExpMethodHigham2005()

@btime ExponentialUtilities.phi!($out, $L, $q; caches=$caches, expmethod=$expmethod);
# 21.998 μs (32 allocations: 10.00 KiB)

(and figuring out how to choose out and caches already required a bit of digging in the source code as this part is basically undocumented)

In comparison, non-allocating exponential! is very straight-forward:

cache = ExponentialUtilities.alloc_mem(L, expmethod);
@btime ExponentialUtilities.exponential!(copy($L), $expmethod, $cache) 
# 3.214 μs (2 allocations: 288 bytes) 

The reason

Going through profilers and some code, the issue seems to be the combination of this call here https://github.com/SciML/ExponentialUtilities.jl/blob/0cd77385923dc16ee9d781595a7529411f947f1c/src/phi.jl#L80 which then leads to some allocation here https://github.com/SciML/ExponentialUtilities.jl/blob/0cd77385923dc16ee9d781595a7529411f947f1c/src/exp_noalloc.jl#L80

Potential solution

Initialize this cache outside of the call to phi! and pass it as a function argument (e.g. expcache?). I tried this locally and such a change would remove most allocations:

expcache = ExponentialUtilities.alloc_mem(zeros(d + q, d + q), expmethod);
@btime ExponentialUtilities.phi!($out, $L, $q; caches=$caches, expmethod=$expmethod, expcache=$expcache,);
#  20.877 μs (4 allocations: 448 bytes)

(unfortunately it does not affect times a lot)

ChrisRackauckas commented 1 year ago

yeah it should pre-allocate that cache in cache. That's an oversight and should get a test.

nathanaelbosch commented 1 year ago

So what exactly is your suggestion to fix this issue? I'm more than happy to help here to have an (ideally easily accessible) allocation-free phi!.

From my point of view, I'd think that the preallocation for phi! in general could benefit from some improvements, e.g. the differences between caches and cache is not clear at all (the former is basically undocumented and there is no easy syntax to preallocate it right now), and maybe the output should also be more easily preallocateable.