Open schmidtjonathan opened 3 years ago
@JonathanWenger , @mmahsereci , @marvinpfoertner, @pnkraemer : FYI (as discussed)
Btw in case the info matters: lines such as A = A @ B @ A.T
and A = A @ B @ A.T + C
make up roughly a third of Kalman filters/smoothers and probabilistic ODE solvers, so resolving the issue is the only way of being able to use linear operators in diffeq
and filtsmooth
(and probably randprocs.markov
, which is the foundation for the other two).
Problem
When performing consecutive linear algebra computations on
probnum.linops.LinearOperator
s, in which the result is aLinearOperator
, which is then further processed inLinearOperator
-based arithmetics operations, the following problem arises: In the absence of pre-defined arithmetics forLinearOperator
s, which collapse certain kinds of operations to known types ofLinearOperator
, as for instanceKronecker @ Kronecker = Kronecker
orScaling[.inv()] {*, @, +, -} Scaling[.inv()] = Scaling
, most products/sums ofLinearOperator
objects are handled by fallback operators, calledProductLinearOperator
andSumLinearOperator
. These collect the factors/summands and - instead of evaluating the operations immediately - aggregate them and only evaluate them at the time, aLinearOperator
is applied to a numpy array.While this is desirable in some cases, in others (as for example in long time-series filtering/smoothing problems) it leads to an exponential growth of the Python call stack, holding an enormous amount of
LinearOperator
objects in memory.Example
This computation over 20 steps does not finish on my laptop within 2 minutes (after which I killed the process). The reason is the accumulation of
LinearOperator
objects inProductLinearOperator
s,SumLinearOperator
s, andTransposedLinearOperator
s, each of which hold long lists of objects.Solution
By implementing known arithmetical identities (as mentioned above, e.g.
Kronecker @ Kronecker = Kronecker
), we can avoid building large accumulations inProductLinearOperator
s,SumLinearOperator
s, andTransposedLinearOperator
s. This also applies to other kinds of fallback operators.In particular, in the example above, the product can always be written as a Kronecker structure, since
A @ B @ A.T = Kronecker @ Identity @ Kronecker = Kronecker @ Kronecker = Kronecker
. There is no need to build up aProductLinearOperator
, which makes the computation run within milliseconds.