kaskr / adcomp

AD computation with Template Model Builder (TMB)
Other
176 stars 80 forks source link

Added atomic matrix square root function #370

Closed lawlerem closed 1 year ago

lawlerem commented 1 year ago
bbolker commented 1 year ago

This is nice. It's interesting that there isn't already a Cholesky decomposition (which solves the same problem of a unique matrix square root, but with B triangular rather than symmetric ...)

kaskr commented 1 year ago

@lawlerem This is a nice first try. However, there are some important requirements for an atomic function before it can be considered in TMB:

  1. Cheap gradent principle must be retained (Time(reverse)<const*Time(forward) where const~=4). (NOT OK)
  2. Higher order AD shouldn't explode in memory. (NOT OK)
  3. Derivatives must be branching free. (OK)
  4. Ideally available to any order. (OK)

The fragment

matrix<Type> kronSum = kronecker(Yt,I)+kronecker(I,Y);
matrix<Type> kronSumInv = matinv(kronSum);

is a problem. Its computational complexity is O(n^6). This is too expensive compared with the forward pass O(n^3).

FWIW, there are tricks to efficiently calculate derivatives of analytic matrix functions - see how it's done for expm. I think similar techniques may be applied for sqrtm.

For now, I propose making your function available via TMB::install.contrib().

lawlerem commented 1 year ago

Thanks for the feedback. I'll make it available via TMB::install.contrib(). I will eventually look at expm and try to fix the problems with sqrtm, but might not get around to it any time soon. If anybody else wants to put in the work to fix it before I get around to it then please feel free.

bradbell commented 1 year ago

Not sure if this is a help, but here is a link to an extension from scalars to matrices of the AD theory for square roots: https://coin-or.github.io/CppAD/doc/cholesky_theory.htm

kaskr commented 1 year ago

Thanks @bradbell . I'm aware of this approach and did implement it in TMB some years ago: https://github.com/kaskr/adcomp/compare/master...rev_chol . I seem to recall that requirement '2' in my list above was hard to achieve, so I gave up on the idea.

lawlerem commented 1 year ago

I've added the function along with an example and a test here: https://github.com/lawlerem/adcomp_sqrtm. I don't think I'm allowed to edit the wiki to add it to the User Contributed C++ Code section, but here's the text for that section if you'd like to add it.

Atomic matrix square root. Install using

TMB:::install.contrib("https://github.com/lawlerem/adcomp_sqrtm/archive/master.zip")
kaskr commented 1 year ago

Looks good @lawlerem! I've updated the wiki.