stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

Matrix exponential #32

Closed syclik closed 6 years ago

syclik commented 9 years ago

From @chjackson on June 8, 2014 13:47

Feature request for a matrix exponential function, as raised recently on the stan-users list.

Copied from original issue: stan-dev/stan#680

syclik commented 9 years ago

from Chris Jackson on stan-users, 06/07/2014:

I think it'd be lovely to have matrix exponentials in Stan. I would use this for continuous-time Markov models. I know it's notoriously hard to calculate, but JAGS has a module for it (using Pade approximation) and the "expm" R package has a variety of algorithms, with a C API. Not sure about calculating the derivatives, I suspect it would be difficult if the matrix is not diagonalizable - though in the class of models I'd use it for, it generally would be.

syclik commented 9 years ago

from Michael Betancourt on stan-users, 06/07/2014:

The problem is indeed in the derivatives — unless the matrix is symmetric the derivatives of the matrix exponential are either ill-defined or very hard to compute. The first order derivatives have actually been computed for Riemannian HMC but they are a mess, and likely won’t be incorporated into a matrix_exp function anytime soon.

If you add this as a feature request at https://github.com/stan-dev/stan/issues then it won’t get lost in the mailing list, but no promises on when it might ever be implemented.
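For reference, the directional (Frechet) derivative that makes the general case hard has the integral representation

$$ D e^A[E] = \int_0^1 e^{sA} \, E \, e^{(1-s)A} \, ds, $$

which collapses to a simple product only when A and E commute; for symmetric A an eigendecomposition makes it tractable.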

syclik commented 9 years ago

from Sean O'Riordain on stan-users, 06/07/2014:

In case the developers are not already aware of it... there is a nice paper on calculating matrix exponentials: Moler & Van Loan, "Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later," SIAM Review, Vol. 45, No. 1, pp. 3-49, 2003.

Seán

syclik commented 9 years ago

From @jpritikin on March 7, 2015 13:14

The most recent method is due to Higham,

http://epubs.siam.org/doi/abs/10.1137/090768539

It's already implemented in Eigen/unsupported.

syclik commented 9 years ago

From @cdriveraus on May 11, 2015 16:51

I've been working on an R package for estimating continuous time dynamic models, and it would be great to be able to include an option to estimate using Stan. As far as I can see, the only thing missing is the matrix exponential...

syclik commented 9 years ago

From @wds15 on May 27, 2015 11:16

The matrix exponential would be extremely relevant for analytically solvable ODE systems, where the solution is exp(A*t). Here A is the coefficient matrix of the ODE system at hand. The catch is that the matrix exponential must be evaluated repeatedly. The ODE systems which matter for PK systems fall into the class of "almost non-negative" matrices, i.e. only the diagonal entries may be negative and all off-diagonal entries are non-negative. These matrices apparently have some nice properties with respect to the solution of exp(At), which are documented here:

http://www.mscf.uky.edu/~qye/reports/expENM.pdf

I hope this helps for an implementation in Stan.
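For reference, the connection described above: a linear ODE system is solved exactly by the matrix exponential,

$$ \frac{dx}{dt} = A\,x, \quad x(0) = x_0 \implies x(t) = e^{tA} x_0, $$

so each evaluation of the solution at a new t requires a fresh exp(At).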

maedoc commented 8 years ago

Often it's the action of the exponential on a vector which is desired, e.g. in the case of exponential schemes for ODEs and SDEs, where the stepping scheme has the form exp(tA) v [1]. Higham also has a recent paper on approximating this through a series of matrix-vector multiplications, which may be more friendly to autodiff than other approximations; the manuscript and the author's MATLAB implementation are available online.

[1] evaluating the action directly is especially beneficial where A can be sparse, as is often the case for an ODE system Jacobian, and its exponential will be dense.
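For illustration, a bare truncated-Taylor sketch of the idea (not Higham's expmv, which adds scaling and rigorous truncation control):

#include <Eigen/Dense>

// Computes exp(t*A)*v using only matrix-vector products, so a sparse A
// never has to be densified. Plain truncated Taylor series; diverges
// numerically when ||t*A|| is large unless the step is scaled.
Eigen::VectorXd expm_action(const Eigen::MatrixXd& A, double t,
                            const Eigen::VectorXd& v, int terms = 30) {
  Eigen::VectorXd result = v;
  Eigen::VectorXd term = v;
  for (int k = 1; k <= terms; ++k) {
    term = (t / k) * (A * term);  // k-th term: (t*A)^k v / k!
    result += term;
  }
  return result;
}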

bob-carpenter commented 8 years ago

@maedoc Thanks for the ref. Looks like something we were discussing before on the mailing list. I don't know much about this, but if someone can get this working in the math library, it'll be easy to add to the language.

georgmartius commented 7 years ago

Hi, as mentioned above, one use case is for ODEs where we have exp(t A) and we need the derivative w.r.t. time. This is calculated by d/dt exp(t A) = exp(t A) A. Since the matrix exponential is implemented in Eigen, as stated above, would it not be easy to implement this here? I am new to stan::math, so I would be glad about a pointer on how to achieve this (define a new matexp(t, A) function with its gradient w.r.t. t).
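For reference, the derivative quoted above follows term by term from the power series, since A commutes with every power of itself:

$$ \frac{d}{dt} e^{tA} = \frac{d}{dt} \sum_{k=0}^{\infty} \frac{t^k A^k}{k!} = A\,e^{tA} = e^{tA} A. $$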

bob-carpenter commented 7 years ago

It's an involved process:

https://github.com/stan-dev/stan/wiki/Contributing-New-Functions-to-Stan

Note that the matrix exponential is in Eigen-unsupported, not in the supported part of Eigen. We currently use the Eigen-unsupported FFT, so that's not a dealbreaker, just a warning that it will require more testing.


billgillespie commented 7 years ago

Charles Margossian and I (but mainly Charles) are already working on adapting the Eigen/unsupported code. Since the gradient w.r.t. the matrix A is pretty messy in the general case, the plan is to revise the code to autodiff the algorithm.

Bill

bob-carpenter commented 7 years ago

Neat. What do you need to do to revise the algorithm for autodiff? Someone was offering to pay someone a year or so ago to do this as part of a grant, but that offer has probably expired.

Several people have pointed us to this paper:

Awad H. Al-Mohy and Nicholas J. Higham, "A New Scaling and Squaring Algorithm for the Matrix Exponential," SIAM J. Matrix Anal. Appl., 31(3):970-989, 2009.

I'm not sure if the Eigen implementation is related.


billgillespie commented 7 years ago

The Eigen implementation cites Nicholas J. Higham, "The scaling and squaring method for the matrix exponential revisited," SIAM J. Matrix Anal. Applic., 26:1179-1193, 2005, an earlier article by the same author. So the article you mentioned is related to the Eigen version but might reflect some improvements in the algorithm. We'll look into that...

Bill

cdriveraus commented 7 years ago

I posted this in a thread on stan-users a while ago, maybe it helps someone:

I've written a couple of matrix exponential variants within Stan. I guess it's slower than an ideal solution would be, but it still works. Here is the code for the functions: expmt is the Taylor expansion approach (currently set to 50 iterations) and relies on the other posted functions, while expmp is self-contained and uses Pade approximants; it is based on the R function expm.Higham08. The Pade approach is more efficient and stable, but I couldn't think of a lazy way to incorporate the necessary matrix / vector of constants, so I just put them in as data using my wrapper function for continuous time models.

functions {

  real factorial(int m);
  real factorial(int m) {
    if (m == 0) return 1;
    else return (m * factorial(m - 1));
  }

  matrix matrix_pow(matrix a, real n);
  matrix matrix_pow(matrix a, real n) {
    if (n == 0) return diag_matrix(rep_vector(1, rows(a)));
    else return a * matrix_pow(a, n - 1);
  }

  matrix expmt(matrix mat, real t) {
    matrix[rows(mat), rows(mat)] out;
    out <- diag_matrix(rep_vector(1.0, rows(mat)));
    for (i in 1:50) {
      out <- out + matrix_pow(mat, i) * t^i / factorial(i);
    }
    return out;
  }

  matrix expmp(matrix A, matrix padeC, vector padeCbig) {
    int n;
    real nA;
    real colsum;
    int l;
    matrix[4, 10] C;
    vector[4] t;
    matrix[rows(A), rows(A)] I;
    matrix[rows(A), rows(A)] P;
    matrix[rows(A), rows(A)] U;
    matrix[rows(A), rows(A)] V;
    matrix[rows(A), rows(A)] X;
    vector[14] Cbig;
    real s;
    real si;
    matrix[rows(A), rows(A)] B;
    matrix[rows(A), rows(A)] B2;
    matrix[rows(A), rows(A)] B4;
    matrix[rows(A), rows(A)] B6;
    matrix[rows(A), rows(A)] A2;

    si <- 0;
    C <- padeC;
    Cbig <- padeCbig;

    n <- rows(A);
    if (n != cols(A)) print("expmp: Matrix not square!");

    if (n <= 1) X <- exp(A);
    else {
      // nA <- Matrix::norm(A, "1")
      nA <- 0;
      for (coli in 1:n) {
        colsum <- 0;
        for (rowi in 1:n) {
          colsum <- colsum + fabs(A[rowi, coli]);
        }
        if (colsum > nA) nA <- colsum;
      }

      I <- diag_matrix(rep_vector(1, n));
      if (nA <= 2.1) {
        t[1] <- 0.015; t[2] <- 0.25; t[3] <- 0.95; t[4] <- 2.1;

        // l <- which.max(nA <= t)
        l <- 0;
        for (ti in 1:4) {
          if (l == 0) {
            if (nA <= t[ti]) l <- ti;
          }
        }

        A2 <- A * A;
        P <- I;
        U <- C[l, 2] * I;
        V <- C[l, 1] * I;
        for (k in 1:l) {
          P <- P * A2;
          U <- U + C[l, (2 * k) + 2] * P;
          V <- V + C[l, (2 * k) + 1] * P;
        }
        U <- A * U;
        X <- inverse(V - U) * (V + U);
      }
      else {
        s <- log2(nA / 5.4);
        B <- A;
        if (s > 0) {
          s <- ceil(s);
          B <- B / (2^s);
        }
        B2 <- B * B;
        B4 <- B2 * B2;
        B6 <- B2 * B4;
        U <- B * (B6 * (Cbig[14] * B6 + Cbig[12] * B4 + Cbig[10] * B2)
                  + Cbig[8] * B6 + Cbig[6] * B4 + Cbig[4] * B2 + Cbig[2] * I);
        V <- B6 * (Cbig[13] * B6 + Cbig[11] * B4 + Cbig[9] * B2)
             + Cbig[7] * B6 + Cbig[5] * B4 + Cbig[3] * B2 + Cbig[1] * I;
        X <- inverse(V - U) * (V + U);

        if (s > 0) {
          while (si < s) {
            si <- si + 1;
            X <- X * X;
          }
        }
      }
    }
    return X;
  }

  matrix kron_prod(matrix mata, matrix matb) {
    int m;
    int p;
    int n;
    int q;
    matrix[rows(mata) * rows(matb), cols(mata) * cols(matb)] C;
    m <- rows(mata);
    p <- rows(matb);
    n <- cols(mata);
    q <- cols(matb);
    for (i in 1:m) {
      for (j in 1:n) {
        for (k in 1:p) {
          for (l in 1:q) {
            C[p * (i - 1) + k, q * (j - 1) + l] <- mata[i, j] * matb[k, l];
          }
        }
      }
    }
    return C;
  }

}

data {
  matrix[4, 10] padeC;
  vector[14] padeCbig;
}

and this goes in the stan data object:

padeC = rbind(
  c(120, 60, 12, 1, 0, 0, 0, 0, 0, 0),
  c(30240, 15120, 3360, 420, 30, 1, 0, 0, 0, 0),
  c(17297280, 8648640, 1995840, 277200, 25200, 1512, 56, 1, 0, 0),
  c(17643225600, 8821612800, 2075673600, 302702400, 30270240, 2162160,
    110880, 3960, 90, 1))
padeCbig = c(64764752532480000, 32382376266240000, 7771770303897600,
  1187353796428800, 129060195264000, 10559470521600, 670442572800,
  33522128640, 1323241920, 40840800, 960960, 16380, 182, 1)

charlesm93 commented 7 years ago

@cdriveraus: Thank you for sharing your code! @bob-carpenter: should we move this thread to the dev-list?

Eigen-unsupported doesn't compute the action of the matrix exponential, so it looks like we'll need to supplement it with, or use, another algorithm. Possibly the MATLAB code @maedoc referenced, or Expokit.

To revise the Eigen-unsupported algorithm for AD, my first attempt is to template as much of the code as possible. I'm trying to understand how to do this for functions in <cmath>. For one, I cannot find the source code on my local machine. A bit of research suggests this is because the function is implemented in the hardware... which is a new concept to me.

Looking at how Stan handles sin(), I'm guessing I need to write the gradients of <cmath> functions "by hand". The function causing me trouble is frexp(), and since one of its outputs is an integer, the gradient is not well-defined for that output. I'm trying to figure out whether we need that gradient at all, or ways to skip this step gradient-wise, but in any case, it seems frexp() needs to be overloaded for AD variables.
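For illustration, a minimal sketch of how a reverse-mode overload of frexp() could look, treating the integer exponent output as locally constant. This is an assumption-laden sketch, not the code from the branch discussed below:

#include <cmath>
#include <stan/math.hpp>

namespace stan {
namespace math {

// Hypothetical sketch: frexp splits x into m * 2^e with m in [0.5, 1).
// Treating the integer exponent e as locally constant, dm/dx = 2^{-e},
// which we encode via a linearization around the current value.
inline var frexp(const var& x, int* exponent) {
  double m = std::frexp(x.val(), exponent);
  double dm_dx = std::ldexp(1.0, -*exponent);  // 2^{-e}
  return m + dm_dx * (x - x.val());  // value is m; adjoint propagates dm_dx
}

}  // namespace math
}  // namespace stan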

bob-carpenter commented 7 years ago

On Aug 11, 2016, at 10:30 PM, Charles Margossian notifications@github.com wrote:

@cdriveraus: Thank you for sharing your code! @bob-carpenter: should we move this thread to the dev-list?

We're trying to have general discussions on the mailing lists, summarize designs on the Wiki, and then only move to issues when things are concrete enough to implement and then restrict discussions to implementations.

Eigen-unsupported doesn't compute the action of the Matrix Exponential, so it looks like we'll need to supplement it with/ use another algorithm. Possibly the matlab code @maedoc referenced, or Expokit.

To revise the Eigen-unsupported algorithm for AD, my first attempt is to template as much of the code as possible. I'm trying to understand how to do this for functions in <cmath>. For one, I cannot find the source code on my local machine. A bit of research suggests this is because the function is implemented in the hardware... which is a new concept to me.

All of the cmath functions are coded with full templating in Stan with gradients directly. They're in stan-dev/math in directory stan/math/rev/scal/fun (with forward mode autodiff in path replacing "rev" with "fwd").

Looking at how Stan handles sin(), I'm guessing I need to write the gradients of <cmath> functions "by hand".

Already done. See above.

The function causing me trouble is frexp(), and since one of its outputs is an integer, the gradient is not defined.

That's why it's not in stan/math! There's no gradient and it breaks continuous differentiability in the general case. But if it's used in an algorithm, it's possible the entire algorithm's result is differentiable.

I'm trying to figure out whether we need that gradient at all, or ways to skip this step gradient-wise, but in any case, frexp() needs to be overloaded for AD variables.

That's easy to do---look at how the step functions like round() are implemented (in retrospect, we probably shouldn't have included those, either, because they're dangerous --- there's a big warning on all these non-continuously-differentiable functions in the manual).

charlesm93 commented 7 years ago

@bob-carpenter: Thank you for the guidance.

After writing frexp() for rev and fwd variables, I got a working prototype. I tried it on a few matrices and it returned the right output. I tested derivatives with diagonal matrices (I picked an easy case so that I could check the calculations by hand, but obviously more difficult tests will have to be passed) -- at least we know frexp doesn't prevent AD!

I'll write unit tests and upload the code. The end result should be a function which, given a matrix A, returns exp(A). The function will call <unsupported/Eigen/MatrixFunctions> and a templated version of computeUV (the original function is defined in MatrixFunctions). In addition, we'll need to add the rev and fwd versions of frexp (in rev, fwd, and prim) to stan/math.

Once this is done, I'll focus on the action of the matrix exponential on a vector.

bob-carpenter commented 7 years ago

That sounds great. I've been told matrix exponentials are very numerically sensitive, so any help in the doc on where it'll work and where it won't will be appreciated. We'd rather throw an exception than return the wrong answer.

charlesm93 commented 7 years ago

There are examples in Moler & Van Loan, 2003 ("Nineteen Dubious Ways to Compute the Exponential of a Matrix, ...") designed to make non-robust algorithms fail.

The bigger concern is testing derivatives. I could compare with what other programs return (MATLAB seems the place to go).

I think I should be able to work out some cases by hand. My idea is to write down a diagonal matrix and then change the basis to make it non-diagonal. That way I can deal with a simple form I can differentiate by hand, while Stan is tasked with a more challenging, general form. OK, it sounds good in theory, but we'll see how the implementation goes.

I'll also compare the derivatives from @cdriveraus 's code to those I obtain. Lastly, I'll try using matrix exponentials to solve the simple harmonic oscillator DE.

bob-carpenter commented 7 years ago

We typically evaluate derivatives by comparing to finite differences. External validation is a bonus.

If you can include some of the robustness tests from that paper, it'd be great. We can edit out the ones that don't pass and make them future to-do items.
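To make the finite-difference check concrete, here is a minimal, hypothetical sketch for a scalar function; Stan's actual test utilities generalize this entry by entry for matrix arguments:

#include <cmath>
#include <functional>
#include <iostream>
#include <stan/math.hpp>

// Central finite difference: (f(x+h) - f(x-h)) / (2h).
double fd_grad(const std::function<double(double)>& f, double x,
               double h = 1e-6) {
  return (f(x + h) - f(x - h)) / (2 * h);
}

int main() {
  stan::math::var x = 1.3;
  stan::math::var y = exp(2 * x);  // reverse-mode autodiff through exp
  y.grad();
  std::cout << "autodiff:    " << x.adj() << "\n";
  std::cout << "finite diff: "
            << fd_grad([](double t) { return std::exp(2 * t); }, 1.3) << "\n";
  return 0;
}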

charlesm93 commented 7 years ago

We typically evaluate derivatives by comparing to finite differences.

Right, makes sense.

I'm having the following issue with computeUV. The function is declared as a private member of the MatrixExponential class in eigen/unsupported. In the prototype I have, I forward declare computeUV for stan::math::var inside the class definition, and define computeUV in main.cpp. Trying to write the function in a header file causes compiling errors:

• variable has incomplete type

• namespace math does not enclose namespace matrixExponential (this occurs when I define computeUV in stan::math)

I see two non-ideal solutions: 1) Hack the code of Eigen/unsupported and overload computeUV there. The function will not be in stan::math, and we would have to edit an "exterior" library; on the other hand, this would require the smallest amount of work. 2) Rewrite a new matrixExponential class, or rewrite the functions from matrixExponential, and include them in stan::math.

Is (1) acceptable? Is there a more elegant approach than (2)?

bob-carpenter commented 7 years ago

On Aug 15, 2016, at 11:01 PM, Charles Margossian notifications@github.com wrote:

...

I'm having the following issue with computeUV. The function is declared as a private member of the MatrixExponential class in eigen/unsupported. In the prototype I have, I forward declare computeUV for stan::math::var inside the class definition, and define computeUV in main.cpp.

You shouldn't edit the Eigen code directly. You need to put the var version inside stan::math in order for argument-dependent lookup (ADL) to work properly. See:

https://en.wikipedia.org/wiki/Argument-dependent_name_lookup

Trying to write the function in a header file causes compiling errors:

• variable has incomplete type

That comes from trying to use a class before it's fully defined. See, e.g.,

http://stackoverflow.com/questions/6349822/incomplete-type-in-class

• namespace math does not enclose namespace matrixExponential (this occurs when I define ComputeUV in stan::math).

Right, should be standalone.

I see two solutions: 1) Hack in the code of eigen/Unsupported and overload computeUV there. The function will not be in stan::math + we would have to edit an "exterior" library. On the other hand, this should be little work.

This won't work because of the ADL issue.

2) Rewrite a new matrixExponential class, or rewrite the functions in matrixExponential, and include them in stan::math. This is what's been done for Eigen::Matrix, right? Specifically, I am looking at the example of block.hpp which has been turned from a class function into a function that returns a matrix.

Don't write a new class; write new functions in the stan::math namespace:

stan/math/prim/mat/fun/matrix_exp.hpp:

MatrixXd matrix_exp(const MatrixXd& x);

stan/math/rev/mat/fun/matrix_exp.hpp:

Matrix<var, Dynamic, Dynamic> matrix_exp(const Matrix<var, Dynamic, Dynamic>& x);

stan/math/fwd/mat/fun/matrix_exp.hpp:

template <typename T>
Matrix<fvar<T>, Dynamic, Dynamic> matrix_exp(const Matrix<fvar<T>, Dynamic, Dynamic>& x);

Just have them include and call whatever they need inside of Eigen.
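For illustration, a minimal sketch of what the prim version might look like, assuming Eigen's unsupported MatrixFunctions module supplies .exp() for double matrices (a sketch, not the code that was eventually merged):

#include <Eigen/Dense>
#include <unsupported/Eigen/MatrixFunctions>

namespace stan {
namespace math {

// Sketch only: delegate the double case to Eigen's Pade-based
// scaling-and-squaring implementation of the matrix exponential.
inline Eigen::MatrixXd matrix_exp(const Eigen::MatrixXd& x) {
  return x.exp();
}

}  // namespace math
}  // namespace stan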

charlesm93 commented 7 years ago

I should have mentioned that the functions I need to overload act on private members of the MatrixExponential class. As far as I know, I cannot access private members without hacking. Writing an alternate version of the class doesn't seem that easy either; a first attempt suggests I may have to rewrite certain files in supported Eigen. But this is still work in progress.

Has there been a precedent for such a situation in Stan?

syclik commented 7 years ago

Early on, we used to patch Boost with some bug fixes, but we moved away from that years ago. It can be done, but in most cases it's a poor design choice or there's a different way to implement it.

Do you have access to push to branches directly? If so, could you put the work on a branch and I can take a look to see if that's really necessary. If not, I can give you permissions.


bob-carpenter commented 7 years ago

I'm still not sure what you mean by "act on private members". Do you mean private member variables (non-static variables declared in the class) or non-static class functions? If it's member variables, are they things you pass into the class to start with or things that are easy to recreate?

Private parts of classes aren't part of the interface and aren't guaranteed to be stable in future releases---that's why we don't like to hack them.

I'm also not sure what you want to overload here. The only functions that will be overloaded in the Stan math API are the versions of matrix_exp() for double, const var&, and a templated version for const fvar&.

It's usually much easier to look at code than to guess.

charlesm93 commented 7 years ago

@syclik: Yes, I have permission to push. GitHub gymnastics are a little new to me, so I had to set up a few things (following the guidelines here: https://github.com/stan-dev/stan/wiki/Dev:-Git-Process). Sorry for the delay in responding!

Let me start by sharing the non-ideal working prototype, where the exp function works for rev AD (this is easily extendable to fwd AD), and where I hacked into Eigen/unsupported.

Created branch: feature/issue-32-matrix-exponential

Changes from dev:

    modified:   lib/eigen_3.2.8/unsupported/Eigen/src/MatrixFunctions/MatrixExponential.h
    modified:   make/libraries
    modified:   makefile
    modified:   stan/math/rev/mat.hpp
    new file:   stan/math/rev/mat/fun/matrix_exp.hpp
    modified:   stan/math/rev/scal.hpp
    new file:   stan/math/rev/scal/fun/frexp.hpp

When I tried to run A.exp(), where A is a matrix of var, there were two functions that did not recognize Stan variables: frexp and computeUV. These are the two functions I overloaded. frexp comes from <cmath>, so overloading it was easy to do without changing Eigen/unsupported. computeUV is a member function of MatrixExponential, so I changed it in the class. Subsequently, exp() worked on matrices of var.

I'm still not sure what you mean by "act on private members". Do you mean private member variables (non-static variables declared in the class) or non-static class functions? If it's member variables, are they things you pass into the class to start with or things that are easy to recreate?

I mean private member variables. They are passed (in an initialization list) into the class, but they are easy to recreate, seeing they correspond to various properties of the input matrix. My plan is to write a function that creates these variables, and then applies various function members of the MatrixExponential class. Which is almost the same as rewriting the class, and should do the job. What I was looking for was a more elegant solution that took advantage of the MatrixExponential class.

Here's the C++ code I used to do some quick checks on the prototype I posted. In this example, the exp function does not have the intended signature exp(A); rather, it is called as A.exp().

#include <unsupported/Eigen/MatrixFunctions>
#include <stan/math.hpp>

int main() {

    // Look at the simple case of a 1 x 1 matrix
    stan::math::var mu = 2.0, lp;
    Eigen::Matrix<stan::math::var, 1, 1> mu_matrix;
    Eigen::Matrix<stan::math::var, 1, 1> mu_exp;

    mu_matrix << 2*mu;
    mu_exp = mu_matrix.exp();

    lp = mu_exp(0);
    std::cout << "mu = " << mu.val() << "\n";
    std::cout << "exp(2*mu) = " << lp.val() << "\n";
    lp.grad();
    std::cout << "d.f / d.mu = " << mu.adj() << "\n\n";

    // Expected result: d.f / d.mu = 2*exp(4) ~ 110

    std::cout << "-------------------------------------------- \n\n";

    // Look at the non-diagonal case (example from Moler & Van Loan, 2003)
    Eigen::Matrix<stan::math::var, 2, 2> E, E_diag, E_b, E_b_1;
    stan::math::var E11=-1, E22=-17, lp11;

    E_diag << E11, 0,
              0, E22;

    E_b << 1, 3, 
           2, 4;

    E_b_1 << -2, 1.5,
             1, -.5;

    E = E_b*E_diag*E_b_1;

    std::cout << "The Matrix E is:\n" << E << "\n\n"
              << "The matrix exponential of E is:\n" << E.exp() << "\n\n";

    Eigen::Matrix<stan::math::var, 2, 2> expE =  E.exp();
    lp11 = expE(1,1);
    lp11.grad();
    std::cout << "d.exp(E)[2,2]/dx = " << E11.adj() << "\n\n";
    // Expected result (worked out by hand) for d.exp(E)/dx:
    //
    //      d.exp(E)/dx = -.736   .552
    //                    -1.472  1.104
    //
    // Note: equals exp(-1) * dE/dx here.

    return 0;

}
bgoodri commented 7 years ago

Giles (section 2.3.5) asserts that the fwd and rev mode autodiff of the matrix exponential (via scaling and squaring) can be constructed from what he already demonstrated

http://eprints.maths.ox.ac.uk/1079/1/NA-08-01.pdf

Actually doing so is left as an exercise for the reader.
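For reference, Giles's reverse-mode rule for a matrix product C = AB is \bar{A} = \bar{C} B^T and \bar{B} = A^T \bar{C}; applied to each squaring step X_{k+1} = X_k^2 of scaling and squaring, it gives

$$ \bar{X}_k = \bar{X}_{k+1} X_k^T + X_k^T \bar{X}_{k+1}, $$

which is the piece needed to backpropagate through the squaring phase.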

charlesm93 commented 7 years ago

@bgoodri Thank you, I'll take a look into it!

I actually just finished implementing a working prototype with Eigen 3.2.9, forking code from 3.3-beta2, and I will post it on GitHub after some polishing and after finishing the unit tests. Subject to revision after reading Giles.

bgoodri commented 7 years ago

Theano also has an implementation of the derivatives of the matrix exponential https://github.com/Theano/Theano/blob/master/theano/tensor/slinalg.py#L458

bgoodri commented 7 years ago

I think Theano's method is better explained (among others) in http://dsp.vscht.cz/konference_matlab/MATLAB08/prispevky/017_brancik.pdf
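For reference, a standard route to such derivatives (Van Loan, 1978) is the block-matrix identity

$$ \exp\begin{pmatrix} A & E \\ 0 & A \end{pmatrix} = \begin{pmatrix} e^A & L(A, E) \\ 0 & e^A \end{pmatrix}, $$

where L(A, E) is the Frechet derivative of the matrix exponential at A in direction E, so a single exponential of a doubled matrix yields both e^A and a directional derivative.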

bgoodri commented 7 years ago

Can you include a specialization for the expm of a 2x2 matrix?

charlesm93 commented 7 years ago

@bgoodri Thanks for the articles. Let me try the scaling and squaring method (2.3.5 in Giles and 7 in Brancik).

What would the specialization of a 2x2 matrix entail? The 2x2 case seems simple enough to be efficiently handled by the general case.

bgoodri commented 7 years ago

The end of http://mathworld.wolfram.com/MatrixExponential.html has a general formula for the 2x2 case involving trig functions. It seems like that would be faster than going through the scaling and squaring.
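For reference, the closed form in question: for A = [[a, b], [c, d]],

$$ e^A = e^{(a+d)/2} \begin{pmatrix} \cosh(\Delta/2) + \frac{a-d}{\Delta}\sinh(\Delta/2) & \frac{2b}{\Delta}\sinh(\Delta/2) \\ \frac{2c}{\Delta}\sinh(\Delta/2) & \cosh(\Delta/2) - \frac{a-d}{\Delta}\sinh(\Delta/2) \end{pmatrix}, \qquad \Delta = \sqrt{(a-d)^2 + 4bc}. $$

When (a - d)^2 + 4bc < 0, Delta is imaginary and the hyperbolic functions become trigonometric ones; that caveat comes up again later in this thread.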

charlesm93 commented 7 years ago

Nice!

There are a few matrix properties we could take advantage of. matrix_exp could check for these properties and then pick an algorithm. For example, a symmetric matrix can have its matrix exponential constructed using eigen decomposition.

I think we could make matrix_exp pay attention to the following properties:

• 2x2

• symmetric

• nilpotent

• more I am not thinking of right now

(diagonal, 1x1, and zero seem trivial enough to be handled by the user).

Should these features be included in matrix_exp, or should the user call a separate function, such as matrix_exp_2x2? I prefer the former approach (fewer functions, seems more elegant, and I'm not sure how helpful the added flexibility of multiple functions would be). I do wonder how much checking for properties would slow down the code.

bgoodri commented 7 years ago

I would say the user should call expm() and expm() should branch whenever it makes sense. Even if we had expm_symmetric() and whatnot exposed, Stan conventions would dictate that we check symmetry inside expm_symmetric(), so that wouldn't save any time relative to checking for symmetry within expm(). Of course, if someone knows that they have a symmetric matrix, they can do the eigendecomposition approach themselves.


bob-carpenter commented 7 years ago

On Sep 1, 2016, at 9:04 PM, bgoodri notifications@github.com wrote:

I would say the user should call expm() and expm() should branch whenever it makes sense. Even if we had expm_symmetric() and whatnot exposed, Stan conventions would dictate that we check symmetry inside expm_symmetric(), so that wouldn't save any time relative to checking for symmetry within expm().

That's a very good point. Checking for symmetry would slow down the usual matrix exponential calc, but probably not by much, as it's only a quadratic operation.

2 x 2 is a no-brainer as that's a simple constant-time check for size.

Of course, if someone knows that they have a symmetric matrix, they can do the eigendecomposition approach themselves.

Only if they're as good as Ben at linear algebra.

charlesm93 commented 7 years ago

The end of http://mathworld.wolfram.com/MatrixExponential.html has a general formula for the 2x2 case involving trig functions. It seems like that would be faster than going through the scaling and squaring.

@bgoodri The algorithm from Wolfram doesn't work for all 2x2 matrices. One other condition needs to be met, because of the delta term.

For A(2, 2) << a, b, c, d, we need (a - d)^2 + 4*b*c > 0.

I adjusted the matrix_exp function to use the 2x2 algorithm if the matrix has the right dimensions AND the above condition is met.
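A hypothetical sketch of that dispatch logic (matrix_exp_2x2 and matrix_exp_pade are assumed helper names, for illustration only):

#include <Eigen/Dense>

template <typename T>
using MatrixT = Eigen::Matrix<T, Eigen::Dynamic, Eigen::Dynamic>;

template <typename T>
MatrixT<T> matrix_exp_2x2(const MatrixT<T>& A);   // closed form (assumed helper)
template <typename T>
MatrixT<T> matrix_exp_pade(const MatrixT<T>& A);  // general case (assumed helper)

template <typename T>
MatrixT<T> matrix_exp(const MatrixT<T>& A) {
  if (A.rows() == 2 && A.cols() == 2) {
    // delta^2 from the 2x2 closed form; it must be positive for the
    // real cosh/sinh formula to apply.
    T delta2 = (A(0, 0) - A(1, 1)) * (A(0, 0) - A(1, 1))
               + 4 * A(0, 1) * A(1, 0);
    if (delta2 > 0)
      return matrix_exp_2x2(A);
  }
  return matrix_exp_pade(A);
}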

syclik commented 6 years ago

@charlesm93 implemented the matrix_exp function! Closing this issue.

yizhang-yiz commented 6 years ago

@bob-carpenter mentioned Al-Mohy's paper. Is it on someone's todo list? If not I'm gonna assign it to myself.

yizhang-yiz commented 6 years ago

Hi @charlesm93 @syclik, the matrix_exp function passes matrix A by value, not by reference. Is this intended?

syclik commented 6 years ago

Can you link the line to the file?

syclik commented 6 years ago

Wow, that was cryptic. Can you either send the file name plus line number, or find it directly on GitHub, click the line, and send that link?

yizhang-yiz commented 6 years ago

Sorry. I was referring to this: https://github.com/stan-dev/math/blob/ad01b2cc20f1158eac34d875f7253a42a3fab940/stan/math/prim/mat/fun/matrix_exp.hpp#L21

syclik commented 6 years ago

Thanks. No, I don't think that was intentional. Mind creating a new issue for it?


bob-carpenter commented 6 years ago

I think it's a mistake, as the matrix is only ever passed as const reference.

I recall seeing a pass-by-value that was intentional for a copy, but I think we should avoid doing that and I don't think that's what's going on here.

charlesm93 commented 6 years ago

Not passing by reference looks like a mistake. I can create an issue and a fix.

charlesm93 commented 6 years ago

Issue #769 and pull request #770. @yizhang-cae There are several ways to optimize the current implementation of the matrix exponential. I can discuss them here or create a new issue (on optimizing the matrix exp function -- or is that not specific enough?).

yizhang-yiz commented 6 years ago

Let's discuss in the new issue thread. Thanks.
