patr-schm / TinyAD

Automatic Differentiation in Geometry Processing Made Simple

Mixed second derivatives? #13

Open alecjacobson opened 3 days ago

alecjacobson commented 3 days ago

Thanks, again, for this great tool.

I'm interested in computing purely mixed second derivatives. For example, if I have a function:

f(x₁,x₂,y₁,y₂,y₃)

then I'm interested in the rectangular matrix of:

∂²f/∂xᵢ∂yⱼ

This kind of thing can arise naturally in some physics problems. For example, if f is an elastic potential, x are deformed positions, and y are material parameters, then F = ∂f/∂x is the force vector and ∂F/∂y describes how the force is affected by the material parameters.
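
Spelled out, that sensitivity is exactly the rectangular block above:

$$F(x, y) = \frac{\partial f}{\partial x} \in \mathbb{R}^{|x|}, \qquad \frac{\partial F}{\partial y} \in \mathbb{R}^{|x| \times |y|}, \qquad \left(\frac{\partial F}{\partial y}\right)_{ij} = \frac{\partial^2 f}{\partial x_i\,\partial y_j}.$$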

Currently, I can get this by using with_hessian=true, computing the entire (|x|+|y|)² Hessian, and then extracting the block I care about. I'm wondering:

  1. Is there currently a better way to get the mixed terms?
  2. If not, do you have recommendations on how best to change the library to support this?
alecjacobson commented 3 days ago

It seems like I can also get at this by nesting the types.

#include <TinyAD/ScalarFunction.hh>
#include <igl/matlab_format.h>
#include <iostream>

auto f(
  const auto & x1, const auto & x2,
  const auto & y1, const auto & y2, const auto & y3)
{
  const auto r1 = x1-y1;
  const auto r2 = x2-y2;
  const auto d = r1*r1 + r2*r2;
  return y3*d;
}

int main()
{
  double x1 = 1;
  double x2 = 2;
  double y1 = 3;
  double y2 = 4;
  double y3 = 5;

  // Approach 1: a single 5-variable scalar type with a full 5×5 Hessian;
  // the 2×3 mixed block is extracted from it afterwards.
  {
    using TAD = TinyAD::Scalar<5,double,true>;
    Eigen::Matrix<TAD,5,1> x_ad = TAD::make_active({x1, x2, y1, y2, y3});
    auto f_ad = f(x_ad[0], x_ad[1], x_ad[2], x_ad[3], x_ad[4]);
    auto H = f_ad.Hess;
    auto d2fdxdy = H.block<2,3>(0,2);
    std::cout<<igl::matlab_format(H,"H")<<std::endl;
    std::cout<<igl::matlab_format(d2fdxdy,"d2fdxdy")<<std::endl;
  }

  // Approach 2: nested types. The outer type differentiates w.r.t. (x1, x2),
  // and each entry of its gradient is an inner scalar carrying d/d(y1, y2, y3),
  // so dfdx[i].grad[j] is exactly ∂²f/∂xᵢ∂yⱼ.
  {
    using inner = TinyAD::Scalar<3,double,false>;
    using outer = TinyAD::Scalar<2,inner,false>;
    Eigen::Matrix<inner,3,1> x_inner = inner::make_active({y1, y2, y3});
    Eigen::Matrix<outer,2,1> x_outer = outer::make_active({x1, x2});
    auto f_ad = f(x_outer[0], x_outer[1], x_inner[0], x_inner[1], x_inner[2]);
    auto dfdx = f_ad.grad;
    Eigen::Matrix<double,2,3> d2fdxdy;
    for(int i = 0;i<2;i++)
    {
      for(int j = 0;j<3;j++)
      {
        d2fdxdy(i,j) = dfdx[i].grad[j];
      }
    }
    std::cout<<igl::matlab_format(d2fdxdy,"d2fdxdy")<<std::endl;
  }

}
alecjacobson commented 3 days ago

Mostly for my own memory: I did some quick timings and the nested approach seems to be consistently faster (2×–8×; it computes roughly 4× fewer second derivatives, after all) for moderate sizes (≈10).
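
For anyone who wants to reproduce that kind of comparison, here is a rough micro-benchmark sketch (not the code behind the numbers above), written as a separate little program that reuses f and the TinyAD include from the earlier snippet; time_ms is just a local helper, and the tiny perturbation of x1 is only there to keep the compiler from folding the work away:

#include <chrono>
#include <iostream>

// Assumes f (and the TinyAD include) from the earlier snippet.
template <typename Body>
double time_ms(int iters, Body&& body)
{
  const auto t0 = std::chrono::steady_clock::now();
  for(int k = 0; k < iters; ++k)
    body();
  const auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main()
{
  double x1 = 1, x2 = 2, y1 = 3, y2 = 4, y3 = 5;
  double sink = 0;          // accumulate one entry so the work isn't optimized away
  const int iters = 1000000;

  // Full 5x5 Hessian, then read one mixed entry.
  const double t_full = time_ms(iters, [&]()
  {
    using TAD = TinyAD::Scalar<5,double,true>;
    x1 += 1e-12;
    Eigen::Matrix<TAD,5,1> x_ad = TAD::make_active({x1, x2, y1, y2, y3});
    auto f_ad = f(x_ad[0], x_ad[1], x_ad[2], x_ad[3], x_ad[4]);
    sink += f_ad.Hess(0,2);
  });

  // Nested types, then read one mixed entry.
  const double t_nested = time_ms(iters, [&]()
  {
    using inner = TinyAD::Scalar<3,double,false>;
    using outer = TinyAD::Scalar<2,inner,false>;
    x1 += 1e-12;
    Eigen::Matrix<inner,3,1> x_inner = inner::make_active({y1, y2, y3});
    Eigen::Matrix<outer,2,1> x_outer = outer::make_active({x1, x2});
    auto f_ad = f(x_outer[0], x_outer[1], x_inner[0], x_inner[1], x_inner[2]);
    sink += f_ad.grad[0].grad[0];
  });

  std::cout << "full: " << t_full << " ms, nested: " << t_nested
            << " ms (sink=" << sink << ")" << std::endl;
}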

alecjacobson commented 3 days ago

Hmm. Nesting hits some compilation issues with binary operations. For example, if I change the above to:

  return y3*(d-1.0);

then I get an error

 invalid operands to binary expression ('const TinyAD::Scalar<2, TinyAD::Scalar<3, double, false>, false>' and 'double')
  return y3*(d-1.0);

I can work around this by changing the code to cast the literal:

  return y3*(d-decltype(d)(1.0));

but I don't love it.
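
A tiny helper hides the cast a bit (hypothetical, not part of TinyAD; it only relies on the same construct-from-a-plain-value behavior that the decltype workaround above already uses):

// Hypothetical helper: make a constant of the same (possibly nested) scalar
// type as an existing variable, so binary ops with plain literals compile.
template <typename ScalarT>
ScalarT constant_like(const ScalarT&, double value)
{
  return ScalarT(value);
}

// ...
  return y3*(d - constant_like(d, 1.0));

It also degrades gracefully when f is called with plain doubles, since ScalarT then deduces to double.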

patr-schm commented 1 day ago

Hi Alec!

I had a look at this and it turned out to be relatively straightforward to restrict the Hessian computations to any sub-block.

The new insight for me was that the Hessian expressions only involve matrix-matrix addition/subtraction, scalar-matrix multiplication/division and vector-vector outer products. No matrix-vector or matrix-matrix multiplication! So nothing breaks when we just store a smaller (rectangular) Hessian in each TinyAD::Scalar.
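
To make that concrete with one case: for a product h = a·b, the forward-mode updates are just the product rule (plain calculus, nothing TinyAD-specific)

$$\nabla h = b\,\nabla a + a\,\nabla b, \qquad \nabla^2 h = b\,\nabla^2 a + a\,\nabla^2 b + \nabla a\,\nabla b^{\top} + \nabla b\,\nabla a^{\top},$$

and the (I, J) block of ∇²h only needs the (I, J) blocks of ∇²a and ∇²b plus the I- and J-subvectors of the two gradients, so a rectangular block can be propagated on its own.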

In the branch hessian-block you can now write:

using TAD = TinyAD::Scalar<5, double, true, 0, 2, 2, 3>;
Eigen::Matrix<TAD, 5, 1> x_ad = TAD::make_active({ x1, x2, y1, y2, y3 });
auto f_ad = f(x_ad[0], x_ad[1], x_ad[2], x_ad[3], x_ad[4]);

f_ad.Hess will now only be the Hessian block that starts at entry (0, 2) and has size (2, 3). (f_ad.grad is still the full 5D gradient.)
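
So the extraction step from the first snippet above goes away; reading out the mixed block should be just this (untested sketch, assuming Hess is now a plain 2×3 double matrix):

// Sketch: f_ad.Hess is already the 2x3 mixed block, no .block<2,3>(0,2) needed.
Eigen::Matrix<double,2,3> d2fdxdy = f_ad.Hess;
std::cout << igl::matlab_format(d2fdxdy, "d2fdxdy") << std::endl;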

I haven't had the chance to do any performance testing yet. Let me know if this is what you were looking for and whether it beats the performance/convenience of the nested-types solution.

alecjacobson commented 3 hours ago

Nice! I like that this matches the Eigen Block format.