alecjacobson opened this issue 3 days ago
It seems like I can also get at this by nesting the types:
#include <TinyAD/ScalarFunction.hh>
#include <igl/matlab_format.h>
#include <iostream>

// Generic in its scalar types so it can be called with doubles,
// TinyAD scalars, or nested TinyAD scalars.
auto f(
  const auto & x1, const auto & x2,
  const auto & y1, const auto & y2, const auto & y3)
{
  const auto r1 = x1 - y1;
  const auto r2 = x2 - y2;
  const auto d = r1*r1 + r2*r2;
  return y3*d;
}
int main()
{
  double x1 = 1;
  double x2 = 2;
  double y1 = 3;
  double y2 = 4;
  double y3 = 5;
  {
    // Flat approach: full 5×5 Hessian, then extract the mixed 2×3 block.
    using TAD = TinyAD::Scalar<5,double,true>;
    Eigen::Matrix<TAD,5,1> x_ad = TAD::make_active({x1, x2, y1, y2, y3});
    auto f_ad = f(x_ad[0], x_ad[1], x_ad[2], x_ad[3], x_ad[4]);
    auto H = f_ad.Hess;
    auto d2fdxdy = H.block<2,3>(0,2);
    std::cout<<igl::matlab_format(H,"H")<<std::endl;
    std::cout<<igl::matlab_format(d2fdxdy,"d2fdxdy")<<std::endl;
  }
  {
    // Nested approach: the outer type differentiates w.r.t. x, the inner
    // type w.r.t. y, so dfdx[i].grad[j] is exactly ∂²f/∂xᵢ∂yⱼ.
    using inner = TinyAD::Scalar<3,double,false>;
    using outer = TinyAD::Scalar<2,inner,false>;
    Eigen::Matrix<inner,3,1> x_inner = inner::make_active({y1, y2, y3});
    Eigen::Matrix<outer,2,1> x_outer = outer::make_active({x1, x2});
    auto f_ad = f(x_outer[0], x_outer[1], x_inner[0], x_inner[1], x_inner[2]);
    auto dfdx = f_ad.grad;
    Eigen::Matrix<double,2,3> d2fdxdy;
    for(int i = 0;i<2;i++)
    {
      for(int j = 0;j<3;j++)
      {
        d2fdxdy(i,j) = dfdx[i].grad[j];
      }
    }
    std::cout<<igl::matlab_format(d2fdxdy,"d2fdxdy")<<std::endl;
  }
}
Mostly for my own memory: I did some quick timings and the nested approach seems to be consistently faster (2×–8×; it's computing about 4× less, after all) for moderate sizes (≈10 variables).
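(Back-of-the-envelope for the example above: the flat approach carries a full 5×5 = 25-entry Hessian through every arithmetic operation, while the nested approach carries only the 2×3 = 6 mixed second-derivative entries, and 25/6 ≈ 4.)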
Hmm. Nesting hits some compilation issues with binary operations. For example, if I change the above to:
return y3*(d-1.0);
then I get an error:
invalid operands to binary expression ('const TinyAD::Scalar<2, TinyAD::Scalar<3, double, false>, false>' and 'double')
return y3*(d-1.0);
I can work around this by casting the literal:
return y3*(d-decltype(d)(1.0));
but I don't love it.
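If this pattern comes up a lot, one option is to wrap the cast in a small helper so the literal picks up the scalar type of an existing expression. This is a hypothetical helper of my own, not part of TinyAD; presumably the error arises because the binary operator overloads accept the passive type of the nested scalar (here inner) rather than double:

// Hypothetical helper, not part of TinyAD: construct a constant of the
// same (possibly nested) scalar type as an existing expression.
template <typename T>
T constant_like(const T & /*prototype*/, double v)
{
  return T(v);
}

// ... then inside f():
return y3*(d - constant_like(d, 1.0));

This is just the same cast as above, but it reads a little less noisily at the call site.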
Hi Alec!
I had a look at this and it turned out to be relatively straightforward to restrict the Hessian computations to any sub-block.
The new insight for me was that the Hessian expressions only involve matrix-matrix addition/subtraction, scalar-matrix multiplication/division and vector-vector outer products. No matrix-vector or matrix-matrix multiplication! So nothing breaks when we just store a smaller (rectangular) Hessian in each TinyAD::Scalar.
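To spell that out (standard calculus, written here for reference rather than quoted from the TinyAD source), the product rule for h = u·v gives:

∇h = u ∇v + v ∇u
∇²h = u ∇²v + v ∇²u + ∇u (∇v)ᵀ + ∇v (∇u)ᵀ

Every Hessian term is a scalar times a stored Hessian or an outer product of gradients, so the update restricts cleanly to any fixed block: keep that block of each stored Hessian and form the outer products from the matching gradient segments.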
In the branch hessian-block you can now write:
using TAD = TinyAD::Scalar<5, double, true, 0, 2, 2, 3>;
Eigen::Matrix<TAD, 5, 1> x_ad = TAD::make_active({ x1, x2, y1, y2, y3 });
auto f_ad = f(x_ad[0], x_ad[1], x_ad[2], x_ad[3], x_ad[4]);
f_ad.Hess will now be only the Hessian block that starts at entry (0, 2) and has size (2, 3). (f_ad.grad is still the full 5D gradient.)
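A quick sanity check might look like this (a sketch reusing f and the variables from the snippets above; assumes the hessian-block branch):

// Compare the restricted Hessian to the corresponding block of the
// full 5×5 Hessian computed the old way; the difference should be ~0.
using Full = TinyAD::Scalar<5, double, true>;
Eigen::Matrix<Full, 5, 1> x_full = Full::make_active({ x1, x2, y1, y2, y3 });
auto f_full = f(x_full[0], x_full[1], x_full[2], x_full[3], x_full[4]);
std::cout << (f_ad.Hess - f_full.Hess.block<2, 3>(0, 2)).norm() << std::endl;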
I didn't have the chance to do any performance testing. Let me know if this is what you were looking for and if it beats the performance/convenience of the solution with nested types.
Nice! I like that this matches the Eigen Block format.
Thanks, again, for this great tool.
I'm interested in computing purely mixed second derivatives. For example, if I have a function:
f(x₁,x₂,y₁,y₂,y₃)
then I'm interested in the rectangular |x|×|y| (here 2×3) matrix of mixed second derivatives:
∂²f/∂xᵢ∂yⱼ
This kind of thing arises naturally in some physics problems. For example, if f is an elastic potential, x are deformed positions, and y are material parameters, then F = ∂f/∂x is the force vector and ∂F/∂y describes how the force is affected by the material parameters.
Currently, I can get this by using with_hessian=true, computing the entire (|x|+|y|)² Hessian, and then extracting the block I care about. I'm wondering: