symforce-org / symforce

Fast symbolic computation, code generation, and nonlinear optimization for robotics
https://symforce.org
Apache License 2.0

Marginalization Utils #395

Open wxliu1 opened 4 months ago

wxliu1 commented 4 months ago

How can SymForce be used to generate marginalization prior factors? In other words, how can SymForce be used to add marginalization factors to the optimizer?

aaron-skydio commented 4 months ago

The tools are all there, but we don't currently have them bundled together into a single function or set of functions to call for this. With the current API, the process would be something like:

  1. Collect the sym::Factors touching the variables you want to marginalize
  2. Use a sym::Linearizer to linearize those factors (you could do this manually also, but using the Linearizer is probably easier)
  3. Eliminate the marginalized variables in the resulting linearization. The sym::SchurComplementSolver doesn't quite expose what you'd need to do this, so for now you'd do this just with e.g. linear algebra ops from Eigen
  4. Create a new sym::Factor with the resulting linearization after elimination, and use that in the next optimization

empty-spacebar commented 4 months ago

> The tools are all there, but we don't currently have them bundled together into a single function or set of functions to call for this. With the current API, the process would be something like:
>
>   1. Collect the sym::Factors touching the variables you want to marginalize
>   2. Use a sym::Linearizer to linearize those factors (you could do this manually also, but using the Linearizer is probably easier)
>   3. Eliminate the marginalized variables in the resulting linearization. The sym::SchurComplementSolver doesn't quite expose what you'd need to do this, so for now you'd do this just with e.g. linear algebra ops from Eigen
>   4. Create a new sym::Factor with the resulting linearization after elimination, and use that in the next optimization

SymForce doesn't have an official implementation of marginalization. Is that because the matrix, which is no longer sparse after marginalization, can't be manipulated noticeably faster with SymForce?

wxliu1 commented 4 months ago

Thanks!

aaron-skydio commented 4 months ago

> SymForce doesn't have an official implementation of marginalization. Is that because the matrix, which is no longer sparse after marginalization, can't be manipulated noticeably faster with SymForce?

We do plan to add some utilities to make this nicer at some point; there are lots of reasons you'd want to do this with SymForce. SymForce supports both dense factors (most factors are dense by default) and fully dense problems (with the DenseLinearizer). Even with dense problems, you benefit from common subexpression elimination, function flattening, and zero-overhead ops in SymForce, as described in sections IV.A and IV.C of the paper.

empty-spacebar commented 3 months ago

Looking forward to SymForce being the BEST solver. Cheers

asa commented 3 months ago

> The tools are all there, but we don't currently have them bundled together into a single function or set of functions to call for this. With the current API, the process would be something like:
>
>   1. Collect the sym::Factors touching the variables you want to marginalize
>   2. Use a sym::Linearizer to linearize those factors (you could do this manually also, but using the Linearizer is probably easier)
>   3. Eliminate the marginalized variables in the resulting linearization. The sym::SchurComplementSolver doesn't quite expose what you'd need to do this, so for now you'd do this just with e.g. linear algebra ops from Eigen
>   4. Create a new sym::Factor with the resulting linearization after elimination, and use that in the next optimization

Would it make sense to have a version of a Factor that takes a Linearized{Dense|Sparse}Factor (linearized_sparse_factor_t and linearized_dense_factor_t) and uses it directly during the Linearize() call? From my read, there isn't a type that can skip past the Linearize(values, linearized_factor, maybe_index_entry_cache) step for a factor. I have a version working with the Hessian functor type that copies over the linearization elements, but it seems like it would be a little more straightforward to just support this existing type directly.

I believe what you are proposing is that the linearization wouldn't ever change: this linearized factor is fixed at its elimination linearization.

Does that make sense?

asa commented 3 months ago

I believe this may be what Hayk is referring to here: https://github.com/symforce-org/symforce/blob/main/symforce/opt/factor.cc#L92 and https://github.com/symforce-org/symforce/blob/main/symforce/opt/factor.cc#L106

aaron-skydio commented 3 months ago

> I believe this may be what Hayk is referring to here: https://github.com/symforce-org/symforce/blob/main/symforce/opt/factor.cc#L92 and https://github.com/symforce-org/symforce/blob/main/symforce/opt/factor.cc#L106

The comments there are just about the function signatures, and whether those functions should take four parameters as they do now, vs. a struct as a single parameter.

> Would it make sense to have a version of a Factor that takes a Linearized{Dense|Sparse}Factor (linearized_sparse_factor_t and linearized_dense_factor_t) and uses it directly during the Linearize() call? From my read, there isn't a type that can skip past the Linearize(values, linearized_factor, maybe_index_entry_cache) step for a factor. I have a version working with the Hessian functor type that copies over the linearization elements, but it seems like it would be a little more straightforward to just support this existing type directly.

So, there needs to be something that computes at least the updated residual and rhs for the current Values, given the fixed linearization and the linearization point? Presumably that's what you've implemented as the thing you're handing to the existing constructor? We could add something that takes a Linearized{Dense|Sparse}Factor and the linearization point and returns a sym::Factor object, though; that would be nice.

asa commented 3 months ago

Yep, that is along the lines of what I was thinking. I'll see about adding what you describe in a generic way where the linearization point is a tangent vector member of the new factor. Could have std::vector and std::vector interface versions as well I think. I'll play with it.

asa commented 3 months ago

I have it boiled down to a Hessian lambda like this:

auto gen_factor = [](const Eigen::VectorXd& linearization_point,
                       const std::vector<sym::Key>& linearized_keys,
                       const auto& linearized_factor) -> sym::Factord {
     return sym::Factord::Hessian(
         // note: std::move from the const-ref parameters would silently copy,
         // so these captures copy explicitly
         [linearization_point = linearization_point,
          // cache Jt, J and hess as they don't change
          jacobian_transpose = Eigen::MatrixXd(linearized_factor.jacobian.transpose()),
          jacobian = linearized_factor.jacobian,
          hessian = linearized_factor.hessian](const sym::Pose3d& x,
                                               Eigen::VectorXd* res,
                                               Eigen::MatrixXd* jac,
                                               Eigen::MatrixXd* hess,
                                               Eigen::VectorXd* rhs) {
             // assume simple pose state. This needs to be improved to handle polymorphic keys / more complicated tangent_vec
             Eigen::VectorXd tangent_vec(6);
             tangent_vec.head(6) = x.ToTangent();

             // update residual based on current iteration state
             *res = jacobian * (tangent_vec - linearization_point);
             *rhs = jacobian_transpose * (*res);

             // Fixed by the linearization at which the linearized_factor was
             // marginalized
             *jac = jacobian;
             *hess = hessian;
         },
         // the same keys used in linearization are the keys to optimize
         linearized_keys);
};

However, this doesn't have a general interface for pulling out the optimized keys and turning them into the tangent_vec for the internal loop of the factor. I could use some guidance on what you think the best path forward is for making that generic through the substantial machinery available in Factor. I started on a constructor for sym::Factor that would produce this hessian_func, but I'll have to revisit it when I have some bandwidth. Feedback welcome!

aaron-skydio commented 3 months ago

Ah, you're going to want to use the raw Factor constructor that takes a HessianFunc; then you'll have access to the Values for the problem and the list of keys to use.

Some other minor math notes: I think you want to store the linearization point as a Values, and use LocalCoordinates instead of subtracting the tangent vectors (this is also approximate, but is accurate for small perturbations). I think you also want to store the residual at the initial linearization, and set *res = J * local_coords + residual;

asa commented 3 months ago

Ah, ok, got it. Thanks!

asa commented 3 months ago

Yes, values.LocalCoordinates(other) is what I was looking for. Nice!

asa commented 3 months ago

So, the plot thickens, as it usually does when implementing this with the HessianFunc constructor. I have found something interesting: the sign of sym::Values::LocalCoordinates is opposite from the one you get directly on a Pose3 (or other type).

E.g., if I take the LocalCoordinates of the first entry of my keys, it is the opposite of that calculated by linearization_point_values.LocalCoordinates(current_optimizing_values).

setup like:

sym::Factord(
                 [linearization_point = std::move(linearization_point),
                  linearization_residual = std::move(linearized_factor.residual),
                  linearized_index,
                  // cache Jt, J and JtJ, as they don't change
                  jacobian_transpose = Eigen::MatrixXd(linearized_factor.jacobian.transpose()),
                  jacobian = std::move(linearized_factor.jacobian),
                  hessian = std::move(linearized_factor.hessian),
                  epsilon](const sym::Valuesd& state,                   
                           const std::vector<sym::index_entry_t>& keys, 
                           Eigen::VectorXd* res,
                           Eigen::MatrixXd* jac,
                           Eigen::MatrixXd* hess,
                           Eigen::VectorXd* rhs){
                           ...
                           },
                           linearized_keys);

And if I then compute local_coords internally in the factor like so:

auto local_coords = linearization_point.At<Pose3d>(keys.at(0)).LocalCoordinates(state.At<Pose3d>(keys.at(0)), epsilon);

I get residual signs such that the system converges.

vs

auto local_coords = linearization_point.LocalCoordinates(state, linearized_index, epsilon);

which diverges

Changing to

auto local_coords = state.LocalCoordinates(linearization_point, linearized_index, epsilon);

gets the system to converge again (the residuals have the same sign as with pose.LocalCoordinates), but I have to modify the Values LocalCoordinates() function to be marked const. That function should likely be const at any rate, but it was strange to me that the signs were flipped relative to each other. I see that the Values version, LocalCoordinatesHelper, is (other, this), vs. LiegroupOps::LocalCoordinates for Pose3, where it is (this, other). What should the sign be? Presumably this is on purpose, but I'm not following.

asa commented 3 months ago

Ah, I see that I have the residual from the initial linearizer setup to incorporate. I misread that above!

aaron-skydio commented 3 months ago

> the sign of sym::Values::LocalCoordinates is opposite from the one you get directly on a Pose3 (or other type)

Good callout, that's pretty confusing. Issue created here: #399