Closed Mechazo11 closed 5 months ago
Hi @Mechazo11,
thanks for using least squares cpp. Regarding your first question, there is a setObjective
method in the LM class which allows to initialize / change the state of your objective. Note that this performs a copy of the objective object, so if the data in the objective is large it might be a good idea to store it as pointer or shared pointer.
Regarding your other points I will give you a detailed answer the next days when I get some time to do some reading ;)
Regarding explicit jacobian computation (e. g. analytically) there is an example on how to do this in the examples directory. See https://github.com/Rookfighter/least-squares-cpp/blob/master/examples%2Fprovide_explicit_jacobian.cpp
I will provide you with an updated version of your example code later :+1:
@Rookfighter thank you very much for your prompt reply. I really appreciate it. At the moment, I am trying to the setObjective
method with numerical jacobian first and will then attempt to use analytical jacobian based on your updated example. Maybe when I can verify my optimization answers are correct I can share both approaches with an example test code to bolster this libraries example repository.
Hello @Rookfighter,
Apologizing in advance for this lengthy post.
Regarding my first problem, I was able to use a global pointer extern Eigen::MatrixXd* global_H_mat_ptr;
to transport the H
matrix inside the functor without requiring to initiate an object outside of your library. Eventually I was able to get the optimization going with numerical jacobian
.
However the results don't look good and thus I am anxious to see how they would turn out with the analytical jacobian.
However, at the time of writing post, I am unable to get the analytical jacobian to work. Below is a code that hopefully will enable you to replicate the issue
#include <lsqcpp/lsqcpp.hpp>
#include <Eigen/Dense>
extern Eigen::MatrixXd* global_H_mat_ptr;
struct FastMFFunctorAnalyticalJacobian
{
static constexpr bool ComputesJacobian = true; // set true when Analytic equation for jacobian exsists
template<typename _Scalar>
void compute_V2r(const _Scalar& x, const _Scalar& y, const _Scalar& z, Eigen::Matrix<_Scalar, Eigen::Dynamic, 1>& V2r) const {
// For one row of Rstar, builds its corresponding V_2(r) vector based on Equation 11
// Const correctness will stop if we try to modify the "member" variables, not reference variables
// Returns None
V2r = Eigen::Matrix<_Scalar, Eigen::Dynamic, 1>::Zero(6); // Initialize
V2r[0] = pow(x,2); // x^2
V2r[1] = pow(y,2); // y^2
V2r[2] = pow(z,2); // z^2
V2r[3] = x * y; // xy
V2r[4] = x * z; // xz
V2r[5] = y * z; // yz
}
template<typename _Scalar>
void compute_Jv2r(const _Scalar& x, const _Scalar& y, const _Scalar& z, Eigen::Matrix<_Scalar, Eigen::Dynamic, Eigen::Dynamic>& J_mat) const {
// Computes d.V2r/dr using Equation 33
// NOTE, J_mat must be a all zero matrix
J_mat = Eigen::Matrix<_Scalar, Eigen::Dynamic, Eigen::Dynamic>::Zero(6,3); // Initialize
// Update only the non-zero terms
J_mat(0,0) = 2 * x;
J_mat(1,1) = 2 * y;
J_mat(2,2) = 2 * z;
J_mat(3,0) = y;
J_mat(3,1) = x;
J_mat(4,0) = z;
J_mat(4,2) = x;
J_mat(5,1) = z;
J_mat(5,2) = y;
}
template<typename _Scalar>
void compute_jacobian_perterb_rotation(const _Scalar& x, const _Scalar& y, const _Scalar& z, Eigen::Matrix<_Scalar, Eigen::Dynamic, Eigen::Dynamic>& r_mat) const {
// Computes per axis the Jacobian with respec to delta.psi using Equation 34
// Returns None
r_mat = Eigen::Matrix<_Scalar, Eigen::Dynamic, Eigen::Dynamic>::Zero(3,3); // 3x3 perturbation rotation matrix
// Update non-zero terms only
r_mat(0,1) = z;
r_mat(0,2) = -y;
r_mat(1,0) = -z;
r_mat(1,2) = x;
r_mat(2,0) = y;
r_mat(2,1) = -x;
}
template<typename Scalar, int Inputs, int Outputs>
void operator()(const Eigen::Matrix<Scalar, Inputs, 1> &xval,
Eigen::Matrix<Scalar, Outputs, 1> &fval,
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> &jacobian) const {
// Compute error term and jacobian
assert(global_H_mat_ptr != nullptr); // Ensure the global pointer is set and not a nullptr
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic>& H = *global_H_mat_ptr; // Deferencing a pointer
// Assert 9 decision variables were passsed, maybe redundant
assert(xval.size() == 9);
fval.resize(1); // Without this resize(), Eigen throws Assertion "index >= 0 && index < size()' failed"
// Initialize work variables
Eigen::Matrix<Scalar, Eigen::Dynamic, 1> V2r1; // V_2(r1) -- X axis
Eigen::Matrix<Scalar, Eigen::Dynamic, 1> V2r2; // V_2(r1) -- Y axis
Eigen::Matrix<Scalar, Eigen::Dynamic, 1> V2r3; // V_2(r3) -- Z axis
Eigen::Matrix<Scalar, Eigen::Dynamic, 1> f_R(18); // f(R) = [H.V2r1; H.V2r2; H.V2r3]
// Const cast decision variables for use
// TODO these maybe redundant, comeback to these later
const Scalar r11 = xval(0);
const Scalar r12 = xval(1);
const Scalar r13 = xval(2);
const Scalar r21 = xval(3);
const Scalar r22 = xval(4);
const Scalar r23 = xval(5);
const Scalar r31 = xval(6);
const Scalar r32 = xval(7);
const Scalar r33 = xval(8);
// Compute three V2r vectors based on current rotation matrix R
compute_V2r(r11, r12, r13, V2r1);
compute_V2r(r21, r22, r23, V2r2);
compute_V2r(r31, r32, r33, V2r3);
f_R << H * V2r1, H * V2r2, H * V2r3; // 18 x 1, concatenate three 6x1 vectors, Equation 32
fval(0) = f_R.transpose() * f_R;
// Analytically compute jacobian
jacobian.setZero(18, 3); // Initialize
// Variables to hold Jacobian of per axis vectors i.e. skew-symmetric rotation matrices
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> J_v1;
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> J_v2;
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> J_v3;
// Variables to hold rotation perturbation matrix corresponding to delta.psi
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> r1_sym_x;
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> r2_sym_x;
Eigen::Matrix<Scalar, Eigen::Dynamic, Eigen::Dynamic> r3_sym_x;
// Equaiton 33
compute_Jv2r(r11,r12,r13, J_v1);
compute_Jv2r(r21,r22,r23, J_v2);
compute_Jv2r(r31,r32,r33, J_v3);
// Equation 34
compute_jacobian_perterb_rotation(r11,r12,r13,r1_sym_x);
compute_jacobian_perterb_rotation(r21,r22,r23,r2_sym_x);
compute_jacobian_perterb_rotation(r31,r32,r33,r3_sym_x);
// Equation 35
jacobian << H * J_v1 * r1_sym_x,
H * J_v2 * r2_sym_x,
H * J_v3 * r3_sym_x;
// Each H*J_v2_(r)*[r_1]x --> [6x6]*[6x3]*[3x3] --> [6x3].
// Jacobian works out to be a [18x3] matrix
jacobian = -1 * jacobian; // There was a -1 infront of the whole matrix. See Equation 35
}
};
int main(){
Eigen::MatrixXd* global_H_mat_ptr = nullptr;
Eigen::MatrixXd H_mat(6, 6);
global_H_mat_ptr = &H_mat; // Set the global pointer to H_mat
H_mat << 1.82574e-06, 1.82574e-06, 1.82574e-06, -1.20712e-21, 1.7249e-21, -1.48388e-21,
0.411329, -0.410738, -0.000590934, 0.0341389, -0.44906, 0.477684,
-0.349149, -0.314978, 0.664127, 0.713277, -0.246387, -0.251963,
0.0660712, 0.198444, -0.264515, 1.12855, 1.10806, 1.07442,
-1.44781, 0.0178776, 1.42994, -1.37306, 0.262784, 1.60901,
-0.980976, 1.77531, -0.794333, 0.666025, -1.76797, 0.660609;
lsqcpp::LevenbergMarquardtX<double, FastMFFunctorAnalyticalJacobian> optimizer;
optimizer.setMinimumGradientLength(1e-6);
optimizer.setMinimumStepLength(1e-6);
optimizer.setMinimumError(0);
optimizer.setMethodParameters({1.0, 2.0, 0.5, 100});
optimizer.setVerbosity(4);
Eigen::VectorXd initialGuess(9);
initialGuess << 1, 0, 0, 0, 1, 0, 0, 0, 1; // Initial guess of the rotation frame flattened as a 1D array
auto result = optimizer.minimize(initialGuess); // Here I get the error shown below
}//EOF main
Error message shown
fast_mf_node: /usr/include/eigen3/Eigen/src/Core/Product.h:96: Eigen::Product<Lhs, Rhs, Option>::Product(const Lhs&, const Rhs&) [with _Lhs = Eigen::Transpose<Eigen::Matrix<double, -1, -1> >; _Rhs = Eigen::Matrix<double, -1, 1>; int Option = 0; Lhs = Eigen::Transpose<Eigen::Matrix<double, -1, -1> >; Rhs = Eigen::Matrix<double, -1, 1>]:
Assertion `lhs.cols() == rhs.rows() && "invalid matrix product" && "if you wanted a coeff-wise or a dot product use the respective explicit functions"' failed.
[ros2run]: Aborted
If you would kindly help me troubleshoot this problem, I will appreciate it greatly.
On a different, note, if you notice, I required a number of functions apart from the operator()()
functor, but the way I have had to define the templates, it doesn't seem right to me, even though it works. If you have any suggestion on how to better manage template style struct(class) writing, please share.
Hi @Mechazo11,
thanks for the elaborate code sample.
Let's get the templates sorted first. least-squares-cpp is designed to supported any form of Eigen::Matrix dimension. This means, it can handle statically or dynamically sized matrices. If you want to design your objective functor in a way that accepts any of these two variants then using templates (as in my examples) is the preferred way.
Now, in your case (and many common cases) you can simplify the objective can be simplified by just using dynamic size, because you use the LevenbergMarquardtX
solver class with doubles. Basically this means you can change the signature of your objective as follows (and get rid of any templates):
struct FastMFFunctorAnalyticalJacobian
{
static constexpr bool ComputesJacobian = true;
void operator()(const Eigen::VectorXd &xval,
Eigen::VectorXd &fval,
Eigen::MatrixXd &jacobian) const
{
// your stuff goes here
}
};
Note that this way your functor only supports double
scalars and matrices / vectors with dynamic size.
The next things, which just seems like a code smell to me is the global variable. You can move the H
matrix into the objective function as follows:
#include <memory>
struct FastMFFunctorAnalyticalJacobian
{
std::shared_ptr<Eigen::MatrixXd> H_ptr;
// all other methods as needed ...
};
int main(){
auto H_ptr = std::make_shared<Eigen::MatrixXd>(6, 6);
(*H_ptr) << 1.82574e-06, 1.82574e-06, 1.82574e-06, -1.20712e-21, 1.7249e-21, -1.48388e-21,
0.411329, -0.410738, -0.000590934, 0.0341389, -0.44906, 0.477684,
-0.349149, -0.314978, 0.664127, 0.713277, -0.246387, -0.251963,
0.0660712, 0.198444, -0.264515, 1.12855, 1.10806, 1.07442,
-1.44781, 0.0178776, 1.42994, -1.37306, 0.262784, 1.60901,
-0.980976, 1.77531, -0.794333, 0.666025, -1.76797, 0.660609;
lsqcpp::LevenbergMarquardtX<double, FastMFFunctorAnalyticalJacobian> optimizer;
FastMFFunctorAnalyticalJacobian objective;
objective.H_ptr = H_ptr;
optimizer.setObjective(objective);
// the rest as in your sample code
}
You can then use H_ptr
in your objective as before, but you eliminated the global by using the setObjective
method.
@Rookfighter thank you very much for clarifying the first two problems. Really appreciate it,
I will definitely move my implementation to setObjective
once I figure out the remaining computations. Have you had a chance to see why my analytical Jacobian is throwing an error?
@Mechazo11
Now let's get to the big point: numerical instability. I suggest to start with ComputesJacobian = false;
in your objective, to eliminate errors arising from analytical jacobian computation (as these are very error prone in the beginning).
There are two problems that might be present (by looking at your code snippet).
First point:
fval
is usually supposed to be the residuals vector. However, it seems that you store the already computed squared error into fval
. I haven't read the paper, but it looks like your variable f_R
already contains the residuals vector. So, I would suggest to do the following
// copy residual vector into output vector for least squares cpp
fval = f_R;
If you omit the analytical computation of the jacobian and your residuals are computed correctly, the solver should be able to converge against a meaningful value.
Second point: It seems to me that your flattened state vector represents a rotation matrix. This can be problematic, because rotations (SO3) only have 3 degrees of freedom, but you represent them by using 9 (3x3 matrix) state variables. This leads to numeric instability, bad convergence and ambiguity in the solution space. Also the optimizer has no knowledge that the parameters are supposed to represent a rotation matrix, so there is no guarantee that the resulting matrix is a orthonormal matrix (as rotation matrices are supposed to be). Typically rotations are encoded as lower dimensional vectors (please refrain from using euler angles due to gimbal lock). A good and robust encoding are Rodriguez Vectors (see https://en.wikipedia.org/wiki/Rodrigues%27_rotation_formula) least squares cpp already provides encode and decode functions for Rodriguez vectors here: https://github.com/Rookfighter/least-squares-cpp/blob/master/include/lsqcpp/lsqcpp.hpp#L1765 https://github.com/Rookfighter/least-squares-cpp/blob/master/include/lsqcpp/lsqcpp.hpp#L1792
Basically you would use it like so:
struct FastMFFunctorAnalyticalJacobian
{
void operator()(const Eigen::VectorXd &xval,
Eigen::VectorXd &fval,
Eigen::MatrixXd &jacobian) const
{
// decode rodrigues into 3x3 rotation matrix
auto rotmat = lsqcpp::parameter::decodeRotation(xval);
// your stuff goes here
}
};
int main(){
Eigen::VectorXd initial_guess = lsqcpp::parameter::encodeRotation(Eigen::Matrix3d::Identity);
// your stuff goes here
}
One disclaimer: I haven't compiled the code here, but I think the idea should get you going :) Looking forward to your next results
And one more note: If you want me to dive a bit deeper into your code, feel free to put your sample code into a public github repository with compileable code and I could actually analyze its run-time behavior with you.
@Rookfighter thank you very much for clarifying the first two problems. Really appreciate it,
I will definitely move my implementation to
setObjective
once I figure out the remaining computations. Have you had a chance to see why my analytical Jacobian is throwing an error?
@Mechazo11
Yes, your jacobian has the wrong dimensions.
The relations between xval
, fval
and jacobian
are as follows:
If fval
is a Mx1 vector and xval
is a Nx1 vector then the jacobian
has to be a MxN matrix.
In your case fval
is a 1x1 vector and xval
is a 9x1, but your jacobian
is a 18x3 matrix, which leads to the error.
When you change the code as I suggested fval = f_R;
then fval
will be a 18x1 vector, so the M dimension is correct. I am not sure why the paper though makes you have the dimension N=3. Maybe they already use some encoding for rotation (e.g. rodriguez or euler?).
@Rookfighter thank you very much, yes looking at the paper, this was also my takeaway that the error function essentially is not checking whether the rotation matrix is orthonormal or not.
I will try out fval = f_R
and get back. Regarding N = 3
the paper did not explicitly state it, I manually worked them out but I may have been wrong somewhere.
Many thanks for your continued support. I will report back once I try out some of your suggestions
@Rookfighter if you would take a look at Equation 35 from the paper, the jacobian matrix they present, does come out to 18 x 3. Would you take a look at it so that I can confirm, I did not get my math wrong.
@Mechazo11 I have created a fully working example, which compiles and converges against a valid rotation matrix.
/// levenberg_marquardt.cpp
///
/// Created on: 11 Nov 2020
/// Author: Fabian Meyer
/// License: MIT
#include <lsqcpp/lsqcpp.hpp>
// Implement an objective functor.
struct Objective {
static constexpr bool ComputesJacobian = false;
Eigen::MatrixXd H;
Eigen::Matrix<double, 6, 1> compute_V2r(const double x, const double y,
const double z) const {
// For one row of Rstar, builds its corresponding V_2(r) vector based on
// Equation 11 Const correctness will stop if we try to modify the "member"
// variables, not reference variables Returns None
Eigen::Matrix<double, 6, 1> result;
result << std::pow(x, 2), std::pow(y, 2), std::pow(z, 2), x * y, x * z,
y * z;
return result;
}
void operator()(const Eigen::VectorXd &xval, Eigen::VectorXd &fval) const {
// Compute error term and jacobian
const auto rot = lsqcpp::parameter::decodeRotation(xval);
// Const cast decision variables for use
// TODO these maybe redundant, comeback to these later
const double r11 = rot(0, 0);
const double r12 = rot(0, 1);
const double r13 = rot(0, 2);
const double r21 = rot(1, 0);
const double r22 = rot(1, 1);
const double r23 = rot(1, 2);
const double r31 = rot(2, 0);
const double r32 = rot(2, 1);
const double r33 = rot(2, 2);
// Compute three V2r vectors based on current rotation matrix R
// 18 x 1, concatenate three 6x1 vectors, Equation 32
fval.resize(18);
fval << H * compute_V2r(r11, r12, r13), H * compute_V2r(r21, r22, r23),
H * compute_V2r(r31, r32, r33);
}
};
int main() {
Objective objective;
objective.H.resize(6, 6);
objective.H << 1.82574e-06, 1.82574e-06, 1.82574e-06, -1.20712e-21,
1.7249e-21, -1.48388e-21, 0.411329, -0.410738, -0.000590934, 0.0341389,
-0.44906, 0.477684, -0.349149, -0.314978, 0.664127, 0.713277, -0.246387,
-0.251963, 0.0660712, 0.198444, -0.264515, 1.12855, 1.10806, 1.07442,
-1.44781, 0.0178776, 1.42994, -1.37306, 0.262784, 1.60901, -0.980976,
1.77531, -0.794333, 0.666025, -1.76797, 0.660609;
lsqcpp::LevenbergMarquardtX<double, Objective> optimizer;
// Set number of iterations as stop criterion.
optimizer.setMaximumIterations(100);
// Set the minimum length of the gradient.
optimizer.setMinimumGradientLength(1e-6);
// Set the minimum length of the step.
optimizer.setMinimumStepLength(1e-6);
// Set the minimum least squares error.
optimizer.setMinimumError(0);
// Set the parameters of the step method (Levenberg Marquardt).
optimizer.setMethodParameters({1.0, 2.0, 0.5, 100});
optimizer.setObjective(objective);
// Turn verbosity on, so the optimizer prints status updates after each
// iteration.
optimizer.setVerbosity(4);
// Set initial guess.
Eigen::VectorXd initialGuess(3);
initialGuess << lsqcpp::parameter::encodeRotation(
Eigen::Matrix3d::Identity());
// Start the optimization.
auto result = optimizer.minimize(initialGuess);
std::cout << "Done! Converged: " << (result.converged ? "true" : "false")
<< " Iterations: " << result.iterations << std::endl;
// do something with final function value
std::cout << "Final fval: " << result.fval.transpose() << std::endl;
// do something with final x-value
std::cout << "Final xval: " << result.xval.transpose() << std::endl;
const auto rot = lsqcpp::parameter::decodeRotation(result.xval);
std::cout << "Final rot: \n" << rot << std::endl;
return 0;
}
However, it uses a numeric jacobian and encodes / decodes the rotation.
I looked into the paper, but I cannot directly tell what the expected dimension of the jacobian is supposed to be. I guess this is the part of the paper which makes implementing it hard :D Though the example above should get you going and you can verify your analytical jacobian by using the computed numeric jacobain.
@Rookfighter thank you very much, I greatly appreciate the fully working example. I am currently thinking through a quaternion based version to enforce orthogonality but will definitely try your method too and compare and contrast between the two. I will go ahead and close this thread for now
Alright :+1: Feel free to ask again if you run into trouble. One final note on quaternions: they behave better than rotation matrices for numerical optimization, but they are still over parametrized as rotations have 3 degrees of freedom, but quaternions consist of 4 parameters. Basically the optimizer cannot enforce the unit length on them, so there is still some ambiguity. but if your initial guess is good enough, it should still work.
@Rookfighter when it is convenient, would you kindly take look at my other issue regarding adding a penalty term (or a constraint) to the objective function? I got the quaternion version to work to some extent but I need to enforce $||q|| = 1$ to get a meaningful result.
With many thanks @Mechazo11
Hello @Rookfighter,
Greatly appreciate your header-only non linear solver library that you have shared with the community. I have tested your method for a 2D rosenbrock function and can confirm the library works correctly.
For a robotics vision project, I am in the process of implementing a fast manhattan frame estimation method as described in this paper Fast MF
If you take a look at the paper, the non linear objective function described in Equation 36 requires computing an least squares error function $E(R) = f^T(R)f(R)$ where $R \in SO(3)$ and $f(R)$ is a function of a matrix $H$ and column vector $V_2(r_i)$ s.t $i = {1,2,3}$.
I have figured out how to compute $H$ but from the examples of levenberg marquardt algorithm you have provided in least-squares-cpp, the objective function, it appears that an object from
ObjectiveFunctor
cannot be initialized separately before passing it tolsqcpp::LevenbergMarquardtX<double, ObjectiveFunctor> optimizer;
My questions are in two parts
Is there a way I can initialize an object from the
ObjectiveFunctor
struct
before sending it tolsqcpp::LevenbergMarquardtX<double, ObjectiveFunctor>
? The reason I asked is to verify whether this is possible so that I can initialize an object from ObjectiveFunctor and pass the $H$ matrix to it.How do I define an explicit jacobian that has a fixed known dimension and not
Eigen::Matrix<Scalar, Outputs, Inputs> &jacobian)
?To clarify my questions here is a pseudocode of what I want to achieve. The equations below are in reference to the paper linked above
The problem description may not be very clear but I tried my best to frame the problem so that it aligns with how you have written the library.
Will greatly appreciate if you can help me with this implementation problem.
Looking forward to your kind feedback.
With best regards, @Mechazo11