Simple-Robotics / proxsuite

The Advanced Proximal Optimization Toolbox
BSD 2-Clause "Simplified" License
414 stars 50 forks source link

Assert in PrimalLDLT backend #348

Closed gergondet-woven closed 3 months ago

gergondet-woven commented 3 months ago

Hi folks,

First of thanks for all the work that's been put into this library.

I came across a triggering assert while working with the library. I managed to reduce this to the following program:

#include <iostream>
#include <proxsuite/proxqp/dense/dense.hpp>
#include <proxsuite/proxqp/sparse/sparse.hpp>
#include <proxsuite/proxqp/utils/random_qp_problems.hpp>

using namespace proxsuite;
using namespace proxsuite::proxqp;
using T = double;

int main() {
  isize dim = 3;
  isize n_eq(0);
  isize n_in(9);
  // generate a random qp
  T sparsity_factor(1);
  T strong_convexity_factor(1.e-2);
  dense::Model<T> qp_random =
      utils::dense_strongly_convex_qp(dim, n_eq, n_in, sparsity_factor, strong_convexity_factor);

  auto qp =
      std::make_unique<dense::QP<T>>(dim, n_eq, n_in, false, proxsuite::proxqp::HessianType::Dense,
                                     proxsuite::proxqp::DenseBackend::PrimalLDLT);
  qp->settings.compute_timings = true;

  std::cout << "H:\n" << qp_random.H << "\n";
  std::cout << "g:\n" << qp_random.g << "\n";
  std::cout << "C:\n" << qp_random.C << "\n";
  std::cout << "u:\n" << qp_random.u << "\n";

  qp->init(qp_random.H, qp_random.g, std::nullopt, std::nullopt, qp_random.C, qp_random.u, std::nullopt);

  qp->solve();
  if (qp->results.info.status == QPSolverOutput::PROXQP_SOLVED) {
    std::cout << "Solved!\n";
  } else {
    std::cout << "Failed to solve!\n";
  }
  std::cout << "optimal x: " << qp->results.x.transpose() << std::endl;
  std::cout << "optimal y: " << qp->results.y.transpose() << std::endl;
  std::cout << "optimal z: " << qp->results.z.transpose() << std::endl;
}

I'm compiling with CMake:

cmake_minimum_required(VERSION 3.1)

set(CMAKE_CXX_STANDARD 17)

project(sample LANGUAGES CXX)

find_package(proxsuite REQUIRED)

add_executable(sample sample.cpp)
target_link_libraries(sample PUBLIC proxsuite::proxsuite)

The generated matrices are:

H:
 0.398707 -0.590037         0
-0.590037    1.5455 -0.920334
        0 -0.920334   1.33377
g:
 -1.58215
0.0932904
-0.280619
C:
-0.862572 -0.260127  0.795916
 -0.46158 -0.666192  0.304589
  1.19822 -0.685532 -0.516013
-0.618156 -0.124636 -0.568752
  1.56766 -0.677021 -0.407359
 0.337164 -0.952763   1.58797
  0.84881 -0.851994  -1.01457
-0.898539  0.269734   2.01888
-0.625312  -0.32846  0.248054
u:
   2.19324
   1.81411
   -1.8286
   2.89715
  -3.17692
  -1.27243
-0.0402822
  0.594517
   1.47721

In RelWithDebInfo/Release, this works just fine:

optimal x:  -3.37224  -2.59909 -0.859144
optimal y:
optimal z: -3.43738e-24            0  6.42284e-22     -3.87512     -1.74049 -1.46368e-24            0     -1.92111            0

However in Debug, this raises an assert:

sample: /usr/include/eigen3/Eigen/src/Core/DenseCoeffsBase.h:181: Eigen::DenseCoeffsBase<Derived, 0>::CoeffReturnType Eigen::DenseCoeffsBase<Derived, 0>::operator()(Eigen::Index) const [with Derived = Eigen::Ref<const Eigen::Matrix<double, -1, 1>, 0, Eigen::InnerStride<1> >; Eigen::DenseCoeffsBase<Derived, 0>::CoeffReturnType = double; Eigen::Index = long int]: Assertion `index >= 0 && index < size()' failed

Which happens in the following call stack:

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737352591168)
    at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737352591168) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737352591168, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7842476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff78287f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff782871b in __assert_fail_base (
    fmt=0x7ffff79dd130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=0x555555a9f59d "index >= 0 && index < size()",
    file=0x555555a9f568 "/usr/include/eigen3/Eigen/src/Core/DenseCoeffsBase.h", line=181,
    function=<optimized out>) at ./assert/assert.c:92
#6  0x00007ffff7839e96 in __GI___assert_fail (
    assertion=0x555555a9f59d "index >= 0 && index < size()",
    file=0x555555a9f568 "/usr/include/eigen3/Eigen/src/Core/DenseCoeffsBase.h", line=181,
    function=0x555555aa90e8 "Eigen::DenseCoeffsBase<Derived, 0>::CoeffReturnType Eigen::DenseCoeffsBase<Derived, 0>::operator()(Eigen::Index) const [with Derived = Eigen::Ref<const Eigen::Matrix<double, -1, 1>, 0, Eigen::InnerStr"...) at ./assert/assert.c:101
#7  0x00005555559644ad in Eigen::DenseCoeffsBase<Eigen::Ref<Eigen::Matrix<double, -1, 1, 0, -1, 1> const, 0, Eigen::InnerStride<1> >, 0>::operator() (this=0x7fffffffd060, index=3)
    at /usr/include/eigen3/Eigen/src/Core/DenseCoeffsBase.h:181
#8  0x000055555598a894 in proxsuite::linalg::dense::Ldlt<double>::rank_r_update (this=0x555555ba76b8,
    w=..., alpha=..., stack=...) at /tmp/install/include/proxsuite/linalg/dense/ldlt.hpp:598
#9  0x0000555555983089 in proxsuite::proxqp::dense::mu_update<double> (qpmodel=..., qpresults=...,
    qpwork=..., n_constraints=9,
    dense_backend=@0x555555ba7320: proxsuite::proxqp::DenseBackend::PrimalLDLT, mu_eq_new=0.0001,
    mu_in_new=0.010000000000000002) at /tmp/install/include/proxsuite/proxqp/dense/solver.hpp:214
#10 0x000055555597325c in proxsuite::proxqp::dense::qp_solve<double> (qpsettings=..., qpmodel=...,
    qpresults=..., qpwork=..., box_constraints=false,
    dense_backend=@0x555555ba7320: proxsuite::proxqp::DenseBackend::PrimalLDLT,
    hessian_type=@0x555555ba7328: proxsuite::proxqp::HessianType::Dense, ruiz=...)
    at /tmp/install/include/proxsuite/proxqp/dense/solver.hpp:1731
#11 0x000055555596c0dd in proxsuite::proxqp::dense::QP<double>::solve (this=0x555555ba7320)
    at /tmp/install/include/proxsuite/proxqp/dense/wrapper.hpp:924
#12 0x0000555555963d9a in main () at /home/pierre-gergondet/devel/sanbox/proxsuite/sample.cpp:3

The issue appears to be from here:

qpwork.ldl.rank_r_update(
          new_cols, qpwork.dw_aug.head(qpmodel.dim), stack);

(I'm guessing line 222 could generate the same kind of issue?)

The number of rows in new_cols are used to iterate over qpwork.dw_aug.head(qpmodel.dim) so if newcols has more rows than qpmodel.dim this triggers an assert. I'm not knowledgeable enough about the underlying algorithm but perhaps changing the dimension passed to head to match new_cols dimension is enough here?

No alarm is being tripped when asserts are disabled because qpwork.dw_aug has enough rows that it doesn't run over the available memory when accessing outside the size of the view range.

I have confirmed this happens with both ros-humble-proxsuite 0.6.5-1jammy.20240728.192252 and the latest version in main.

There's also no issue if the PrimalDualLDLT backend is used.

Please let me know if you need more information.

fabinsch commented 3 months ago

Hi @gergondet-woven,

thanks for your interest in Proxsuite and for raising this issue with a detailed analysis. You are right, there is indeed a mismatch. I discussed it with @Bambade and you can fix it by changing the dimension passed to head to match new_cols (by using qpwork.n_c instead of qpmodel.dim).

         qpwork.ldl.rank_r_update(
-          new_cols, qpwork.dw_aug.head(qpmodel.dim), stack);
+          new_cols, qpwork.dw_aug.head(qpwork.n_c), stack);
       }

The same is true for l.222, where it should be qpmodel.n_eq.

We will provide a PR with this fix very soon, and we'll add a unittest for the PrimalLDLT backend.

fabinsch commented 3 months ago

I also just realized that the bounds are swapped in the code — qp.init(H, g, A, b, C, u, l) instead of qp.init(H, g, A, b, C, l, u).

Probably doesn't matter much for this example/issue, but thought I'd mention it to avoid any mix-ups. 😊

gergondet-woven commented 3 months ago

Hi @fabinsch

Thanks for the quick feedback.

I also just realized that the bounds are swapped in the code — qp.init(H, g, A, b, C, u, l) instead of qp.init(H, g, A, b, C, l, u).

This is on purpose, the assert does not trigger with the bounds in the "correct" order. There's probably a better way to consistently trigger it but I didn't manage to find one :(

jcarpent commented 3 months ago

Fixed via #349.