stack-of-tasks / pinocchio

A fast and flexible implementation of Rigid Body Dynamics algorithms and their analytical derivatives
http://stack-of-tasks.github.io/pinocchio/
BSD 2-Clause "Simplified" License

test-cpp-contact-cholesky failure with GCC 13.3.0 #2304

Closed nim65s closed 2 weeks ago

nim65s commented 3 months ago

Hi,

This is a preliminary bug report; I'm still trying a few things to confirm it.

But so far, with nix, since gcc was upgraded from 13.2.0 to 13.3.0, test-cpp-contact-cholesky fails with big differences in some matrices (e.g. 0.207 != 5e-17).

I suspect this is an issue in GCC, so I will deactivate this test on nix, and try to make a proper bug report to GCC.

jcarpent commented 3 months ago

Same version of Eigen?

nim65s commented 3 months ago

yes

nim65s commented 3 months ago

And, only on aarch64:

27/89 Test #30: test-cpp-contact-models .....................***Failed    0.12 sec
Running 2 test cases...
/build/source/unittest/contact-models.cpp(164): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(165): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(164): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(165): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(164): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(165): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(164): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(165): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(164): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(165): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(185): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.isApprox(J_RF_LOCAL_sparse) has failed
/build/source/unittest/contact-models.cpp(191): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.isApprox(J_LF_LOCAL_sparse) has failed
/build/source/unittest/contact-models.cpp(233): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(234): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(233): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(234): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(233): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(234): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(233): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(234): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(233): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(234): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(257): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.isApprox(J_RF_LWA_sparse) has failed
/build/source/unittest/contact-models.cpp(263): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.isApprox(J_LF_LWA_sparse) has failed
/build/source/unittest/contact-models.cpp(296): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(299): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(296): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(299): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(296): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(299): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(296): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(299): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(296): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(299): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LOCAL.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(314): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LOCAL.col(k).isZero(0) != within(k, clm_RF_LF_LOCAL.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(314): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LOCAL.col(k).isZero(0) != within(k, clm_RF_LF_LOCAL.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(314): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LOCAL.col(k).isZero(0) != within(k, clm_RF_LF_LOCAL.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(314): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LOCAL.col(k).isZero(0) != within(k, clm_RF_LF_LOCAL.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(314): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LOCAL.col(k).isZero(0) != within(k, clm_RF_LF_LOCAL.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(322): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LOCAL.middleRows<3>(SE3::LINEAR).isApprox(J_RF_LOCAL_sparse) has failed
/build/source/unittest/contact-models.cpp(328): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LOCAL.middleRows<3>(SE3::LINEAR).isApprox(J_LF_LOCAL_sparse) has failed
/build/source/unittest/contact-models.cpp(334): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LOCAL.isApprox(J_clm_LOCAL_sparse) has failed
/build/source/unittest/contact-models.cpp(110): error: in "Test/contact_models_sparsity_and_jacobians": check J.isApprox(J_ref) has failed
/build/source/unittest/contact-models.cpp(110): error: in "Test/contact_models_sparsity_and_jacobians": check J.isApprox(J_ref) has failed
/build/source/unittest/contact-models.cpp(110): error: in "Test/contact_models_sparsity_and_jacobians": check J.isApprox(J_ref) has failed
/build/source/unittest/contact-models.cpp(374): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(377): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(374): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(377): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(374): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(377): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(374): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(377): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(374): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_RF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(377): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.middleRows<3>(SE3::LINEAR).col(k).isZero() != cm_LF_LWA.colwise_joint1_sparsity[k] has failed
/build/source/unittest/contact-models.cpp(394): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LWA.col(k).isZero(0) != within(k, clm_RF_LF_LWA.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(394): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LWA.col(k).isZero(0) != within(k, clm_RF_LF_LWA.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(394): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LWA.col(k).isZero(0) != within(k, clm_RF_LF_LWA.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(394): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LWA.col(k).isZero(0) != within(k, clm_RF_LF_LWA.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(394): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LWA.col(k).isZero(0) != within(k, clm_RF_LF_LWA.colwise_span_indexes) has failed
/build/source/unittest/contact-models.cpp(402): error: in "Test/contact_models_sparsity_and_jacobians": check J_RF_LWA.middleRows<3>(SE3::LINEAR).isApprox(J_RF_LWA_sparse) has failed
/build/source/unittest/contact-models.cpp(408): error: in "Test/contact_models_sparsity_and_jacobians": check J_LF_LWA.middleRows<3>(SE3::LINEAR).isApprox(J_LF_LWA_sparse) has failed
/build/source/unittest/contact-models.cpp(414): error: in "Test/contact_models_sparsity_and_jacobians": check J_clm_LWA.isApprox(J_clm_LWA_sparse) has failed
*** 63 failures are detected in the test module "Test"

Not sure whether this is related to #2277, but we'll also deactivate this one on aarch64 for now.

cmastalli commented 3 months ago

I would check if we're initializing all matrices/vectors with zeros.

nim65s commented 3 months ago

I think I can confirm this is an issue in GCC 13.3.0 with this reproducer:

FROM debian

RUN --mount=type=cache,sharing=locked,target=/var/cache/apt \
    --mount=type=cache,sharing=locked,target=/var/lib/apt \
    apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -qqy --no-install-recommends \
    autoconf \
    build-essential \
    bzip2 \
    cmake \
    g++-multilib \
    gcc-multilib \
    git \
    libboost-all-dev \
    libeigen3-dev \
    liburdfdom-dev \
    libtinyxml-dev \
    make \
    wget \
    xz-utils

ARG GCC_VERSION=13.2.0
ENV GCC_VERSION=$GCC_VERSION

WORKDIR /src
ADD https://gmplib.org/download/gmp/gmp-6.3.0.tar.xz .
ADD https://www.mpfr.org/mpfr-current/mpfr-4.2.1.tar.xz .
ADD https://ftp.gnu.org/gnu/mpc/mpc-1.3.1.tar.gz .
ADD https://gcc.gnu.org/pub/gcc/infrastructure/isl-0.24.tar.bz2 .
RUN tar xf gmp* \
 && tar xf mpfr* \
 && tar xf mpc* \
 && tar xf isl*
ADD https://gcc.gnu.org/pub/gcc/releases/gcc-$GCC_VERSION/gcc-$GCC_VERSION.tar.xz .
RUN tar xf gcc-$GCC_VERSION.tar.xz
WORKDIR gcc-$GCC_VERSION
RUN mv ../gmp-6.3.0 gmp \
 && mv ../mpfr-4.2.1 mpfr \
 && mv ../mpc-1.3.1 mpc \
 && mv ../isl-0.24 isl
RUN ./configure --enable-languages=c,c++ \
 && make -j 8 \
 && make -j 8 install

WORKDIR /src
ADD https://github.com/stack-of-tasks/pinocchio/releases/download/v3.0.0/pinocchio-3.0.0.tar.gz .
RUN tar xf pinocchio-3.0.0.tar.gz
WORKDIR pinocchio-3.0.0

RUN cmake -B build -S .  -DCMAKE_BUILD_TYPE=Release -DBUILD_PYTHON_INTERFACE=OFF -DCMAKE_CXX_STANDARD=14
RUN cmake --build build -j 4

ENV LD_LIBRARY_PATH=/usr/local/lib64
#CMD ./build/unittest/test-cpp-contact-cholesky

Running ./build/unittest/test-cpp-contact-cholesky in containers built with either docker build --build-arg GCC_VERSION=13.2.0 . or docker build --build-arg GCC_VERSION=13.3.0 . shows that it is fine in the first case and raises these errors in the second:


Running 10 test cases...
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(359): error: in "Test/contact_cholesky_contact6D_LOCAL": check H_recomposed.topRightCorner(constraint_dim, model.nv) .isApprox(H.topRightCorner(constraint_dim, model.nv)) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(361): error: in "Test/contact_cholesky_contact6D_LOCAL": check H_recomposed.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(372): error: in "Test/contact_cholesky_contact6D_LOCAL": check JMinv_ref.isApprox(JMinv_test) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(374): error: in "Test/contact_cholesky_contact6D_LOCAL": check iosim.isApprox(JMinvJt) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(375): error: in "Test/contact_cholesky_contact6D_LOCAL": check osim.isApprox(JMinvJt.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(418): error: in "Test/contact_cholesky_contact6D_LOCAL": check iosim_mu.isApprox(JMinvJt_mu) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(419): error: in "Test/contact_cholesky_contact6D_LOCAL": check osim_mu.isApprox(JMinvJt_mu.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(438): error: in "Test/contact_cholesky_contact6D_LOCAL": check H_recomposed_mu.isApprox(H_mu) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(510): error: in "Test/contact_cholesky_contact6D_LOCAL": check sol.isApprox(sol_ref) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(517): error: in "Test/contact_cholesky_contact6D_LOCAL": check sol_mat.isApprox(sol_mat_ref) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(528): error: in "Test/contact_cholesky_contact6D_LOCAL": check H_inv.isApprox(H_inv_ref) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(533): error: in "Test/contact_cholesky_contact6D_LOCAL": check mat1.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(537): error: in "Test/contact_cholesky_contact6D_LOCAL": check mat2.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(540): error: in "Test/contact_cholesky_contact6D_LOCAL": check mat3.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(646): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check H_recomposed.topRightCorner(constraint_dim, model.nv) .isApprox(H.topRightCorner(constraint_dim, model.nv)) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(648): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check H_recomposed.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(720): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check sol.isApprox(sol_ref) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(727): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check sol_mat.isApprox(sol_mat_ref) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(738): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check H_inv.isApprox(H_inv_ref) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(743): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check mat1.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(747): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check mat2.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(750): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check mat3.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(758): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check iosim.isApprox(JMinvJt) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(759): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check osim.isApprox(JMinvJt.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(769): error: in "Test/contact_cholesky_contact3D_6D_LOCAL": check JMinv_ref.isApprox(JMinv_test) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(838): error: in "Test/contact_cholesky_contact6D_LOCAL_WORLD_ALIGNED": check H_recomposed.topRightCorner(constraint_dim, model.nv) .isApprox(H.topRightCorner(constraint_dim, model.nv)) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(840): error: in "Test/contact_cholesky_contact6D_LOCAL_WORLD_ALIGNED": check H_recomposed.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(855): error: in "Test/contact_cholesky_contact6D_LOCAL_WORLD_ALIGNED": check iosim.isApprox(JMinvJt) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(856): error: in "Test/contact_cholesky_contact6D_LOCAL_WORLD_ALIGNED": check osim.isApprox(JMinvJt.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(866): error: in "Test/contact_cholesky_contact6D_LOCAL_WORLD_ALIGNED": check JMinv_ref.isApprox(JMinv_test) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(953): error: in "Test/contact_cholesky_contact6D_by_joint_2": check H_recomposed.topRightCorner(constraint_dim, model.nv) .isApprox(H.topRightCorner(constraint_dim, model.nv)) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(955): error: in "Test/contact_cholesky_contact6D_by_joint_2": check H_recomposed.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1063): error: in "Test/contact_cholesky_contact6D_by_joint_2": check iosim.isApprox(JMinvJt) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1064): error: in "Test/contact_cholesky_contact6D_by_joint_2": check osim.isApprox(JMinvJt.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1075): error: in "Test/contact_cholesky_contact6D_by_joint_2": check JMinv_ref.isApprox(JMinv_test) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1172): error: in "Test/contact_cholesky_contact3D_6D_WORLD_by_joint_2": check H_recomposed.topRightCorner(constraint_dim, model.nv) .isApprox(H.topRightCorner(constraint_dim, model.nv)) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1174): error: in "Test/contact_cholesky_contact3D_6D_WORLD_by_joint_2": check H_recomposed.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1182): error: in "Test/contact_cholesky_contact3D_6D_WORLD_by_joint_2": check iosim.isApprox(JMinvJt) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1183): error: in "Test/contact_cholesky_contact3D_6D_WORLD_by_joint_2": check osim.isApprox(JMinvJt.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1194): error: in "Test/contact_cholesky_contact3D_6D_WORLD_by_joint_2": check JMinv_ref.isApprox(JMinv_test) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1448): error: in "Test/loop_contact_cholesky_contact_3d": check H_recomposed.topRightCorner(constraint_dim, model.nv) .isApprox(H.topRightCorner(constraint_dim, model.nv)) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1450): error: in "Test/loop_contact_cholesky_contact_3d": check H_recomposed.isApprox(H) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1472): error: in "Test/loop_contact_cholesky_contact_3d": check iosim.isApprox(JMinvJt) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1473): error: in "Test/loop_contact_cholesky_contact_3d": check osim.isApprox(JMinvJt.inverse()) has failed
/src/pinocchio-3.0.0/unittest/contact-cholesky.cpp(1483): error: in "Test/loop_contact_cholesky_contact_3d": check JMinv_ref.isApprox(JMinv_test) has failed

*** 45 failures are detected in the test module "Test"

nim65s commented 3 months ago

The top-right corners compared at l. 359 start with:

-0.945445  5.55112e-17  1.38778e-17 -9.71445e-17  1.11022e-16  1.38778e-17            0

vs

-0.945445      0.20705    -0.251524    -0.355316     -1.20556      0.34319            0
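
For context, a sketch of why these rows fail the check. Eigen's isApprox on dense matrices is a relative comparison, ||a - b||^2 <= prec^2 * min(||a||^2, ||b||^2), with prec defaulting to 1e-12 for double. The code below is a hand-rolled stand-in for that criterion (it does not call Eigen), applied to the two row prefixes quoted above. The names is_approx, row_a and row_b are made up for this illustration, and only the first seven entries of each row are available here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for Eigen's dense isApprox criterion (assumption: same formula,
// re-implemented by hand so that the sketch has no Eigen dependency):
//   ||a - b||^2 <= prec^2 * min(||a||^2, ||b||^2)
// 1e-12 is Eigen's default precision for double.
template<std::size_t N>
bool is_approx(
  const std::array<double, N> & a, const std::array<double, N> & b, double prec = 1e-12)
{
  double diff2 = 0., a2 = 0., b2 = 0.;
  for (std::size_t i = 0; i < N; ++i)
  {
    diff2 += (a[i] - b[i]) * (a[i] - b[i]);
    a2 += a[i] * a[i];
    b2 += b[i] * b[i];
  }
  return diff2 <= prec * prec * std::min(a2, b2);
}

// The two row prefixes quoted above (first seven entries only).
const std::array<double, 7> row_a = {
  -0.945445, 5.55112e-17, 1.38778e-17, -9.71445e-17, 1.11022e-16, 1.38778e-17, 0.};
const std::array<double, 7> row_b = {
  -0.945445, 0.20705, -0.251524, -0.355316, -1.20556, 0.34319, 0.};
```

With these inputs is_approx(row_a, row_b) is false: the entries that are around 1e-17 in one row are of order 0.1 to 1 in the other, which is far beyond rounding noise.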
nim65s commented 2 months ago

Taking another look at this, it seems we have a -O3 issue. -O3 is set by CMake for -DCMAKE_BUILD_TYPE=Release, and when I switch this to -O2, everything works fine again, both with GCC 13.2.0 and 13.3.0:

RUN cmake -B build -S .  -DCMAKE_BUILD_TYPE=Release -DBUILD_PYTHON_INTERFACE=OFF -DCMAKE_CXX_STANDARD=14
RUN sed -i 's/-O3/-O2/' build/CMakeCache.txt
RUN cmake --build build -j 4 -t test-cpp-contact-cholesky

ENV LD_LIBRARY_PATH=/usr/local/lib64
RUN ./build/unittest/test-cpp-contact-cholesky

-O3 is often discouraged for this kind of reason, ref. https://wiki.archlinux.org/title/CMake_package_guidelines#Notes_about_-O3

Digging further, looking at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-O3, I tried replacing the failing -O3 -DNDEBUG with -O2 -DNDEBUG -fgcse-after-reload -fipa-cp-clone -floop-interchange -floop-unroll-and-jam -fpeel-loops -fpredictive-commoning -fsplit-loops -fsplit-paths -ftree-loop-distribution -ftree-partial-pre -funswitch-loops -fvect-cost-model=dynamic -fversion-loops-for-strides, but that build does not fail… So I'm not sure where to go from here.

The good solution would obviously be to trash all our C/C++ code base and start again in a sane language.

But this might not be on the roadmap for now, so I guess I'll just go for -O2 in nix.

jorisv commented 3 weeks ago

Conda-forge is now using gcc 13.3 as default and we can now reproduce this issue.

Here are all the failing tests:

jorisv commented 3 weeks ago

As stated here, -On is not equivalent to enabling all the individual -f optimization flags. Some unnamed optimizations are activated only by the -On options.

nim65s commented 3 weeks ago

Ok, thanks. It looks once again like a GCC issue then. But to report that, we should probably work on an MRE. For that we can try to dump the matrices used in one of those failing tests.

jorisv commented 3 weeks ago

I'm working on it :)

jorisv commented 3 weeks ago

Note for tomorrow:

jorisv commented 3 weeks ago

I found a piece of code that behaves differently between O2 and O3 with g++ 13.3: https://github.com/stack-of-tasks/pinocchio/blob/master/include/pinocchio/algorithm/contact-info.hpp#L747-L750 (shown below). By changing this code, I'm able to make impulse-dynamics-derivatives, contact-dynamics-derivatives and contact-dynamics pass with G++ 13.3.0. contact-cholesky still fails.

          for (int k = 0; k < joint1.nv(); ++k, ++current1_col_id)
          {
            colwise_joint1_sparsity[current1_col_id] = true;
          }

With O3, this loop doesn't run for the first joint, so colwise_joint1_sparsity is not correctly initialized. If I replace this loop with a colwise_joint1_sparsity.segment(...).fill(true), the code works with both O2 and O3.

Also, if I remove the following else clause, the code also works with both O2 and O3 (in this case, the else clause is not needed).
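
A minimal sketch of the two forms, assuming a contiguous range of velocity columns per joint. It uses only the standard library: std::vector<char> and std::fill_n stand in for the Eigen boolean vector and segment(...).fill(true), and Joint, Mask, mark_with_loop and mark_with_fill are hypothetical names, not pinocchio API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a joint: first velocity column and column count.
struct Joint
{
  int idx_v;
  int nv;
};

// Stand-in for the Eigen boolean vector used in contact-info.hpp.
using Mask = std::vector<char>;

// Original form: one store per velocity column of the joint.
void mark_with_loop(Mask & sparsity, const Joint & joint)
{
  std::size_t col = static_cast<std::size_t>(joint.idx_v);
  for (int k = 0; k < joint.nv; ++k, ++col)
  {
    sparsity[col] = 1;
  }
}

// Workaround form: fill the contiguous range [idx_v, idx_v + nv) in one call.
void mark_with_fill(Mask & sparsity, const Joint & joint)
{
  std::fill_n(sparsity.begin() + joint.idx_v, joint.nv, char(1));
}
```

Both functions should leave the mask in the same state for any joint whose columns fit in the mask, which is why the fill form is a drop-in replacement rather than a behavior change.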

        else
        {
          const JointModel & joint2 = model.joints[current2_id];
          joint2_span_indexes.push_back((Eigen::DenseIndex)current2_id);
          Eigen::DenseIndex current2_col_id = joint2.idx_v();
          for (int k = 0; k < joint2.nv(); ++k, ++current2_col_id)
          {
            colwise_joint2_sparsity[current2_col_id] = true;
          }
          current2_id = model.parents[current2_id];
        }

@nim65s, @jcarpent do you see a particular undefined behavior that could explain why G++ doesn't run the last iteration of the for loop when the else clause is here?

I think I have three options:

I think I will try the last option first, but I'm afraid it will not be so easy: the optimization may be triggered by some earlier code (the contact-info.hpp code is all inline).

jorisv commented 3 weeks ago

As I feared, it will be difficult to get an MRE. The following doesn't reproduce the bug.

MRE.cpp

#include <cstddef>
#include <iostream>
#include <Eigen/Core>
#include <vector>

extern std::size_t JOINT1_ID;
extern std::size_t JOINT2_ID;
extern int NV;
extern int NJOINT;
extern int JOINT_IDX_V[];
extern int JOINT_NV[];
extern int JOINT_PARENTS[];
void printVector(const Eigen::VectorX<bool>& v);

int main()
{
  Eigen::VectorX<bool> colwise_joint1_sparsity(NV);
  Eigen::VectorX<bool> colwise_joint2_sparsity(NV);
  std::vector<int>    joint1_span_indexes;
  joint1_span_indexes.reserve(NJOINT);
  std::vector<int>    joint2_span_indexes;
  joint2_span_indexes.reserve(NJOINT);
  static const bool default_sparsity_value = false;
  colwise_joint1_sparsity.fill(default_sparsity_value);
  colwise_joint2_sparsity.fill(default_sparsity_value);

  std::size_t current1_id = 0;
  if (JOINT1_ID > 0)
    current1_id = JOINT1_ID;

  std::size_t current2_id = 0;
  if (JOINT2_ID > 0)
    current2_id = JOINT2_ID;

  while (current1_id != current2_id)
  {
    if (current1_id > current2_id)
      {
        int current1_col_id = JOINT_IDX_V[current1_id];
        joint1_span_indexes.push_back(current1_id);
        for (int k = 0; k < JOINT_NV[current1_id]; ++k, ++current1_col_id)
        {
          colwise_joint1_sparsity[current1_col_id] = true;
        }
        current1_id = JOINT_PARENTS[current1_id];
      }
    else
    {
      int current2_col_id = JOINT_IDX_V[current2_id];
      joint2_span_indexes.push_back(current2_id);
      for (int k = 0; k < JOINT_NV[current2_id]; ++k, ++current2_col_id)
      {
        colwise_joint2_sparsity[current2_col_id] = true;
      }
      current2_id = JOINT_PARENTS[current2_id];
    }
  }

  printVector(colwise_joint1_sparsity);
}

MRE_module.cpp

#include <cstddef>
#include <iostream>
#include <Eigen/Core>

std::size_t JOINT1_ID = 5;
std::size_t JOINT2_ID = 0;
int NV = 10;
int NJOINT = 5;

int JOINT_IDX_V[] = {0, 0, 6, 7, 8, 9};
int JOINT_NV[] = {0, 6, 1, 1, 1, 1};
int JOINT_PARENTS[] = {0, 0, 1, 2, 3, 4};

void printVector(const Eigen::VectorX<bool>& v)
{
  std::cout << v.transpose() << std::endl;
}

compile.sh

#! /bin/sh
#

FLAGS="-O3 -DNDEBUG -std=gnu++17"
g++ $FLAGS -c MRE_module.cpp -o MRE_module.o -isystem $CONDA_PREFIX/include/eigen3/
g++ $FLAGS -c MRE.cpp -o MRE.o -isystem $CONDA_PREFIX/include/eigen3/
g++ $FLAGS MRE.o MRE_module.o -o mre

nim65s commented 3 weeks ago

Current devel:

$ cmake --build build -t pinocchio-test-cpp-contact-cholesky && ./build/unittest/pinocchio-test-cpp-contact-cholesky
Running 10 test cases...
[…]
*** 45 failures are detected in the test module "Test"

caching nv with:

--- a/include/pinocchio/algorithm/contact-info.hpp
+++ b/include/pinocchio/algorithm/contact-info.hpp
@@ -742,9 +742,10 @@ namespace pinocchio
         if (current1_id > current2_id)
         {
           const JointModel & joint1 = model.joints[current1_id];
+          const int j1nv = joint1.nv();
           joint1_span_indexes.push_back((Eigen::DenseIndex)current1_id);
           Eigen::DenseIndex current1_col_id = joint1.idx_v();
-          for (int k = 0; k < joint1.nv(); ++k, ++current1_col_id)
+          for (int k = 0; k < j1nv; ++k, ++current1_col_id)
           {
             colwise_joint1_sparsity[current1_col_id] = true;
           }
@@ -753,9 +754,10 @@ namespace pinocchio
         else
         {
           const JointModel & joint2 = model.joints[current2_id];
+          const int j2nv = joint2.nv();
           joint2_span_indexes.push_back((Eigen::DenseIndex)current2_id);
           Eigen::DenseIndex current2_col_id = joint2.idx_v();
-          for (int k = 0; k < joint2.nv(); ++k, ++current2_col_id)
+          for (int k = 0; k < j2nv; ++k, ++current2_col_id)
           {
             colwise_joint2_sparsity[current2_col_id] = true;
           }
@@ -770,10 +772,11 @@ namespace pinocchio
         while (current_id > 0)
         {
           const JointModel & joint = model.joints[current_id];
+          const int jnv = joint.nv();
           joint1_span_indexes.push_back((Eigen::DenseIndex)current_id);
           joint2_span_indexes.push_back((Eigen::DenseIndex)current_id);
           Eigen::DenseIndex current_row_id = joint.idx_v();
-          for (int k = 0; k < joint.nv(); ++k, ++current_row_id)
+          for (int k = 0; k < jnv; ++k, ++current_row_id)
           {
             colwise_joint1_sparsity[current_row_id] = true;
             colwise_joint2_sparsity[current_row_id] = true;

I get:

$ cmake --build build -t pinocchio-test-cpp-contact-cholesky && ./build/unittest/pinocchio-test-cpp-contact-cholesky
Running 10 test cases...

*** No errors detected

There is something fishy in that .nv() 😅
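
A sketch of why this is surprising, assuming nv() is a side-effect-free accessor dispatched over a variant of joint types. The joint types and function names below are hypothetical, and std::variant is only a stand-in for whatever pinocchio does internally. Under that assumption, evaluating the bound once or on every iteration should mark exactly the same columns.

```cpp
#include <cassert>
#include <cstddef>
#include <variant>
#include <vector>

// Hypothetical joint types; names and sizes are for illustration only.
struct RevoluteJoint
{
  int first_col;
  int idx_v() const { return first_col; }
  int nv() const { return 1; }
};
struct FreeFlyerJoint
{
  int first_col;
  int idx_v() const { return first_col; }
  int nv() const { return 6; }
};
using JointVariant = std::variant<RevoluteJoint, FreeFlyerJoint>;

// Accessors dispatched through a visitor; neither has side effects.
int nv(const JointVariant & joint)
{
  return std::visit([](const auto & j) { return j.nv(); }, joint);
}
int idx_v(const JointVariant & joint)
{
  return std::visit([](const auto & j) { return j.idx_v(); }, joint);
}

// Bound re-evaluated in the loop condition, as before the patch.
std::vector<char> mark_uncached(const JointVariant & joint, std::size_t size)
{
  std::vector<char> sparsity(size, 0);
  std::size_t col = static_cast<std::size_t>(idx_v(joint));
  for (int k = 0; k < nv(joint); ++k, ++col)
    sparsity[col] = 1;
  return sparsity;
}

// Bound evaluated once before the loop, as in the patch.
std::vector<char> mark_cached(const JointVariant & joint, std::size_t size)
{
  std::vector<char> sparsity(size, 0);
  const int joint_nv = nv(joint);
  std::size_t col = static_cast<std::size_t>(idx_v(joint));
  for (int k = 0; k < joint_nv; ++k, ++col)
    sparsity[col] = 1;
  return sparsity;
}
```

If the two variants ever disagree for a pure accessor like this, the difference has to come from the compiler or from undefined behavior elsewhere, not from the loop itself.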

nim65s commented 3 weeks ago

This seems to fix all of these:

    30 - pinocchio-test-cpp-contact-models (Failed)
    36 - pinocchio-test-cpp-impulse-dynamics-derivatives (Failed)
    37 - pinocchio-test-cpp-contact-dynamics-derivatives (Failed)
    39 - pinocchio-test-cpp-impulse-dynamics (Failed)
    71 - pinocchio-test-cpp-contact-cholesky (Failed)

jorisv commented 2 weeks ago

The issue doesn't appear anymore with GCC14.1. So I will close this issue.

I have created a minimal example (but it depends on Pinocchio, so it's hard to share with the GCC team).

#include <iostream>

#include "pinocchio/multibody/sample-models.hpp"

#include <boost/test/unit_test.hpp>

BOOST_AUTO_TEST_SUITE(BOOST_TEST_MODULE)

using namespace Eigen;
using namespace pinocchio;

// Variant with the workaround: nv() is cached in a local variable before each loop.
template<typename Scalar>
struct ContactModel
{
  JointIndex joint1_id;
  JointIndex joint2_id;
  Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint1_sparsity;
  Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint2_sparsity;

  template<int OtherOptions, template<typename, int> class JointCollectionTpl>
  ContactModel(
    const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model,
    const JointIndex joint1_id,
    const JointIndex joint2_id)
  : joint1_id(joint1_id)
  , joint2_id(joint2_id)
  , colwise_joint1_sparsity(model.nv)
  , colwise_joint2_sparsity(model.nv)
  {
    init(model);
  }

  template<int OtherOptions, template<typename, int> class JointCollectionTpl>
  void init(const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model)
  {
    typedef ModelTpl<Scalar, OtherOptions, JointCollectionTpl> Model;
    typedef typename Model::JointModel JointModel;

    static const bool default_sparsity_value = false;
    colwise_joint1_sparsity.fill(default_sparsity_value);
    colwise_joint2_sparsity.fill(default_sparsity_value);
    JointIndex current1_id = 0;
    if (joint1_id > 0)
      current1_id = joint1_id;

    JointIndex current2_id = 0;
    if (joint2_id > 0)
      current2_id = joint2_id;

    while (current1_id != current2_id)
    {
      if (current1_id > current2_id)
      {
        const JointModel & joint1 = model.joints[current1_id];
        const int j1nv = joint1.nv();
        Eigen::DenseIndex current1_col_id = joint1.idx_v();
        for (int k = 0; k < j1nv; ++k, ++current1_col_id)
        {
          colwise_joint1_sparsity[current1_col_id] = true;
        }
        current1_id = model.parents[current1_id];
      }
      else
      {
        const JointModel & joint2 = model.joints[current2_id];
        const int j2nv = joint2.nv();
        Eigen::DenseIndex current2_col_id = joint2.idx_v();
        for (int k = 0; k < j2nv; ++k, ++current2_col_id)
        {
          colwise_joint2_sparsity[current2_col_id] = true;
        }
        current2_id = model.parents[current2_id];
      }
    }
  }
};

// Same as ContactModel, except that nv() is called directly in the loop conditions.
template<typename Scalar>
struct ContactModelNotWorking
{
  JointIndex joint1_id;
  JointIndex joint2_id;
  Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint1_sparsity;
  Eigen::Matrix<bool, Eigen::Dynamic, 1> colwise_joint2_sparsity;

  template<int OtherOptions, template<typename, int> class JointCollectionTpl>
  ContactModelNotWorking(
    const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model,
    const JointIndex joint1_id,
    const JointIndex joint2_id)
  : joint1_id(joint1_id)
  , joint2_id(joint2_id)
  , colwise_joint1_sparsity(model.nv)
  , colwise_joint2_sparsity(model.nv)
  {
    init(model);
  }

  template<int OtherOptions, template<typename, int> class JointCollectionTpl>
  void init(const ModelTpl<Scalar, OtherOptions, JointCollectionTpl> & model)
  {
    typedef ModelTpl<Scalar, OtherOptions, JointCollectionTpl> Model;
    typedef typename Model::JointModel JointModel;

    static const bool default_sparsity_value = false;
    colwise_joint1_sparsity.fill(default_sparsity_value);
    colwise_joint2_sparsity.fill(default_sparsity_value);
    JointIndex current1_id = 0;
    if (joint1_id > 0)
      current1_id = joint1_id;

    JointIndex current2_id = 0;
    if (joint2_id > 0)
      current2_id = joint2_id;

    while (current1_id != current2_id)
    {
      if (current1_id > current2_id)
      {
        const JointModel & joint1 = model.joints[current1_id];
        Eigen::DenseIndex current1_col_id = joint1.idx_v();
        for (int k = 0; k < joint1.nv(); ++k, ++current1_col_id)
        {
          colwise_joint1_sparsity[current1_col_id] = true;
        }
        current1_id = model.parents[current1_id];
      }
      else
      {
        const JointModel & joint2 = model.joints[current2_id];
        Eigen::DenseIndex current2_col_id = joint2.idx_v();
        for (int k = 0; k < joint2.nv(); ++k, ++current2_col_id)
        {
          colwise_joint2_sparsity[current2_col_id] = true;
        }
        current2_id = model.parents[current2_id];
      }
    }
  }
};

BOOST_AUTO_TEST_CASE(test_gcc13_3)
{
  using namespace Eigen;
  using namespace pinocchio;
  typedef JointCollectionDefaultTpl<double, pinocchio::context::Options> JC;
  using buildModels::details::addJointAndBody;
  Inertia Ijoint(.1, Inertia::Vector3::Zero(), Inertia::Matrix3::Identity() * .01);

  Model model;

  auto ffidx = model.addJoint(0, typename JC::JointModelFreeFlyer(), SE3::Identity(), "root_joint");
  model.lowerPositionLimit.template segment<4>(3).fill(-1.);
  model.upperPositionLimit.template segment<4>(3).fill(1.);
  model.appendBodyToJoint(ffidx, Ijoint);
  model.addJointFrame(ffidx);
  buildModels::details::addManipulator(model, ffidx);
  const std::string LF = "wrist2_joint";

  const Model::JointIndex LF_id = model.getJointId(LF);
  ContactModel ci_LF(model, LF_id, 0);
  std::cout << ci_LF.colwise_joint1_sparsity.transpose() << std::endl;
  ContactModelNotWorking ci_LF_not_work(model, LF_id, 0);
  std::cout << ci_LF_not_work.colwise_joint1_sparsity.transpose() << std::endl;
}

BOOST_AUTO_TEST_SUITE_END()