RTabMap + Humble + Torch= True? #1063

mattiasmar commented 1 year ago

Is there a known problem with compiling Rtabmap against ROS2 (Humble) together with Torch? I've created a Dockerfile with the latest dependencies (as far as I could tell from reading many issues and commits in this repo). Nevertheless, when I compile the rtabmap master branch with WITH_TORCH=ON, I get a lot of linker errors (undefined reference to...). The rtabmap build configuration is copied below. It looks fine to me, but maybe it isn't.

I would be very grateful for an Ubuntu 22.04 Dockerfile with ROS2 Humble installed that can successfully compile the Rtabmap master branch with Torch enabled.


FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04 as runtime

ARG DEBIAN_FRONTEND=noninteractive

# Uncomment the lines below to use a 3rd party repository
# to get the latest (unstable from mesa/main) mesa library version
RUN apt update && apt install -y software-properties-common
RUN add-apt-repository ppa:oibaf/graphics-drivers -y 

RUN apt update && apt install -y \
    vainfo \
    mesa-va-drivers


ENV LD_LIBRARY_PATH=/usr/lib/wsl/lib
CMD vainfo --display drm --device /dev/dri/card0

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    ca-certificates \
    cmake \
    git \
    librdmacm1 \
    libibverbs1 \
    ibverbs-providers \
    unzip \
    python3-pip

RUN  add-apt-repository universe &&  apt install curl -y && \
    curl -sSL -o /usr/share/keyrings/ros-archive-keyring.gpg  && \
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] $(. /etc/os-release && echo $UBUNTU_CODENAME) main" |  tee /etc/apt/sources.list.d/ros2.list > /dev/null
RUN    apt update && apt upgrade   -y --no-install-recommends
RUN    apt install  -y --no-install-recommends ros-humble-desktop 

RUN pip3 install torch torchvision torchaudio --index-url
RUN wget
RUN unzip -d /torch
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/torch/libtorch/lib

RUN apt-get install udev curl git libboost-dev libgeographic-dev libopencv-dev pybind11-dev python-is-python3 python3-opencv python3-pip python3-pykdl ros-dev-tools ros-humble-ament-cmake-nose ros-humble-geographic-msgs ros-humble-grid-map ros-humble-nav2-bringup ros-humble-vision-opencv software-properties-common unzip   wget ros-humble-rosbag2-storage-mcap -y

#installing realsense
RUN mkdir -p /home/source
WORKDIR /home/source
RUN git clone
WORKDIR /home/source/librealsense
RUN mkdir -p /etc/udev/rules.d
RUN ./scripts/
RUN mkdir -p /home/source/librealsense/build 
WORKDIR /home/source/librealsense/build 
RUN cmake ../ -DCMAKE_CXX_STANDARD=17 -DCMAKE_BUILD_TYPE=Release &&   make -j &&  make install
RUN pip install --user colorama pyrealsense2

# Install Nav2 and Eigen (CycloneDDS is installed further down)
RUN apt install  ros-humble-navigation2 ros-humble-nav2-bringup libeigen3-dev -y

# Build and install OpenGV without march=native option
RUN git clone && \
    cd opengv && \ 
    git checkout 91f4b19c73450833a40e463ad3648aae80b3a7f3 && \
    wget && \
    git apply opengv_disable_march_native.patch && \
    mkdir build && \
    cd build && \
    cmake -DCMAKE_CXX_STANDARD=17 -DCMAKE_BUILD_TYPE=Release .. && \
    make -j && \
    make install && \
    cd ../.. &&    rm opengv -rf

# Build latest gtsam
WORKDIR /gtsam
RUN git clone && \
 cd gtsam && git checkout adc438922017d7ca986e03d7d6db35cb0134817b && \
 mkdir build && \
 cd build && \
 MAKEFLAGS=-j cmake --build . --config Release --target install -j && \
 cd ../..

RUN git clone && \
    cd g2o && \
    git checkout 20230223_git && \
    mkdir build && \
    cd build && \
    make -j 8 && \
    make install 

#Installing additional utilities    
RUN apt-get install terminator htop nano vim -y
ENV RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
RUN  apt-get install ros-humble-rmw-cyclonedds-cpp ros-humble-rtabmap-viz -y

WORKDIR /workspace
# OpenCV with xfeatures2d and nonfree modules
RUN git clone && \
    git clone && \
    cd opencv_contrib && \
    git checkout tags/4.7.0 && \
    cd /workspace && \
    cd opencv && \
    git checkout tags/4.7.0 && \
    mkdir build && \
    cd build && \
    make -j$(nproc) && \
    make install 
    #&& \
    #cd /workspace && \
    #rm -rf opencv opencv_contrib

# Run the container, mount the rtabmap code to /root/ws, and call colcon from it:
#RUN colcon build  --symlink-install --cmake-args -DRTABMAP_SYNC_MULTI_RGBD=ON -DWITH_OPENGV=ON -DCMAKE_BUILD_TYPE=Debug  -DCMAKE_EXPORT_COMPILE_COMMANDS=1   -DCMAKE_CXX_STANDARD=17 -D WITH_TORCH=ON        -D WITH_PYTHON=ON -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ -DCMAKE_CXX_FLAGS="-I/usr/include/python3.10/" 

Build command

Start the container with the rtabmap repo mounted at /root/ws/src/rtabmap, then call: source /opt/ros/humble/setup.bash

From the folder /root/ws/src/rtabmap/build call:

   CXXFLAGS="-I/usr/include/python3.10/" \
   cmake -D WITH_TORCH=ON \
          -D WITH_PYTHON=ON \
          -D WITH_OPENGV=ON \
          -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/  \
        .. && make -j

Or, from the folder /root/ws/src/ (which, besides rtabmap, could also hold rtabmap_ros), call:

colcon build  --symlink-install --cmake-args -DRTABMAP_SYNC_MULTI_RGBD=ON -DWITH_OPENGV=ON -DCMAKE_BUILD_TYPE=Debug  -DCMAKE_EXPORT_COMPILE_COMMANDS=1   -DCMAKE_CXX_STANDARD=17 -D WITH_TORCH=ON        -D WITH_PYTHON=ON -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ -DCMAKE_CXX_FLAGS="-I/usr/include/python3.10/" 
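The hard-coded Torch_DIR in these commands depends on where pip placed the package. As a minimal sketch (not part of the original setup), the path could instead be derived from `torch.utils.cmake_prefix_path`, which PyTorch exposes for CMake consumers. The helper names `torch_dir_from_prefix` and `colcon_cmake_args` are hypothetical, and the flag list only mirrors a subset of the command above.

```python
def torch_dir_from_prefix(cmake_prefix: str) -> str:
    """Return the directory that holds TorchConfig.cmake for a given cmake prefix."""
    # CMake accepts forward slashes on every platform, so join with "/" explicitly.
    return cmake_prefix.rstrip("/") + "/Torch"


def colcon_cmake_args(torch_dir: str) -> list:
    """Build a --cmake-args list (a subset of the flags used in this thread)."""
    return [
        "-DWITH_TORCH=ON",
        "-DWITH_PYTHON=ON",
        "-DWITH_OPENGV=ON",
        "-DTorch_DIR=" + torch_dir,
    ]


if __name__ == "__main__":
    try:
        import torch  # only needed to look up the installed location
    except ImportError:
        print("torch is not installed; pass the cmake prefix manually")
    else:
        torch_dir = torch_dir_from_prefix(torch.utils.cmake_prefix_path)
        args = " ".join(colcon_cmake_args(torch_dir))
        print("colcon build --symlink-install --cmake-args " + args)
```

Running it inside the container prints a colcon command line that points at whichever torch installation Python actually imports, which helps when a pip wheel and a source build coexist.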

Output: Rtabmap build configuration:

-- Info :
--   RTAB-Map Version =     0.21.1
--   CMAKE_VERSION =        3.25.0
--   CMAKE_INSTALL_PREFIX = /usr/local
--   CMAKE_BUILD_TYPE =     Release
--   BUILD_APP =            ON
--   BUILD_TOOLS =          ON
--   BUILD_EXAMPLES =       ON
--   CMAKE_CXX_FLAGS = -I/usr/include/python3.10/ -fmessage-length=0 -fopenmp -std=c++17
--   PCL_VERSION = 1.12.1
-- Optional dependencies ('*' affects some default parameters) :
--  *With OpenCV 4.7.0 xfeatures2d = YES, nonfree = YES (License: Non commercial)
--   With Qt 5.15.3            = YES (License: Open Source or Commercial)
--   With VTK 9.1              = YES (License: BSD)
--   With external SQLite3     = YES (License: Public Domain)
--   With ORB OcTree           = YES (License: GPLv3)
--   With SupertPoint          = YES (License: GPLv3) libtorch=2.0.1
--   With Python3.10            = YES (License: PSF)
--   With Madgwick             = YES (License: GPL)
--   With FastCV               = NO (FastCV not found)
--   With PDAL                 = NO (PDAL not found)
--  Solvers:
--   With TORO                 = YES (License: Creative Commons [Attribution-NonCommercial-ShareAlike])
--  *With g2o 1.0.0            = YES (License: BSD)
--  *With GTSAM 4.3.0          = YES (License: BSD)
--  *With Ceres                = NO (WITH_CERES=OFF)
--   With MRPT                 = NO (MRPT not found)
--   With VERTIGO              = YES (License: GPLv3)
--   With cvsba                = NO (WITH_CVSBA=OFF)
--  *With libpointmatcher 1.3.1 = YES (License: BSD)
--   With CCCoreLib            = NO (WITH_CCCORELIB=OFF)
--   With Open3D               = NO (WITH_OPEN3D=OFF)
--   With OpenGV 1.0           = YES (License: BSD)
--  Reconstruction Approaches:
--   With OCTOMAP 1.9.8        = YES (License: BSD)
--   With CPUTSDF              = NO (WITH_CPUTSDF=OFF)
--   With OpenChisel           = NO (WITH_OPENCHISEL=OFF)
--   With AliceVision          = NO (WITH_ALICE_VISION=OFF)
--  Camera Drivers:
--   With Freenect             = YES (License: Apache v2 and/or GPLv2)
--   With OpenNI2              = YES (License: Apache v2)
--   With Freenect2            = NO (libfreenect2 not found)
--   With Kinect for Windows 2 = NO (Kinect for Windows 2 SDK not found)
--   With Kinect for Azure     = NO (Kinect for Azure SDK not found)
--   With dc1394               = YES (License: LGPL)
--   With FlyCapture2/Triclops = NO (Point Grey SDK not found)
--   With ZED                  = NO (ZED sdk and/or cuda not found)
--   With ZEDOC                = NO (ZED Open Capture not found)
--   With RealSense            = NO (librealsense not found)
--   With RealSense2 2.51.1    = YES (License: Apache-2)
--   With MyntEyeS             = NO (mynteye s sdk not found)
--   With DepthAI              = NO (WITH_DEPTHAI=OFF)
--  Odometry Approaches:
--   With loam_velodyne        = NO (WITH_LOAM=OFF)
--   With floam                = NO (WITH_FLOAM=OFF)
--   With libfovis             = NO (WITH_FOVIS=OFF)
--   With libviso2             = NO (WITH_VISO2=OFF)
--   With dvo_core             = NO (WITH_DVO=OFF)
--   With okvis                = NO (WITH_OKVIS=OFF)
--   With msckf_vio            = NO (WITH_MSCKF_VIO=OFF)
--   With VINS-Fusion          = NO (WITH_VINS=OFF)
--   With OpenVINS             = NO (WITH_OPENVINS=OFF)
--   With ORB_SLAM             = NO (WITH_ORB_SLAM=OFF)
-- Show all options with: cmake -LA | grep WITH_
-- --------------------------------------------

Build error messages

[ 43%] Linking CXX shared library ../../bin/
[ 79%] Built target rtabmap_gui
[ 79%] Building CXX object app/src/CMakeFiles/rtabmap_app.dir/main.cpp.o
[ 80%] Linking CXX executable ../../bin/rtabmap
/usr/bin/ld: CMakeFiles/rtabmap_app.dir/main.cpp.o: in function `main':
main.cpp:(.text.startup+0xd0): undefined reference to `rtabmap::Parameters::parseArguments[abi:cxx11](int, char**, bool)'
/usr/bin/ld: ../../bin/ undefined reference to `uStr2Float(std::string const&)'
/usr/bin/ld: ../../bin/ undefined reference to `clams::DiscreteDepthDistortionModel::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ../../bin/ undefined reference to `rtabmap::CameraK4A::CameraK4A(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, float, rtabmap::Transform const&)'


raits<char>, std::allocator<char> > const&, cv::Size_<int> const&, cv::Mat const&, cv::Mat const&, cv::Mat const&, cv::Mat const&, rtabmap::Transform const&)'
/usr/bin/ld: ../../bin/ undefined reference to `PointMatcher<float>::Matcher::Matcher(std::string const&, std::vector<PointMatcherSupport::Parametrizable::ParameterDoc, std::allocator<PointMatcherSupport::Parametrizable::ParameterDoc> >, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&)'
/usr/bin/ld: ../../bin/ undefined reference to `pcl::search::Search<pcl::PointXYZINormal>::getName() const'
/usr/bin/ld: ../../bin/ undefined reference to `PointMatcher<float>::DataPoints::getFeatureViewByName(std::string const&)'
/usr/bin/ld: ../../bin/ undefined reference to `UDirectory::homeDir()'
/usr/bin/ld: ../../bin/ undefined reference to `PointMatcherSupport::InvalidElement::InvalidElement(std::string const&)'
/usr/bin/ld: ../../bin/ undefined reference to `uBool2Str(bool)'
collect2: error: ld returned 1 exit status
make[2]: *** [app/src/CMakeFiles/rtabmap_app.dir/build.make:192: bin/rtabmap] Error 1
make[1]: *** [CMakeFiles/Makefile2:881: app/src/CMakeFiles/rtabmap_app.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

matlabbe commented 1 year ago

I got the same undefined reference errors yesterday on almost all of rtabmap's core functions. I'm not sure why; maybe pytorch is packaged with a different version of the standard libraries. As a workaround, I built pytorch from source. The build passes, but I got this new issue when running the code. I'll try later with an older pytorch version.

Otherwise, in your case, since you are using a Dockerfile, I'd suggest starting from a pytorch base image (using it directly in FROM, as in this example). As explained in this other post, you can then install ROS in it.

mattiasmar commented 1 year ago

How do I install ROS2 Humble in the frontiers2022 image? That Dockerfile is based on Ubuntu 20.04, and as far as I can tell ROS2 Humble requires Ubuntu 22.04. Please correct me if I'm wrong.

matlabbe commented 1 year ago

Oh yeah, I read the name of the FROM image too quickly; based on my post, it would indeed be 20.04. You can install ROS2 Foxy on 20.04. If you need Humble + 22.04, you may check whether nvidia already has a pytorch image; otherwise you would need to do what you did. To get around the undefined reference errors, you may have to rebuild pytorch from source though. This is how I installed pytorch on my computer:

git clone
git clone
cd pytorch
python3 install
cd ..
cd vision
python3 install

If you want a fixed version, you may check this table:

mattiasmar commented 1 year ago

I'm building now with pytorch 1.13.1. Will report on results once ready. Building from source like you wrote compiles but fails during runtime, as in #1064, correct?

matlabbe commented 1 year ago

Yes, with the latest pytorch version at least. These lines can be useful to regenerate the SuperPoint model for your pytorch version:

Let me know if you don't get a seg fault with 1.13.1.

mattiasmar commented 1 year ago

I tried with pytorch 1.13.1 and 1.12.0. Both gave me build (linkage) errors.

#PyTorch 1.13.1
RUN pip3 install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url
RUN wget
RUN unzip -d /torch
#PyTorch 1.12.0
RUN pip3 install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url
RUN wget
RUN unzip -d /torch

mattiasmar commented 1 year ago

Testing humble with pytorch built from source + Testing foxy on nvidia's pytorch/ubuntu20 image. Results ready by tomorrow.

What will I lose when moving from humble to foxy? Anything special you would want to make an Rtabmap user aware of?

matlabbe commented 1 year ago

From rtabmap's point of view, it is the same code on foxy and humble. Note also that I fixed the seg fault I had, so I can confirm it also works with the latest pytorch (2.1).

mattiasmar commented 1 year ago

Good news: a Docker image based on nvidia/cuda:11.8.0-devel-ubuntu22.04, with pytorch compiled from source, let me build rtabmap. I have yet to test it and then continue towards rtabmap_ros.

3 questions: When you said pytorch 2.1 you really meant 2.0.1, right? Does your existing frontiers image build the master branch for you? Does that colcon build of rtabmap_ros work for you in that image too?

mattiasmar commented 1 year ago

Things indeed look much better now. In terms of compilation, the only remaining issue for me is in rtabmap/corelib/src/CMakeLists.txt: under the IF(WITH_PYTHON AND Python3_FOUND) clause (line 209), unless I use LIBRARIES instead of PUBLIC_LIBRARIES (the current state in master), I get this build error:

Starting >>> rtabmap_conversions
--- stderr: rtabmap_conversions
** WARNING ** io features related to pcap will be disabled
** WARNING ** io features related to png will be disabled
CMake Error at CMakeLists.txt:53 (add_library):
  Target "rtabmap_conversions" links to target "Python3::Python" but the
  target was not found.  Perhaps a find_package() call is missing for an
  IMPORTED target, or an ALIAS target is missing?

CMake Error at CMakeLists.txt:53 (add_library):
  Target "rtabmap_conversions" links to target "Python3::NumPy" but the
  target was not found.  Perhaps a find_package() call is missing for an
  IMPORTED target, or an ALIAS target is missing?

I'm building using this command: colcon build --symlink-install --cmake-args -DRTABMAP_SYNC_MULTI_RGBD=ON -DWITH_OPENGV=ON -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=1 -DCMAKE_CXX_STANDARD=17 -D WITH_TORCH=ON -D WITH_PYTHON=ON -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ -DCMAKE_CXX_FLAGS="-I/usr/include/python3.10/"

This line of CMakeLists.txt was changed by this commit

matlabbe commented 1 year ago

I fixed the missing Python dependency in the commit above.

For the frontiers docker image, there is a COPY . rtabmap step, so it builds whatever you cloned: if you cloned the master version, it will be master.

For pytorch, the latest master seems to be 2.1.0:

~/.local/lib/python3.8/site-packages/torch/share/cmake/Torch$ cat TorchConfigVersion.cmake 
set(PACKAGE_VERSION "2.1.0")
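The same check can be scripted when comparing several environments. This is a minimal sketch, assuming the file keeps the `set(PACKAGE_VERSION "...")` form shown in the output above; the function name `parse_torch_cmake_version` is hypothetical.

```python
import re
from typing import Optional


def parse_torch_cmake_version(text: str) -> Optional[str]:
    """Extract the version string from the contents of TorchConfigVersion.cmake."""
    match = re.search(r'set\(\s*PACKAGE_VERSION\s+"([^"]+)"\s*\)', text)
    return match.group(1) if match else None


# Sample taken from the output quoted in this comment
sample = 'set(PACKAGE_VERSION "2.1.0")\n'
print(parse_torch_cmake_version(sample))  # prints 2.1.0
```

Because it reads the CMake config rather than the Python package metadata, it reports the version that a find_package(Torch) call would actually see.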

I didn't try colcon on that image, only noetic catkin. I don't see why colcon would not work. Make sure to source setup.bash of ros before doing colcon.
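A quick way to confirm that the ROS environment was sourced before calling colcon is to look for the ROS_DISTRO variable, which the ROS setup files export. The sketch below is only illustrative; the function name `sourced_ros_distro` is hypothetical.

```python
import os
from typing import Mapping, Optional


def sourced_ros_distro(env: Mapping[str, str]) -> Optional[str]:
    """Return the ROS distro name if a ROS setup file was sourced, else None."""
    return env.get("ROS_DISTRO") or None


if __name__ == "__main__":
    distro = sourced_ros_distro(os.environ)
    if distro is None:
        print("ROS environment not sourced; run 'source /opt/ros/<distro>/setup.bash' first")
    else:
        print("ROS environment sourced for:", distro)
```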

mattiasmar commented 1 year ago

Thanks for the help. Rtabmap now builds and runs with torch. Still don't understand the importance of recompiling pytorch, but it does the trick.

matlabbe commented 1 year ago

I cannot find the post, but someone said recently that it was because the pytorch binaries were built with a different version of the C++ standard library.
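This explanation is consistent with the linker output earlier in the thread. With libstdc++, symbols built under the newer ABI demangle with `std::__cxx11::basic_string` (or carry an `[abi:cxx11]` tag), while code built with `_GLIBCXX_USE_CXX11_ABI=0` demangles with plain `std::string`, and both spellings appear among the undefined references. As a rough sketch, the two groups can be separated as follows; the helper name `classify_abi` is hypothetical and the heuristic only looks at string types. On the Python side, `torch.compiled_with_cxx11_abi()` reports which ABI an installed PyTorch was built with.

```python
def classify_abi(line: str) -> str:
    """Classify one demangled linker-error line by the std::string ABI it mentions."""
    if "std::__cxx11::" in line or "[abi:cxx11]" in line:
        return "cxx11"        # symbol uses the newer libstdc++ string ABI
    if "std::string" in line:
        return "pre-cxx11"    # symbol uses the old ABI (_GLIBCXX_USE_CXX11_ABI=0)
    return "unknown"          # no string type in the signature


# Three lines copied from the build log in this thread
errors = [
    "undefined reference to `uStr2Float(std::string const&)'",
    "undefined reference to `rtabmap::Parameters::parseArguments[abi:cxx11](int, char**, bool)'",
    "undefined reference to `UDirectory::homeDir()'",
]
for line in errors:
    print(classify_abi(line), "<-", line)
```

A log that mixes both groups suggests that some objects were compiled with one ABI setting and some libraries with the other, which is the situation that rebuilding pytorch with the system toolchain avoids.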

EDIT: found the post: