introlab / rtabmap

RTAB-Map library and standalone application
https://introlab.github.io/rtabmap

RTabMap + Humble + Torch= True? #1063

Closed mattiasmar closed 1 year ago

mattiasmar commented 1 year ago

Is there a known problem with compiling RTAB-Map against ROS 2 (Humble) together with Torch? I've created a Dockerfile with the latest dependencies (as far as I could tell from reading many issues and commits in this repo). Nevertheless, when I compile the rtabmap master branch with WITH_TORCH=ON I get a lot of linker errors (undefined reference to...). The rtabmap build configuration is copied below. It looks fine to me, but maybe it isn't.

I would be very grateful for an Ubuntu 22.04 Dockerfile with ROS 2 Humble installed that can successfully compile the rtabmap master branch with Torch enabled.

Dockerfile

FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04 as runtime

ARG DEBIAN_FRONTEND=noninteractive

# Use a third-party PPA to get the latest (unstable, from mesa/main)
# Mesa library version
RUN apt update && apt install -y software-properties-common
RUN add-apt-repository ppa:oibaf/graphics-drivers -y

RUN apt update && apt install -y \
    vainfo \
    mesa-va-drivers \
    mesa-utils

ENV LIBVA_DRIVER_NAME=d3d12
ENV NVIDIA_DRIVER_CAPABILITIES=all
ENV MESA_D3D12_DEFAULT_ADAPTER_NAME=NVIDIA

ENV LD_LIBRARY_PATH=/usr/lib/wsl/lib
CMD vainfo --display drm --device /dev/dri/card0

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    ca-certificates \
    cmake \
    git \
    librdmacm1 \
    libibverbs1 \
    ibverbs-providers \
    unzip \
    python3-pip \
    wget

RUN add-apt-repository universe && apt install curl -y && \
    curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg  && \
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu $(. /etc/os-release && echo $UBUNTU_CODENAME) main" |  tee /etc/apt/sources.list.d/ros2.list > /dev/null
RUN apt update && apt upgrade -y --no-install-recommends
RUN apt install -y --no-install-recommends ros-humble-desktop

RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
RUN wget https://download.pytorch.org/libtorch/cu118/libtorch-shared-with-deps-2.0.1%2Bcu118.zip
RUN unzip libtorch-shared-with-deps-2.0.1+cu118.zip -d /torch
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/torch/libtorch/lib

RUN apt-get install udev curl git libboost-dev libgeographic-dev libopencv-dev pybind11-dev python-is-python3 python3-opencv python3-pip python3-pykdl ros-dev-tools ros-humble-ament-cmake-nose ros-humble-geographic-msgs ros-humble-grid-map ros-humble-nav2-bringup ros-humble-vision-opencv software-properties-common unzip   wget ros-humble-rosbag2-storage-mcap -y

#installing realsense
RUN mkdir -p /home/source
WORKDIR /home/source
RUN git clone https://github.com/IntelRealSense/librealsense.git
WORKDIR /home/source/librealsense
RUN mkdir -p /etc/udev/rules.d
RUN ./scripts/setup_udev_rules.sh
RUN mkdir -p /home/source/librealsense/build 
WORKDIR /home/source/librealsense/build 
RUN cmake ../ -DCMAKE_CXX_STANDARD=17 -DCMAKE_BUILD_TYPE=Release &&   make -j &&  make install
RUN pip install --user colorama pyrealsense2

#install Nav2 and Eigen (the CycloneDDS RMW is installed further down)
RUN apt install  ros-humble-navigation2 ros-humble-nav2-bringup libeigen3-dev -y

# Build and install OpenGV without march=native option
RUN git clone https://github.com/laurentkneip/opengv.git && \
    cd opengv && \
    git checkout 91f4b19c73450833a40e463ad3648aae80b3a7f3 && \
    wget https://gist.githubusercontent.com/matlabbe/a412cf7c4627253874f81a00745a7fbb/raw/accc3acf465d1ffd0304a46b17741f62d4d354ef/opengv_disable_march_native.patch && \
    git apply opengv_disable_march_native.patch && \
    mkdir build && \
    cd build && \
    cmake -DCMAKE_CXX_STANDARD=17 -DCMAKE_BUILD_TYPE=Release .. && \
    make -j && \
    make install && \
    cd ../.. && rm -rf opengv

# Build latest gtsam
WORKDIR /gtsam
RUN git clone https://github.com/borglab/gtsam.git && \
 cd gtsam && git checkout adc438922017d7ca986e03d7d6db35cb0134817b && \
 mkdir build && \
 cd build && \
 cmake -DCMAKE_CXX_STANDARD=17 -DGTSAM_BUILD_EXAMPLES_ALWAYS=OFF -DGTSAM_BUILD_TESTS=OFF -DGTSAM_BUILD_STATIC_LIBRARY=OFF -DGTSAM_BUILD_UNSTABLE=OFF -DGTSAM_INSTALL_CPPUNILITE=OFF -DGTSAM_USE_SYSTEM_EIGEN=ON .. && \
 MAKEFLAGS=-j cmake --build . --config Release --target install -j && \
 cd ../..

WORKDIR /g2o
RUN git clone https://github.com/RainerKuemmerle/g2o.git && \
    cd g2o && \
    git checkout 20230223_git && \
    mkdir build && \
    cd build && \
    cmake -DCMAKE_CXX_STANDARD=17 -DBUILD_LGPL_SHARED_LIBS=ON -DG2O_BUILD_APPS=OFF -DBUILD_WITH_MARCH_NATIVE=OFF -DG2O_BUILD_EXAMPLES=OFF -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release .. && \
    make -j 8 && \
    make install 

#Installing additional utilities    
RUN apt-get install terminator htop nano vim -y
ENV RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
RUN  apt-get install ros-humble-rmw-cyclonedds-cpp ros-humble-rtabmap-viz -y

WORKDIR /workspace
# OpenCV with xfeatures2d and nonfree modules
RUN git clone https://github.com/opencv/opencv_contrib.git && \
    git clone https://github.com/opencv/opencv.git && \
    cd opencv_contrib && \
    git checkout tags/4.7.0 && \
    cd /workspace && \
    cd opencv && \
    git checkout tags/4.7.0 && \
    mkdir build && \
    cd build && \
    cmake -DCMAKE_CXX_STANDARD=17 -DOPENCV_EXTRA_MODULES_PATH=/workspace/opencv_contrib/modules -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DBUILD_TESTS=OFF -DBUILD_PERF_TESTS=OFF -DOPENCV_ENABLE_NONFREE=ON .. && \
    make -j$(nproc) && \
    make install 
    #&& \
    #cd /workspace && \
    #rm -rf opencv opencv_contrib

#Run the container, mount the rtabmap code to /root/ws, and call colcon from it:
#RUN colcon build  --symlink-install --cmake-args -DRTABMAP_SYNC_MULTI_RGBD=ON -DWITH_OPENGV=ON -DCMAKE_BUILD_TYPE=Debug  -DCMAKE_EXPORT_COMPILE_COMMANDS=1   -DCMAKE_CXX_STANDARD=17 -D WITH_TORCH=ON        -D WITH_PYTHON=ON -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ -DCMAKE_CXX_FLAGS="-I/usr/include/python3.10/" 

Build command

Start the container with the rtabmap repo mapped to /root/ws/src/rtabmap, then call: source /opt/ros/humble/setup.bash

From the folder /root/ws/src/rtabmap/build call:

CXXFLAGS="-I/usr/include/python3.10/" \
    cmake -D CMAKE_CXX_STANDARD=17 \
          -D WITH_TORCH=ON \
          -D WITH_PYTHON=ON \
          -D WITH_OPENGV=ON \
          -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ \
          .. && make -j

or, from the folder /root/ws/src/, which besides rtabmap could also hold rtabmap_ros, call:

colcon build --symlink-install --cmake-args -DRTABMAP_SYNC_MULTI_RGBD=ON -DWITH_OPENGV=ON -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=1 -DCMAKE_CXX_STANDARD=17 -D WITH_TORCH=ON -D WITH_PYTHON=ON -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ -DCMAKE_CXX_FLAGS="-I/usr/include/python3.10/"
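
Both commands above hard-code Torch_DIR, which depends on the Python version and on where pip installed torch. As a minimal sketch, assuming torch is importable from the Python on PATH, the path can instead be queried from the installed package through torch.utils.cmake_prefix_path. The helper name torch_cmake_dir is hypothetical and not part of rtabmap or PyTorch.

```shell
# Hypothetical helper: print the directory to pass as -D Torch_DIR=...
# $1 is the Python command to query (defaults to python3).
torch_cmake_dir() {
    py="${1:-python3}"
    # torch.utils.cmake_prefix_path points at .../torch/share/cmake
    prefix="$("$py" -c 'import torch; print(torch.utils.cmake_prefix_path)')" || return 1
    # Treat an empty answer as a failure rather than emitting "/Torch"
    [ -n "$prefix" ] || return 1
    printf '%s/Torch\n' "$prefix"
}
```

It could then be used as `cmake -D WITH_TORCH=ON -D Torch_DIR="$(torch_cmake_dir)" ..`. The function returns a non-zero status when torch cannot be imported, so a missing install shows up before CMake runs.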

Output: Rtabmap build configuration:

-- Info :
--   RTAB-Map Version =     0.21.1
--   CMAKE_VERSION =        3.25.0
--   CMAKE_INSTALL_PREFIX = /usr/local
--   CMAKE_BUILD_TYPE =     Release
--   CMAKE_INSTALL_LIBDIR = lib
--   BUILD_APP =            ON
--   BUILD_TOOLS =          ON
--   BUILD_EXAMPLES =       ON
--   BUILD_SHARED_LIBS =    ON
--   CMAKE_CXX_FLAGS = -I/usr/include/python3.10/ -fmessage-length=0 -fopenmp -std=c++17
--   FLANN_KDTREE_MEM_OPT = OFF
--   PCL_DEFINITIONS = -DDISABLE_OPENNI2;-DDISABLE_PCAP;-DDISABLE_PNG
--   PCL_VERSION = 1.12.1
--
-- Optional dependencies ('*' affects some default parameters) :
--  *With OpenCV 4.7.0 xfeatures2d = YES, nonfree = YES (License: Non commercial)
--   With Qt 5.15.3            = YES (License: Open Source or Commercial)
--   With VTK 9.1              = YES (License: BSD)
--   With external SQLite3     = YES (License: Public Domain)
--   With ORB OcTree           = YES (License: GPLv3)
--   With SupertPoint          = YES (License: GPLv3) libtorch=2.0.1
--   With Python3.10            = YES (License: PSF)
--   With Madgwick             = YES (License: GPL)
--   With FastCV               = NO (FastCV not found)
--   With PDAL                 = NO (PDAL not found)
--
--  Solvers:
--   With TORO                 = YES (License: Creative Commons [Attribution-NonCommercial-ShareAlike])
--  *With g2o 1.0.0            = YES (License: BSD)
--  *With GTSAM 4.3.0          = YES (License: BSD)
--  *With Ceres                = NO (WITH_CERES=OFF)
--   With MRPT                 = NO (MRPT not found)
--   With VERTIGO              = YES (License: GPLv3)
--   With cvsba                = NO (WITH_CVSBA=OFF)
--  *With libpointmatcher 1.3.1 = YES (License: BSD)
--   With CCCoreLib            = NO (WITH_CCCORELIB=OFF)
--   With Open3D               = NO (WITH_OPEN3D=OFF)
--   With OpenGV 1.0           = YES (License: BSD)
--
--  Reconstruction Approaches:
--   With OCTOMAP 1.9.8        = YES (License: BSD)
--   With CPUTSDF              = NO (WITH_CPUTSDF=OFF)
--   With OpenChisel           = NO (WITH_OPENCHISEL=OFF)
--   With AliceVision          = NO (WITH_ALICE_VISION=OFF)
--
--  Camera Drivers:
--   With Freenect             = YES (License: Apache v2 and/or GPLv2)
--   With OpenNI2              = YES (License: Apache v2)
--   With Freenect2            = NO (libfreenect2 not found)
--   With Kinect for Windows 2 = NO (Kinect for Windows 2 SDK not found)
--   With Kinect for Azure     = NO (Kinect for Azure SDK not found)
--   With dc1394               = YES (License: LGPL)
--   With FlyCapture2/Triclops = NO (Point Grey SDK not found)
--   With ZED                  = NO (ZED sdk and/or cuda not found)
--   With ZEDOC                = NO (ZED Open Capture not found)
--   With RealSense            = NO (librealsense not found)
--   With RealSense2 2.51.1    = YES (License: Apache-2)
--   With MyntEyeS             = NO (mynteye s sdk not found)
--   With DepthAI              = NO (WITH_DEPTHAI=OFF)
--
--  Odometry Approaches:
--   With loam_velodyne        = NO (WITH_LOAM=OFF)
--   With floam                = NO (WITH_FLOAM=OFF)
--   With libfovis             = NO (WITH_FOVIS=OFF)
--   With libviso2             = NO (WITH_VISO2=OFF)
--   With dvo_core             = NO (WITH_DVO=OFF)
--   With okvis                = NO (WITH_OKVIS=OFF)
--   With msckf_vio            = NO (WITH_MSCKF_VIO=OFF)
--   With VINS-Fusion          = NO (WITH_VINS=OFF)
--   With OpenVINS             = NO (WITH_OPENVINS=OFF)
--   With ORB_SLAM             = NO (WITH_ORB_SLAM=OFF)
-- Show all options with: cmake -LA | grep WITH_
-- --------------------------------------------

Build error messages

[ 43%] Linking CXX shared library ../../bin/librtabmap_gui.so
[ 79%] Built target rtabmap_gui
[ 79%] Building CXX object app/src/CMakeFiles/rtabmap_app.dir/main.cpp.o
[ 80%] Linking CXX executable ../../bin/rtabmap
/usr/bin/ld: CMakeFiles/rtabmap_app.dir/main.cpp.o: in function `main':
main.cpp:(.text.startup+0xd0): undefined reference to `rtabmap::Parameters::parseArguments[abi:cxx11](int, char**, bool)'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `uStr2Float(std::string const&)'
/usr/bin/ld: ../../bin/librtabmap_gui.so.0.21.1: undefined reference to `clams::DiscreteDepthDistortionModel::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
/usr/bin/ld: ../../bin/librtabmap_gui.so.0.21.1: undefined reference to `rtabmap::CameraK4A::CameraK4A(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, float, rtabmap::Transform const&)'

...

raits<char>, std::allocator<char> > const&, cv::Size_<int> const&, cv::Mat const&, cv::Mat const&, cv::Mat const&, cv::Mat const&, rtabmap::Transform const&)'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `PointMatcher<float>::Matcher::Matcher(std::string const&, std::vector<PointMatcherSupport::Parametrizable::ParameterDoc, std::allocator<PointMatcherSupport::Parametrizable::ParameterDoc> >, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&)'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `pcl::search::Search<pcl::PointXYZINormal>::getName() const'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `PointMatcher<float>::DataPoints::getFeatureViewByName(std::string const&)'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `UDirectory::homeDir()'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `PointMatcherSupport::InvalidElement::InvalidElement(std::string const&)'
/usr/bin/ld: ../../bin/librtabmap_core.so.0.21.1: undefined reference to `uBool2Str(bool)'
collect2: error: ld returned 1 exit status
make[2]: *** [app/src/CMakeFiles/rtabmap_app.dir/build.make:192: bin/rtabmap] Error 1
make[1]: *** [CMakeFiles/Makefile2:881: app/src/CMakeFiles/rtabmap_app.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

matlabbe commented 1 year ago

I got the same undefined reference errors yesterday on almost all of rtabmap's core functions. I'm not sure why; maybe pytorch is packaged with a different version of the standard libraries. As a workaround, I built pytorch from source. The build passes, but I hit this new issue https://github.com/introlab/rtabmap/issues/1064 when running the code. I'll try later with an older pytorch version.

Otherwise, since you are using a Dockerfile, I'd suggest starting directly from a pytorch base image (FROM nvcr.io/nvidia/pytorch:22.08-py3), as in this example: https://github.com/introlab/rtabmap/blob/master/docker/frontiers2022/Dockerfile. As explained in this other post, you can then install ros in it.

mattiasmar commented 1 year ago

How do I install ROS 2 Humble in the frontiers2022 image? That Dockerfile is based on Ubuntu 20.04, and as far as I can tell ROS 2 Humble requires Ubuntu 22.04. Please correct me if I'm wrong.

matlabbe commented 1 year ago

Oh yeah, I read the image name FROM nvcr.io/nvidia/pytorch:22.08-py3 too quickly; based on my post, it is indeed 20.04. You can install ros2 foxy on 20.04. If you need humble + 22.04, you could check whether nvidia already has a pytorch image for it; otherwise you would need to do what you did. To get around the undefined reference errors, you may have to rebuild pytorch from source though. This is how I installed pytorch on my computer:

git clone https://github.com/pytorch/pytorch
git clone https://github.com/pytorch/vision
cd pytorch
python3 setup.py install
cd ..
cd vision
python3 setup.py install

If you want a fixed version, check the compatibility table at: https://github.com/pytorch/vision#installation
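
The Ubuntu/ROS 2 pairing discussed here (Foxy on 20.04, Humble on 22.04) can be checked against a candidate base image before investing in a long build. The sketch below is only an illustration: ros2_distro_for_ubuntu is a made-up name, and it covers only the two releases mentioned in this thread.

```shell
# Hypothetical helper: map an Ubuntu VERSION_ID to the ROS 2 distro
# discussed in this thread. Other releases are reported as unknown.
ros2_distro_for_ubuntu() {
    case "$1" in
        20.04) echo "foxy" ;;
        22.04) echo "humble" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

Inside a container, `. /etc/os-release && ros2_distro_for_ubuntu "$VERSION_ID"` reports which of the two distros the image can host.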

mattiasmar commented 1 year ago

I'm building now with pytorch 1.13.1. Will report on results once ready. Building from source like you wrote compiles but fails during runtime, as in #1064, correct?

matlabbe commented 1 year ago

Yes, with the latest pytorch version at least. These lines can also be useful for regenerating the superpoint model to match your pytorch version: https://github.com/introlab/rtabmap/blob/59d5675fe63e0b100954c73b0c4e2f0a8ac6a07e/docker/frontiers2022/Dockerfile#L58-L62

Let me know if you don't get a seg fault with 1.13.1.

mattiasmar commented 1 year ago

I tried with pytorch 1.13.1 and 1.12.0. Both gave me build (linkage) errors.

#PyTorch 1.13.1
RUN pip3 install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
RUN wget https://download.pytorch.org/libtorch/cu117/libtorch-shared-with-deps-1.13.1%2Bcu117.zip
RUN unzip libtorch-shared-with-deps-1.13.1+cu117.zip -d /torch
#PyTorch 1.12.0
RUN pip3 install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
RUN wget https://download.pytorch.org/libtorch/cu113/libtorch-shared-with-deps-1.12.0%2Bcu113.zip
RUN unzip libtorch-shared-with-deps-1.12.0+cu113.zip -d /torch
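
In the attempts above, the pip index, the libtorch directory and the archive name each repeat the version and CUDA tag, so they are easy to get out of sync. The sketch below builds the libtorch URL from a single version and CUDA tag, following the URL pattern used in this thread. libtorch_url is a hypothetical helper. The cxx11-abi archive name reflects my understanding of how the PyTorch download page names that variant, so it is worth verifying before relying on it.

```shell
# Hypothetical helper: build a libtorch download URL.
# $1 = version (e.g. 1.13.1), $2 = CUDA tag (e.g. cu117),
# $3 = ABI variant: pre-cxx11 (default) or cxx11 (assumed naming).
libtorch_url() {
    version="$1"
    cuda="$2"
    abi="${3:-pre-cxx11}"
    case "$abi" in
        cxx11)     name="libtorch-cxx11-abi-shared-with-deps" ;;
        pre-cxx11) name="libtorch-shared-with-deps" ;;
        *) echo "unknown ABI variant: $abi" >&2; return 1 ;;
    esac
    # %2B is the URL-encoded "+" between the version and the CUDA tag
    printf 'https://download.pytorch.org/libtorch/%s/%s-%s%%2B%s.zip\n' \
        "$cuda" "$name" "$version" "$cuda"
}
```

For example, `wget "$(libtorch_url 1.12.0 cu113)"` keeps the directory and the file name on the same CUDA tag.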

mattiasmar commented 1 year ago

Testing humble with pytorch built from source + Testing foxy on nvidia's pytorch/ubuntu20 image. Results ready by tomorrow.

What will I lose when moving from humble to foxy? Is there anything in particular you would want an RTAB-Map user to be aware of?

matlabbe commented 1 year ago

From rtabmap's point of view, it is the same code on foxy and humble. Note also that I fixed the seg fault I had, so I can confirm it also works with the latest pytorch (2.1).

mattiasmar commented 1 year ago

Good news: a Docker image based on nvidia/cuda:11.8.0-devel-ubuntu22.04, with pytorch compiled from source, allowed me to build rtabmap. I have yet to test it and then continue towards rtabmap_ros.

3 questions: When you said pytorch 2.1 you really meant 2.0.1, right? Does your existing frontiers image build the master branch for you? Does that colcon build of rtabmap_ros work for you in that image too?

mattiasmar commented 1 year ago

Things do look much better now. The only remaining compilation issue is in rtabmap/corelib/src/CMakeLists.txt: under the IF(WITH_PYTHON AND Python3_FOUND) clause (line 209), I have to use LIBRARIES instead of PUBLIC_LIBRARIES (the current state on master). Otherwise I get this build error:

Starting >>> rtabmap_conversions
--- stderr: rtabmap_conversions
** WARNING ** io features related to pcap will be disabled
** WARNING ** io features related to png will be disabled
CMake Error at CMakeLists.txt:53 (add_library):
  Target "rtabmap_conversions" links to target "Python3::Python" but the
  target was not found.  Perhaps a find_package() call is missing for an
  IMPORTED target, or an ALIAS target is missing?

CMake Error at CMakeLists.txt:53 (add_library):
  Target "rtabmap_conversions" links to target "Python3::NumPy" but the
  target was not found.  Perhaps a find_package() call is missing for an
  IMPORTED target, or an ALIAS target is missing?

I'm building using this command: colcon build --symlink-install --cmake-args -DRTABMAP_SYNC_MULTI_RGBD=ON -DWITH_OPENGV=ON -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=1 -DCMAKE_CXX_STANDARD=17 -D WITH_TORCH=ON -D WITH_PYTHON=ON -D Torch_DIR=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ -DCMAKE_CXX_FLAGS="-I/usr/include/python3.10/"

This line of CMakeLists.txt was changed by this commit

matlabbe commented 1 year ago

I fixed the missing Python dependency in the commit above.

For the frontiers docker image, there is a COPY . rtabmap step, so if you cloned the master branch, it will build master.

For pytorch, the latest master seems to be 2.1.0:

~/.local/lib/python3.8/site-packages/torch/share/cmake/Torch$ cat TorchConfigVersion.cmake 
set(PACKAGE_VERSION "2.1.0")

I didn't try colcon on that image, only catkin on noetic. I don't see why colcon would not work. Make sure to source ROS's setup.bash before running colcon.
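
A guard along the following lines can catch a missing source step before colcon starts. It is a sketch that assumes the ROS setup script exports ROS_DISTRO, which standard installs do, though a custom setup may differ. require_ros_env is a hypothetical name.

```shell
# Hypothetical guard: fail early if the ROS environment was not sourced.
# Assumes setup.bash exports ROS_DISTRO.
require_ros_env() {
    if [ -z "${ROS_DISTRO:-}" ]; then
        echo "ROS environment not sourced; run: source /opt/ros/<distro>/setup.bash" >&2
        return 1
    fi
    echo "Using ROS distro: $ROS_DISTRO"
}
```

It can be chained in front of the build, for example `require_ros_env && colcon build --symlink-install`.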

mattiasmar commented 1 year ago

Thanks for the help. Rtabmap now builds and runs with torch. I still don't understand why recompiling pytorch matters, but it does the trick.

matlabbe commented 1 year ago

I cannot find the post, but someone said recently that it was because the pytorch binaries were built with a different version of the C++ library.

EDIT: found the post: https://github.com/introlab/rtabmap/issues/896#issuecomment-1602362071
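
The linker log earlier in the thread is consistent with that explanation. With GCC's libstdc++, symbols compiled against the newer string ABI demangle to std::__cxx11::basic_string (or carry an [abi:cxx11] tag). Symbols compiled with _GLIBCXX_USE_CXX11_ABI=0 demangle to plain std::string, and the log mixes both spellings. The sketch below classifies a demangled linker message on that basis. classify_string_abi is a hypothetical helper, and the heuristic only looks at how std::string is spelled, so it says nothing about symbols that do not involve strings.

```shell
# Hypothetical helper: guess which libstdc++ string ABI a demangled
# symbol was compiled against, based only on how std::string appears.
classify_string_abi() {
    case "$1" in
        *'std::__cxx11::'*|*'[abi:cxx11]'*) echo "cxx11" ;;
        *'std::string'*) echo "pre-cxx11" ;;
        *) echo "unknown" ;;
    esac
}
```

Piping the build output through it, for example `make 2>&1 | grep 'undefined reference' | while IFS= read -r line; do classify_string_abi "$line"; done | sort | uniq -c`, gives a quick count of each kind. For an installed torch, `python3 -c 'import torch; print(torch.compiled_with_cxx11_abi())'` reports which ABI that build uses.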