DLR-RM / 3DObjectTracking

Algorithms and Publications on 3D Object Tracking
MIT License

Long execution time on medical robot object from the RTB dataset #54

Closed by holesond 11 months ago

holesond commented 11 months ago

Although the articulated object tracking is very accurate, I could not reproduce the expected 20.5 ms runtime of M3T (Mb-ICG) on the medical_robot object (sequence medical_robot/test_easy/000000) of the rtb_dataset. The evaluate_rtb_dataset program reported an execution_time of 470 ms on my PC (Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz, 64GB RAM, GPU NVIDIA GeForce RTX 2080 Ti):

$ time ./evaluate_rtb_dataset 
medical_robot_test_easy_000000_depth_azure_kinect: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526

--------------------------------------------------------------------------------
medical_robot_test_easy: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526
all_test_easy: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526

--------------------------------------------------------------------------------
medical_robot_depth_azure_kinect: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526
all_depth_azure_kinect: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526

--------------------------------------------------------------------------------
medical_robot: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526
all: execution_time = 470010 us, add auc = 0.969548, adds auc = 0.985526
--------------------------------------------------------------------------------

real    1m37.625s
user    1m50.722s
sys     0m2.127s 

I modified only the evaluate_rtb_dataset.cpp source file, as follows, to restrict the run to the medical_robot/test_easy/000000 sequence:

std::vector<std::string> object_names{"medical_robot"};
std::vector<std::string> difficulty_levels{"test_easy"};
std::vector<std::string> depth_names{"depth_azure_kinect"};
std::vector<int> sequence_numbers{0};

I have built M3T with the following cmake configuration:

$ cmake .. 
-- The C compiler identification is GNU 13.1.1
-- The CXX compiler identification is GNU 13.1.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found GLEW: /usr/include (found version "2.2.0") 
-- Found OpenGL: /usr/lib/libOpenGL.so   
CMake Warning (dev) at /usr/lib/cmake/opencv4/OpenCVConfig.cmake:86 (find_package):
  Policy CMP0146 is not set: The FindCUDA module is removed.  Run "cmake
  --help-policy CMP0146" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

Call Stack (most recent call first):
  /usr/lib/cmake/opencv4/OpenCVConfig.cmake:108 (find_host_package)
  CMakeLists.txt:35 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found CUDA: /opt/cuda (found suitable exact version "12.2") 
-- Found OpenCV: /usr (found suitable version "4.8.0", minimum required is "4.3.0") found components: core imgproc highgui imgcodecs calib3d features2d xfeatures2d cudafeatures2d 
-- Found OpenMP_C: -fopenmp (found version "4.5") 
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
-- Performing Test HAS_MARCH
-- Performing Test HAS_MARCH - Failed
-- Performing Test HAS_MTUNE
-- Performing Test HAS_MTUNE - Failed
-- Performing Test HAS_GGDB
-- Performing Test HAS_GGDB - Success
-- Performing Test HAS_Z7
-- Performing Test HAS_Z7 - Failed
-- Performing Test HAS_FTRAPV
-- Performing Test HAS_FTRAPV - Success
-- Performing Test HAS_OD
-- Performing Test HAS_OD - Failed
-- Performing Test HAS_OB3
-- Performing Test HAS_OB3 - Failed
-- Performing Test HAS_O3
-- Performing Test HAS_O3 - Success
-- Performing Test HAS_OB2
-- Performing Test HAS_OB2 - Failed
-- Performing Test HAS_O2
-- Performing Test HAS_O2 - Success
-- Found GTest: /usr/lib/cmake/GTest/GTestConfig.cmake (found suitable version "1.13.0", minimum required is "1.7.0")  
-- Found Doxygen: /usr/bin/doxygen (found version "1.9.7") found components: doxygen dot 
-- Configuring done (1.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/holesond/3DObjectTracking/M3T/build

Note that I built M3T against an OpenCV build with CUDA support (cudafeatures2d). Nevertheless, according to nvidia-smi, the evaluate_rtb_dataset program does not use the GPU, and it utilizes only a single CPU core while tracking the medical robot.

What should I do to approach the 20 ms runtime presented in Table 3 of the paper "A Multi-body Tracking Framework - From Rigid Objects to Kinematic Structures"?

holesond commented 11 months ago

I also tried a release build, cmake -DCMAKE_BUILD_TYPE=Release .., which reduced the execution time on the medical robot to 210 ms. That is better than 470 ms but still far from 20.5 ms.

holesond commented 11 months ago

OK, I have found what caused the long execution times. The M3T multi-body tracker uses OpenGL during tracking, at least in the silhouette_renderer module. I was running evaluate_rtb_dataset on a remote computer via an ssh shell with X forwarding. That forced OpenGL to fall back to a software renderer (driver swrast_dri.so) instead of a hardware renderer (integrated or discrete GPU), which made the computation considerably slower.

When I launch the program from a desktop environment directly on the computer, I get execution times around 14 ms on the medical_robot/test_easy/000000 sequence, which is great. :-)
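For anyone who hits the same problem, here is a minimal sketch of how the software-renderer fallback described above could be detected at startup. The helper name IsSoftwareRenderer is hypothetical and not part of M3T, and the list of renderer names is an assumption based on strings commonly reported by Mesa software rasterizers; it is not exhaustive.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Hypothetical helper (not part of M3T): returns true if an OpenGL renderer
// string looks like it belongs to a CPU-based software rasterizer.
bool IsSoftwareRenderer(const std::string& renderer) {
  // Lower-case a copy so that the comparison is case-insensitive.
  std::string lower(renderer.size(), '\0');
  std::transform(renderer.begin(), renderer.end(), lower.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });

  // Assumed, non-exhaustive list of substrings used by software rasterizers.
  static const std::array<const char*, 4> kSoftwareNames = {
      "llvmpipe", "softpipe", "swrast", "software rasterizer"};

  // The renderer counts as software if any known name occurs in the string.
  return std::any_of(kSoftwareNames.begin(), kSoftwareNames.end(),
                     [&lower](const char* name) {
                       return lower.find(name) != std::string::npos;
                     });
}
```

The input string would come from glGetString(GL_RENDERER), cast to const char*, queried once an OpenGL context is current. If the helper returns true, the measured execution times are probably dominated by CPU rendering rather than by the tracker itself.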