colmap / glomap

GLOMAP - Global Structure-from-Motion Revisited
BSD 3-Clause "New" or "Revised" License

Illegal instruction (core dumped) #55

Open Davidyao99 opened 1 month ago

Davidyao99 commented 1 month ago

I am trying to run glomap on a cluster using a Singularity image, and I get the following error when running the glomap mapper command on the sample gerrard-hall dataset.

I0810 06:28:25.486933 14316 relpose_estimation.cc:24] Estimating relative pose for 1317 pairs
 Estimating relative pose: 0%*** Aborted at 1723289305 (unix time) try "date -d @1723289305" if you are using GNU date ***
PC: @                0x0 (unknown)
*** SIGILL (@0x5614bc45ffb0) received by PID 14316 (TID 0x2b2c7bf256c0) from PID 18446744072573288368; stack trace: ***
    @     0x2b2c7a355046 (unknown)
    @     0x2b2c7a82b520 (unknown)
    @     0x5614bc45ffb0 poselib::Camera::focal()
    @     0x5614bc4639ac poselib::estimate_relative_pose()
    @     0x5614bc1ead9c _ZN6glomap21EstimateRelativePosesERNS_9ViewGraphERSt13unordered_mapIjNS_6CameraESt4hashIjESt8equal_toIjESaISt4pairIKjS3_EEERS2_IjNS_5ImageES5_S7_SaIS8_IS9_SE_EEERKNS_29RelativePoseEstimationOptionsE._omp_fn.0
    @     0x2b2c7a47ea16 GOMP_parallel
    @     0x5614bc1ea899 glomap::EstimateRelativePoses()
    @     0x5614bc18b901 glomap::GlobalMapper::Solve()
    @     0x5614bc18250c glomap::RunMapper()
    @     0x5614bc17edc5 main
    @     0x2b2c7a812d90 (unknown)
    @     0x2b2c7a812e40 __libc_start_main
    @     0x5614bc180bb5 _start
Illegal instruction (core dumped)

Following the docker image here, I created my own Singularity image. However, due to some complications, I decided not to build COLMAP and instead installed it in a conda environment. Any ideas on how I should resolve this? Some possible causes of the issue are:

1) I am using a COLMAP that was installed in a conda env instead of being built from source

2) I am building this image on a cluster that does not have a GUI

This is my singularity build file:

# This could also be another Ubuntu or Debian based distribution
BootStrap:docker
From: nvidia/cuda:11.7.1-base-ubuntu22.04

# Install dependencies
%post
export QT_XCB_GL_INTEGRATION=xcb_egl
export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get install --no-install-recommends -y \
git \
build-essential \
cmake \
ninja-build \
wget \
unzip \
libboost-program-options-dev \
libboost-filesystem-dev \
libboost-graph-dev \
libboost-system-dev \
libeigen3-dev \
libsuitesparse-dev \
libceres-dev \
libflann-dev \
libfreeimage-dev \
libmetis-dev \
libgoogle-glog-dev \
libgtest-dev \
libsqlite3-dev \
libglew-dev \
qtbase5-dev \
libqt5opengl5-dev \
libcgal-dev \
libcgal-qt5-dev \
libgl1-mesa-dri \
libunwind-dev \
xvfb \
clang-format-14 \
python3 \
python3-pip && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*

wget https://github.com/Kitware/CMake/releases/download/v3.30.1/cmake-3.30.1-linux-x86_64.sh && \
chmod +x cmake-3.30.1-linux-x86_64.sh && \
./cmake-3.30.1-linux-x86_64.sh --skip-license --prefix=/usr/local

# Set up compiler environment
apt-get update && \
apt-get install -y \
clang-15 \
libomp-15-dev \
gcc-10 \
g++-10 \
nvidia-cuda-toolkit \
nvidia-cuda-toolkit-gcc && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUDAHOSTCXX=/usr/bin/g++-10

# Build and install GLOMAP
git clone https://github.com/colmap/glomap.git && \
cd glomap && \
git fetch https://github.com/colmap/glomap.git main && \
git checkout FETCH_HEAD && \
mkdir build && \
cd build && \
cmake .. \
    -GNinja \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=/glomap_installed \
    -DCMAKE_CUDA_ARCHITECTURES=86 \
    -DSuiteSparse_CHOLMOD_LIBRARY="/usr/lib/x86_64-linux-gnu/libcholmod.so" \
    -DSuiteSparse_CHOLMOD_INCLUDE_DIR="/usr/include/suitesparse" \
    -DTESTS_ENABLED=ON \
    -DASAN_ENABLED=false && \
ninja install
cp -r /glomap_installed/* /usr/local/

%environment
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUDAHOSTCXX=/usr/bin/g++-10

ahojnnes commented 1 month ago

Are you compiling on a different machine than where you run the binaries?

Up until a few days ago, PoseLib enabled the -march=native flag by default, which creates problems when redistributing binaries: the compiler may emit instructions that exist only on the build machine's CPU, and a CPU without them aborts with SIGILL. Based on my feedback, this was changed here: https://github.com/PoseLib/PoseLib/commit/d406c08022a88984278a80a9ef7a44d46b2e1f14.
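
A quick way to check whether this applies is to compare the CPU feature flags of the build machine and the run machine. The sketch below is only illustrative: the helper name missing_flags and both flag lists are made up for the example and are not taken from the machines in this issue. On Linux, the real lists can be read on each host with `grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2`.

```shell
#!/bin/sh
# missing_flags BUILD_FLAGS RUN_FLAGS  (hypothetical helper, not part of GLOMAP)
# Prints every CPU flag that appears in BUILD_FLAGS but not in RUN_FLAGS,
# one per line. Both arguments are space-separated flag lists.
missing_flags() {
    for flag in $1; do
        case " $2 " in
            *" $flag "*) ;;                 # run host also has this flag
            *) printf '%s\n' "$flag" ;;     # build-only flag: a SIGILL candidate
        esac
    done
}

# Illustrative, made-up flag lists. Replace them with the output of
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
# collected on the build host and on the cluster node.
build_flags="sse4_2 avx avx2 fma avx512f"
run_flags="sse4_2 avx avx2 fma"
missing_flags "$build_flags" "$run_flags"
```

Any flag printed here is a feature that a -march=native build may have used and that the run host cannot execute. An empty result does not prove the binary is safe, but it makes this explanation less likely.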

We have an open PR in GLOMAP to consume those latest changes. Meanwhile, you can manually update the git commit hash for poselib in the cmake/FindDependencies.cmake file.
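
The manual update could be scripted roughly as follows. This is a sketch under assumptions: the helper pin_git_tag is hypothetical, and it assumes that the dependency is declared in a block that mentions its name and contains a `GIT_TAG <hash>` line before the closing parenthesis. The demo file below is invented to show the mechanics; check the layout of the real cmake/FindDependencies.cmake before relying on it. The hash is the PoseLib commit linked above.

```shell
#!/bin/sh
# pin_git_tag FILE DEP HASH  (hypothetical helper)
# In each block that starts at a line mentioning DEP and ends at the next
# line containing ")", replace the hex value that follows GIT_TAG with HASH.
pin_git_tag() {
    sed "/$2/,/)/ s/\(GIT_TAG[[:space:]][[:space:]]*\)[0-9a-f][0-9a-f]*/\1$3/" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# Invented stand-in for the real file, with a second dependency to show
# that only the PoseLib entry is rewritten.
demo=$(mktemp)
cat > "$demo" <<'EOF'
FetchContent_Declare(OtherDep
    GIT_REPOSITORY https://example.com/other.git
    GIT_TAG        1111111111111111111111111111111111111111
)
FetchContent_Declare(PoseLib
    GIT_REPOSITORY https://github.com/PoseLib/PoseLib.git
    GIT_TAG        0000000000000000000000000000000000000000
)
EOF

pin_git_tag "$demo" PoseLib d406c08022a88984278a80a9ef7a44d46b2e1f14
grep GIT_TAG "$demo"    # show both tags after the edit
rm -f "$demo"
```

In a Singularity %post section, a call like this would go between the `git checkout` and the `cmake` step, followed by a build from a clean build directory so that the previously fetched PoseLib sources are not reused.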

Davidyao99 commented 1 month ago

Thanks for the prompt response. I changed the commit hash in FindDependencies.cmake and rebuilt the image, but the issue persists.

I believe it is compiled on a different machine: I build the image remotely, then import it and run it on the cluster node.

ahojnnes commented 1 month ago

What camera model are you using? I am wondering whether it is caused by PoseLib not yet supporting the camera model.
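
One way to look this up, assuming the reconstruction uses a standard COLMAP SQLite database, is to read the `model` column of its `cameras` table and translate the integer id to a model name. The sketch below is not an official tool: the helper model_name and the default path database.db are placeholders, and the id table covers only the long-standing COLMAP models (0 to 10).

```shell
#!/bin/sh
# model_name ID  (hypothetical helper)
# Maps a COLMAP camera model id to its name.
model_name() {
    case "$1" in
        0) echo SIMPLE_PINHOLE ;;
        1) echo PINHOLE ;;
        2) echo SIMPLE_RADIAL ;;
        3) echo RADIAL ;;
        4) echo OPENCV ;;
        5) echo OPENCV_FISHEYE ;;
        6) echo FULL_OPENCV ;;
        7) echo FOV ;;
        8) echo SIMPLE_RADIAL_FISHEYE ;;
        9) echo RADIAL_FISHEYE ;;
        10) echo THIN_PRISM_FISHEYE ;;
        *) echo "UNKNOWN($1)" ;;
    esac
}

# Placeholder path: point DB at the database given to glomap mapper.
DB="${DB:-database.db}"
if command -v sqlite3 >/dev/null 2>&1 && [ -f "$DB" ]; then
    # sqlite3 prints columns separated by "|" in its default list mode.
    sqlite3 "$DB" 'SELECT camera_id, model FROM cameras;' |
    while IFS='|' read -r camera_id model; do
        printf 'camera %s: %s\n' "$camera_id" "$(model_name "$model")"
    done
else
    echo "sqlite3 or $DB not available; nothing to list" >&2
fi
```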