
COLMAP - Structure-from-Motion and Multi-View Stereo
https://colmap.github.io/

Triangulation with known image poses creates a noisier reconstruction compared to unknown poses #1057

Open · prasadvagdargi opened this issue 3 years ago

prasadvagdargi commented 3 years ago

A dataset of 200 images was collected with a calibrated camera mounted on a robot, together with the corresponding camera poses from the robot.

First, features were extracted and matched exhaustively to build the database for the model.
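
Roughly, the database was built with the standard COLMAP commands (a sketch; the paths are placeholders):

colmap feature_extractor \
    --database_path ./database.db \
    --image_path ./images
colmap exhaustive_matcher \
    --database_path ./database.db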

Steps taken:

  1. The image dataset was sparsely reconstructed without camera poses using the GUI/mapper, giving a reconstruction of the scene up to scale.
  2. A sparse model of the scene was created manually from the obtained camera poses. Using the same feature-match database and images as in (1), the model was reconstructed with point_triangulator, followed by bundle_adjuster for global bundle adjustment (see the command sketch after this list).
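
The two reconstructions were produced roughly like this (a sketch; paths are placeholders, and ./sparse_known_poses is the manually created model containing the known camera poses):

# Step 1: incremental reconstruction without known poses.
mkdir -p ./sparse_mapper
colmap mapper \
    --database_path ./database.db \
    --image_path ./images \
    --output_path ./sparse_mapper

# Step 2: triangulation with known poses, then global bundle adjustment.
mkdir -p ./sparse_triangulated ./sparse_triangulated_ba
colmap point_triangulator \
    --database_path ./database.db \
    --image_path ./images \
    --input_path ./sparse_known_poses \
    --output_path ./sparse_triangulated
colmap bundle_adjuster \
    --input_path ./sparse_triangulated \
    --output_path ./sparse_triangulated_ba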

The sparse reconstruction from step 2 is significantly noisier than the one from step 1; see the attached comparison figure for statistics.

I can't quite figure out what is wrong here, since additional information about the scene should ideally improve the reconstruction, not worsen it.

Checks completed:

  1. The image poses and scene geometry were checked using a Blender add-on and found to be accurate.
  2. Using the image poses from step 1, another sparse model was manually created. This model was reconstructed with point_triangulator, returning the correct reconstruction. Therefore, the reconstruction converges for the trivial solution.
  3. I compiled from source and added an IterativeLocalRefinement step after every triangulated image, as done in incremental_mapper. This did not improve the results.

Expected behavior: The reconstruction from step 2 should ideally be better than, or at least no worse than, the reconstruction from step 1, which has more unknowns.

Settings: (see attached screenshot)

tsattler commented 3 years ago

How did you obtain the robot poses? Do you have an accurate calibration between the robot and the camera? Are the camera and robot poses synchronized, or could there be some time offset?

How did you measure the accuracy of the robot poses? Did you look at the trajectory or did you obtain 3D scene geometry from them?

I have run into situations where camera trajectories estimated by visual-inertial odometry yield less accurate dense reconstructions than structure-from-motion poses, because the odometry oversmooths the trajectory, which leads to less accurate triangulations.

tsattler commented 3 years ago

You can try the following code, which alternates between bundle adjustment of the model and re-triangulation / track merging. Running this for a few iterations should help refine less accurate poses:

int RunIncrementalModelRefiner(int argc, char** argv) {
  std::string input_path;
  std::string output_path;
  std::string image_list_path;

  OptionManager options;
  options.AddDatabaseOptions();
  options.AddImageOptions();
  options.AddRequiredOption("input_path", &input_path);
  options.AddRequiredOption("output_path", &output_path);
  options.AddDefaultOption("image_list_path", &image_list_path);  
  options.AddMapperOptions();
  options.Parse(argc, argv);

  if (!ExistsDir(input_path)) {
    std::cerr << "ERROR: `input_path` is not a directory" << std::endl;
    return EXIT_FAILURE;
  }

  if (!ExistsDir(output_path)) {
    std::cerr << "ERROR: `output_path` is not a directory" << std::endl;
    return EXIT_FAILURE;
  }

  const auto& mapper_options = *options.mapper;

  PrintHeading1("Loading model");

  Reconstruction reconstruction;
  reconstruction.Read(input_path);

  // Loads the list of images for which the camera pose will be fixed.
  std::vector<image_t> image_ids_fixed_poses;
  if (!image_list_path.empty()) {
    const auto image_names = ReadTextFileLines(image_list_path);
    for (const std::string& image_name : image_names) {
      const Image* image = reconstruction.FindImageWithName(image_name);
      if (image != nullptr) {
        image_ids_fixed_poses.push_back(image->ImageId());
      }
    }
  }

  PrintHeading1("Loading database");

  DatabaseCache database_cache;

  {
    Timer timer;
    timer.Start();

    Database database(*options.database_path);

    const size_t min_num_matches =
        static_cast<size_t>(mapper_options.min_num_matches);
    database_cache.Load(database, min_num_matches,
                        mapper_options.ignore_watermarks,
                        mapper_options.image_names);

    std::cout << std::endl;
    timer.PrintMinutes();
  }

  std::cout << std::endl;

  IncrementalMapper mapper(&database_cache);
  mapper.BeginReconstruction(&reconstruction);

  CHECK_GE(reconstruction.NumRegImages(), 2)
      << "Need at least two images for refinement";

  const std::vector<image_t>& reg_image_ids = reconstruction.RegImageIds();

  //////////////////////////////////////////////////////////////////////////////
  // Iteratively run bundle adjustment and re-triangulate
  //////////////////////////////////////////////////////////////////////////////

  auto ba_options = mapper_options.GlobalBundleAdjustment();
  // ba_options.refine_focal_length = false;
  // ba_options.refine_principal_point = false;
  // ba_options.refine_extra_params = false;
  // ba_options.refine_extrinsics = false;

  // Configure bundle adjustment.
  BundleAdjustmentConfig ba_config;
  for (const image_t image_id : reg_image_ids) {
    ba_config.AddImage(image_id);
  }

  for (const image_t image_id : image_ids_fixed_poses) {
    ba_config.SetConstantPose(image_id);
  }

  // Fix 7-DOFs of the bundle adjustment problem.
  ba_config.SetConstantPose(reg_image_ids[0]);
  if (image_ids_fixed_poses.empty()) {
    ba_config.SetConstantTvec(reg_image_ids[1], {0});
  }

  for (int i = 0; i < mapper_options.ba_global_max_refinements; ++i) {   
    // Avoid degeneracies in bundle adjustment.
    reconstruction.FilterObservationsWithNegativeDepth();

    const size_t num_observations = reconstruction.ComputeNumObservations();

    PrintHeading1("Bundle adjustment");
    BundleAdjuster bundle_adjuster(ba_options, ba_config);
    CHECK(bundle_adjuster.Solve(&reconstruction));

    size_t num_changed_observations = 0;
    num_changed_observations += CompleteAndMergeTracks(mapper_options, &mapper);
    num_changed_observations += FilterPoints(mapper_options, &mapper);
    const double changed =
        static_cast<double>(num_changed_observations) / num_observations;
    std::cout << StringPrintf("  => Changed observations: %.6f", changed)
              << std::endl;
    if (changed < mapper_options.ba_global_max_refinement_change) {
      break;
    }
  }

  PrintHeading1("Extracting colors");
  reconstruction.ExtractColorsForAllImages(*options.image_path);

  const bool kDiscardReconstruction = false;
  mapper.EndReconstruction(kDiscardReconstruction);

  reconstruction.Write(output_path);

  return EXIT_SUCCESS;
}
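
If you compile this into a standalone executable (here hypothetically named refine), the invocation would look roughly as follows; the flags correspond to the OptionManager calls above, and --image_list_path is optional (the poses of the listed images are kept fixed):

mkdir -p ./sparse_refined
./refine \
    --database_path ./database.db \
    --image_path ./images \
    --input_path ./sparse_triangulated \
    --output_path ./sparse_refined \
    --image_list_path ./fixed_pose_images.txt
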
kondela commented 3 years ago

@tsattler that worked incredibly well for my reconstruction: the mean reprojection error went from ~1.74 to ~0.74, and further to ~0.71 after some tweaking of the bundle adjustment parameters. Thanks a lot! Do you have any other tips that would be worth trying out?

kondela commented 3 years ago

I've been experimenting with mapper, point_triangulator, and @tsattler's suggested RunIncrementalModelRefiner over the last few days, and it seems that there is some bug or unintended behavior that causes point_triangulator and RunIncrementalModelRefiner to perform worse. To illustrate: when I run mapper on my set of images, I get a mean reprojection error of 0.7443. However, when I run RunIncrementalModelRefiner on the poses estimated by that mapper reconstruction, using the very same set of images and settings, I get a reprojection error of 0.8024. So even when I supply poses that should by default lead to at least the same level of reconstruction quality, the quality actually decreases.

Furthermore, although the reprojection error increases only by a small amount, the GUI renderer shows that the reconstructions from plain mapper and from RunIncrementalModelRefiner are very similar. However, when I export these poses for another use, the poses from point_triangulator and RunIncrementalModelRefiner are much less accurate.

I wasn't able to find any significant differences between mapper, point_triangulator, and RunIncrementalModelRefiner in the triangulation part of the code, but I guess I must be missing something, similar to https://github.com/colmap/colmap/issues/1094.

robonrrd commented 1 year ago

@kondela Do you have code that you can share? I also wish to use COLMAP for pose refinement, but I'm not getting good results with what I've written.

My process is as follows. Given initial pose estimates and images:

  1. Extract features by running colmap feature_extractor
  2. Run the feature matcher: colmap exhaustive_matcher
  3. Triangulate the points with colmap point_triangulator
  4. Then run the code @tsattler provided above for some number of iterations (I am using 10 iterations right now)

However, this process makes the pose estimates significantly worse. I am assessing correctness by doing a simple Procrustes analysis against poses generated from a normal COLMAP run.
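
Concretely, the commands look roughly like this (a sketch with placeholder paths; refine is a hypothetical name for an executable built from the snippet above):

colmap feature_extractor --database_path ./database.db --image_path ./images
colmap exhaustive_matcher --database_path ./database.db
mkdir -p ./sparse_model
colmap point_triangulator \
    --database_path ./database.db \
    --image_path ./images \
    --input_path ./sparse_initial_poses \
    --output_path ./sparse_model
# Run the refiner for a fixed number of iterations, refining the model in place.
for i in $(seq 1 10); do
  ./refine \
      --database_path ./database.db \
      --image_path ./images \
      --input_path ./sparse_model \
      --output_path ./sparse_model
done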

Bin-ze commented 5 months ago

I want to know how to use the code above. I am a newbie to COLMAP; I have just been running the colmap commands to perform sparse reconstruction, so I don't know how to execute the code you provided. Can you tell me the specific steps? Thank you.

thibautloiseau commented 1 month ago

Hi @tsattler, and thank you for your work. I am trying to run the script you gave, following https://colmap.github.io/install.html#library, but I am getting an error when running ninja, probably because I do not include the correct headers at the top of your script. As I am quite new to C++, I don't really know which headers to include. If you could provide the includes needed at the beginning of the script, that would be really helpful. Thanks a lot.

Here is my colmap version:

COLMAP 3.10-dev -- Structure-from-Motion and Multi-View Stereo
(Commit 7a1da99b on 2024-02-22 with CUDA)

Here is my CMakeLists.txt file:

cmake_minimum_required(VERSION 3.10)

project(RefineProject)

find_package(colmap REQUIRED)
# or to require a specific version: find_package(colmap 3.4 REQUIRED)

add_executable(refine refine.cc)
target_link_libraries(refine colmap::colmap)

And here are the includes at the top of refine.cc, with your script below them:

#include <cstdlib>
#include <iostream>

#include <colmap/controllers/option_manager.h>
#include <colmap/util/string.h>
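
For reference, I configure and build the project with a standard out-of-source CMake build (the error appears in the final ninja step):

mkdir build && cd build
cmake .. -GNinja
ninja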