colmap / pycolmap

Python bindings for COLMAP
BSD 3-Clause "New" or "Revised" License
858 stars 125 forks source link

Autocast numpy.ndarray <-> std::vector<Eigen::Vector> #248

Closed sarlinpe closed 5 months ago

sarlinpe commented 5 months ago

For large inputs, the default pybind type caster is much slower than directly copying the underlying memory segment, as hinted by https://github.com/pybind/pybind11/issues/1481.

Test code:

  using Matrix =
      typename Eigen::Matrix<double, Eigen::Dynamic, 2, Eigen::RowMajor>;
  using Vector = typename Eigen::Vector2d;
  m.def("test_vector", [](const std::vector<Vector>& x) { return x; });
  m.def(
      "test_map",
      [](const Eigen::Ref<Matrix>& x) {
        // Cast to std::vector
        std::vector<Vector> vec(x.rows());
        Eigen::Map<Matrix>(
            reinterpret_cast<double*>(vec.data()), vec.size(), 2) = x;
        // Cast back to Eigen::Matrix
        return Eigen::Map<const Matrix>(
            reinterpret_cast<const double*>(vec.data()), vec.size(), 2);
      },
      py::return_value_policy::copy);
  m.def("test_matrix", [](const Eigen::Ref<Matrix>& x) { return x; });
n = 100_000
print(f'{"size":>5}: vector   map      matrix')
for i in [10, 100, 1_000, 10_000]:
    x = np.random.randn(i, 2)
    assert np.all(pycolmap.test_vector(x) == x)
    t_vec = timeit.timeit(lambda: pycolmap.test_vector(x), number=n)/n
    t_map = timeit.timeit(lambda: pycolmap.test_map(x), number=n)/n
    t_mat = timeit.timeit(lambda: pycolmap.test_matrix(x), number=n)/n
    print(f'{i:>5}: {t_vec:.2E} {t_map:.2E} {t_mat:.2E}')

Timings (second) without the type caster:

 size: vector   map      matrix
   10: 6.95E-06 7.71E-07 4.90E-07
  100: 6.90E-05 7.15E-07 5.28E-07
 1000: 6.82E-04 1.04E-06 5.16E-07
10000: 7.04E-03 5.77E-06 5.03E-07

With:

 size: vector
   10: 1.02E-06
  100: 1.29E-06
 1000: 2.25E-06
10000: 8.94E-05

This is a 100x speedup for 10k inputs. There is still a gap w.r.t direct mapping for C++ -> Python, but I haven't managed to find the source. Ideally COLMAP would use Eigen::Matrix when possible.