norlab-ulaval / libnabo

A fast K Nearest Neighbor library for low-dimensional spaces
http://norlab-ulaval.github.io/libnabo/
BSD 3-Clause "New" or "Revised" License
438 stars 143 forks source link

Fail on matrices with ~32768 or more floats, and issues with const row matrices. #9

Closed idryanov closed 9 years ago

idryanov commented 10 years ago

I get a segfault when attempting to do a knn search on an Eigen float matrix with 3 rows and more than 10920 columns. Interestingly, that works out to close to 2^15 floats.

I compiled a minimum test program, included below, and the gdb output. In short, it runs fine with a 3 x 10920 matrix, and segfaults on a 3 x 10921 matrix

Here's the code:

#include <iostream>
#include <Eigen/Core>
#include <nabo/nabo.h>

// OK:    `./test_nns 10920`
// FAIL:  `./test_nns 10921`
int main(int argc, char* argv[]) {
  const int n_points = atoi(argv[1]);
  const int point_idx = 10;
  const int k_nearest_neighbors = 1;

  std::cout << "Building matrix with " << n_points << " points. " << std::endl;
  Eigen::Matrix<float, 3, Eigen::Dynamic> points = 
      Eigen::MatrixXf::Random(3, n_points);
  const Eigen::Matrix<float, 3, 1>& point = points.col(point_idx);

  // Find the nearest neighbors.
  Nabo::NNSearchF* nns = Nabo::NNSearchF::createKDTreeLinearHeap(points);
  Eigen::VectorXi nn_indices(k_nearest_neighbors);
  Eigen::VectorXf nn_distances_sq(k_nearest_neighbors);
  nns->knn(point, nn_indices, nn_distances_sq, k_nearest_neighbors);
  std::cout << "neighbor: " << nn_indices(0) << std::endl;

  delete nns;
  return 0;
}

Here's the gdb output:

gdb test_nns
GNU gdb (GDB) 7.6.1-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/idryanov/work/Tango/redwood_ws/devel_linux/bin/test_nns...(no debugging symbols found)...done.
(gdb) run 10920
Starting program: /home/idryanov/work/Tango/redwood_ws/devel_linux/bin/test_nns 10920
Building matrix with 10920 points. 
neighbor: 2835
[Inferior 1 (process 9530) exited normally]
(gdb) run 10921
Starting program: /home/idryanov/work/Tango/redwood_ws/devel_linux/bin/test_nns 10921
Building matrix with 10921 points. 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bc9aed in unsigned long Nabo::KDTreeUnbalancedPtInLeavesImplicitBoundsStackOpt<float, Nabo::IndexHeapBruteForceVector<int, float> >::recurseKnn<false, false>(float const*, unsigned int, float, Nabo::IndexHeapBruteForceVector<int, float>&, std::vector<float, std::allocator<float> >&, float, float) const () from /home/idryanov/work/Tango/redwood_ws/devel_linux/lib/libnabo.so.1
(gdb) 
idryanov commented 10 years ago

Update:

I managed to narrow down the issue somewhat.

Building a NNSearchF from MatrixXf works as expected, using large numbers of points.

  Eigen::MatrixXf points = Eigen::MatrixXf::Random(3, n_points);

Replacing that with a matrix with fixed rows will exhibit the failures described above (fail on 10921 or more columns)

  Eigen::Matrix<float, 3, Eigen::Dynamic> points =
    Eigen::Matrix<float, 3, Eigen::Dynamic>::Random(3, n_points);

Moreover, even when not segfaulting, passing a Matrix3Xf instead of a Matrixf seems to sometimes be producing the wrong knn results.

I have managed to work around the problem by evaluating all my Matrix3Xf mats to Matrix3f - however, it's not immediately obvious (to me) how this is creating an issue.

idryanov commented 10 years ago

@simonlynen as per our offline conversation

simonlynen commented 10 years ago

@idryanov This rather seems to be an issue in Eigen. It is actually not obvious to me how a reference of (partially) fixed size can be allowed to be created to a dynamic matrix (given that this can be compile-time checked).

It seems this issue is resolved best by looking into how Eigen determines between fixed and dynamic matrices for block operations etc. as used by Nabo.

simonlynen commented 9 years ago

This could only be resolved by actually passing the correct derived type into nabo and holding the correct reference type. This would however mean that the entire library would need to be a template which is suboptimal. For now the fix to this is to use the correct Eigen type of dynamic sizes on both dimensions.