flann-lib / flann

Fast Library for Approximate Nearest Neighbors
http://people.cs.ubc.ca/~mariusm/flann

radix_sort invalid argument when using CUDA flann #388

Closed fferroni closed 5 years ago

fferroni commented 5 years ago

Hello,

I am trying to use the CUDA implementation of FLANN and set up the following test:

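#include <cstdio>   // printf
#include <cstdlib>  // srand, std::rand, RAND_MAX
#include <ctime>    // clock, clock_t
#include <gtest/gtest.h>
#include "flann/flann.hpp"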

TEST(Flann_3D_Random_Cloud, Test1NN)
{
    flann::Matrix<float> data;
    flann::Matrix<float> query;
    flann::Matrix<float> dists;
    flann::Matrix<size_t> indices;
    flann::Matrix<float> gt_dists;
    flann::Matrix<size_t> gt_indices;

    const int n_points=50000;
    printf("creating random point cloud (%d points)...", n_points);
    data = flann::Matrix<float>(new float[n_points*3], n_points, 3);
    srand(1);
    for( int i=0; i<n_points; i++ )
    {
        data[i][0]=std::rand()/float(RAND_MAX);
        data[i][1]=std::rand()/float(RAND_MAX);
        data[i][2]=std::rand()/float(RAND_MAX);
    }

    query= flann::Matrix<float>(new float[n_points*3], n_points, 3);
    for( int i=0; i<n_points; i++ )
    {
        query[i][0]=data[i][0];
        query[i][1]=data[i][1];
        query[i][2]=data[i][2];
    }

    printf("done\n");

    float r = 0.01;

    clock_t start_cpu = clock();
    flann::Index<flann::L2<float> > index(data, flann::LinearIndexParams());
    index.buildIndex();
    index.radiusSearch(data, gt_indices, gt_dists, r, flann::SearchParams() );
    clock_t end_cpu = clock();
    printf("CPU build and query done in %f\n", float(end_cpu - start_cpu));

    clock_t start_gpu = clock();
    flann::Index<flann::L2_Simple<float> > index2(data, flann::KDTreeCuda3dIndexParams());
    index2.buildIndex();
    index2.radiusSearch(query, indices, dists, r, flann::SearchParams() );
    clock_t end_gpu = clock();
    printf("GPU build and query done in %f\n", float(end_gpu - start_gpu));
}
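A side note on the timing output: the test prints the raw difference of two clock() readings, which is a tick count rather than seconds. The sketch below shows one way to convert it. The helper name clock_ticks_to_seconds is hypothetical and not part of FLANN or gtest. Note also that std::clock measures processor time, so it may not match wall-clock time when work runs on the GPU or on several threads.

```cpp
#include <cassert>
#include <ctime>

// Convert the difference of two std::clock() readings into seconds of
// processor time. Hypothetical helper, for illustration only.
inline double clock_ticks_to_seconds(std::clock_t start, std::clock_t end)
{
    assert(end >= start);  // assumes the tick counter did not wrap around
    return static_cast<double>(end - start) / CLOCKS_PER_SEC;
}
```

With this, the print statements would take the form printf("CPU build and query done in %f s\n", clock_ticks_to_seconds(start_cpu, end_cpu)). If CLOCKS_PER_SEC is 1000000 on the reporting machine (an assumption), the tick counts in the logs are consistent with the roughly 72 s that gtest reports for the whole test.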

Unfortunately, I am getting the following error:

Running main() from gtest_main.cc
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Flann_3D_Random_Cloud
[ RUN      ] Flann_3D_Random_Cloud.Test1NN
creating random point cloud (50000 points)...done
CPU build and query done in 72752128.000000
unknown file: Failure
C++ exception with description "radix_sort: failed on 2nd step: invalid argument" thrown in the test body.
[  FAILED  ] Flann_3D_Random_Cloud.Test1NN (72842 ms)
[----------] 1 test from Flann_3D_Random_Cloud (72842 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (72842 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] Flann_3D_Random_Cloud.Test1NN

 1 FAILED TEST

Whether the test passes or fails depends on the radius r. For example, if I change r to 0.001, it passes:

Running main() from gtest_main.cc
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Flann_3D_Random_Cloud
[ RUN      ] Flann_3D_Random_Cloud.Test1NN
creating random point cloud (50000 points)...done
CPU build and query done in 71665168.000000
GPU build and query done in 94559.000000
[       OK ] Flann_3D_Random_Cloud.Test1NN (71767 ms)
[----------] 1 test from Flann_3D_Random_Cloud (71767 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (71767 ms total)
[  PASSED  ] 1 test.
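To get a feel for how strongly the result size depends on r, here is a rough estimate of the expected number of neighbors per query for points drawn uniformly from the unit cube. This is only a back-of-the-envelope sketch, not a diagnosis of the error. It assumes that the radius passed to radiusSearch is a squared distance for L2-type metrics, and it ignores the cube boundary, so it somewhat overestimates. The function name expected_neighbors is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Expected number of points inside a ball around one query point, for
// n_points drawn uniformly from the unit cube. radius_sq is the squared
// search radius (assumption: squared-distance convention for L2 metrics).
// Clipping of the ball at the cube boundary is ignored.
inline double expected_neighbors(std::size_t n_points, double radius_sq)
{
    assert(radius_sq >= 0.0);
    const double pi = std::acos(-1.0);
    const double radius = std::sqrt(radius_sq);
    const double ball_volume = 4.0 / 3.0 * pi * radius * radius * radius;
    return static_cast<double>(n_points) * ball_volume;  // point density is n_points per unit volume
}
```

Under these assumptions, expected_neighbors(50000, 0.01) is a bit over 200 neighbors per query, while expected_neighbors(50000, 0.001) is fewer than 7. Over 50000 queries that is on the order of ten million results versus a few hundred thousand, so the two runs put very different loads on the result buffers.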

Am I using this incorrectly? I looked at the test examples and cannot see a major difference.

qpc001 commented 1 year ago

@fferroni How did you fix it?