Closed wbrandenburger closed 5 years ago
Yes, you are correct, this is an issue that needs fixing.
That is indeed the main problem, but the same goes for domain-distances which is also allocated repeatedly. I've solved this while still maintaining thread-friendliness by initially allocating heaps for the desired number of threads and during query finding a heap not currently in use by other threads (same for domain-distances). Can provide code if someone comes across this and is interested.
Unrelatedly, kmeans-tree does not support addition of new elements after init. Have also patched this (albeit a bit more awkwardly), will be happy to help.
It is possible to use a sparse matrix for input dataset with a dimension of 1M x 1M. If you initialize such dimension with flann::Matrix it will be not so efficient.
Hi,
FLANN is not efficient! But with a few changes you can make it efficient. In kdtree_index.h, kmeans_index.h, hierarchical_clustering_index.h each function getNeighbors(...) calls a heap which has as many elements as the pointcloud has elements. That means that FLANN builds for every query point a structure that has the same size as the pointcloud. Bu this structure should have only knnlog(size(Pointcloud)) elements, reffering to the complexity a search in kdtree O(knnlog(n)). In addition you have to replace the DynamicBitset with a selfbalancing Tree. When you apply both changes you are thousandfold faster when you compute 20 nearest neigbors of every point in a pointcloud which contains 10 000 000 points. The bigger the pointcloud the faster you are in comparison to prevoius implementation of FLANN.
Greets from Munich