johnmyleswhite / kNN.jl

The k-nearest neighbors algorithm in Julia

Use FLANN #6

Open lindahua opened 10 years ago

lindahua commented 10 years ago

FLANN (http://www.cs.ubc.ca/research/flann/) is one of the most widely used libraries for approximate nearest neighbor search.

It is fast and reliable, is available in Linux distributions and Homebrew, and has a C interface.

johnmyleswhite commented 10 years ago

Yes, we should definitely use FLANN.

johnmyleswhite commented 10 years ago

There are also a few other libraries we will want to look into at some point: http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neighbours-intro/

johnmyleswhite commented 10 years ago

This post is actually more informative: http://radimrehurek.com/2013/12/performance-shootout-of-nearest-neighbours-contestants/

lindahua commented 10 years ago

From this post, it appears to me that FLANN is the most reasonable choice at this point.

lindahua commented 10 years ago

I would suggest having a separate package (say FLANN.jl) as a wrapper, and having this package depend on it.

johnmyleswhite commented 10 years ago

Yes, I think that's the right approach.

wildart commented 10 years ago

Using FLANN requires manual memory management, because it maintains an in-memory index. How does that fit into the proposed workflow of creating a model and using it for multiple predictions? It would require either freeing the resources at the end of model usage or rebuilding the index on every search.

lindahua commented 10 years ago

Just treat the FLANN index the way we treat any other library that holds external resources (e.g. database connections).

We require the user to free the index when they have finished using it.
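
As a rough illustration of that pattern, here is a minimal sketch in Python. It is not the FLANN API: `NearestNeighborIndex`, `query`, and `close` are hypothetical names, and a brute-force search over a plain list stands in for the externally allocated index so the example runs without FLANN installed. The point is only the lifecycle: build the index once, reuse it for many queries, then free it explicitly, the same way one would close a database connection.

```python
import math


class NearestNeighborIndex:
    """Hypothetical stand-in for a wrapper around an externally allocated index.

    A real wrapper would hold a handle returned by the C library; here a
    plain list plays that role.
    """

    def __init__(self, points):
        # "Build" the index once, when the model is created.
        self._points = [tuple(p) for p in points]
        self._closed = False

    def query(self, point, k=1):
        # Refuse to touch the index after it has been released.
        if self._closed:
            raise RuntimeError("index has already been freed")
        # Brute-force search, standing in for the library's approximate search.
        order = sorted(
            range(len(self._points)),
            key=lambda i: math.dist(self._points[i], point),
        )
        return order[:k]

    def close(self):
        # Release the (simulated) external resource; safe to call repeatedly.
        self._points = None
        self._closed = True

    # Context-manager support, so a `with` block frees the index on exit.
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


with NearestNeighborIndex([(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]) as index:
    # One index, many predictions.
    first = index.query((0.9, 1.2), k=1)
    second = index.query((4.0, 4.5), k=2)
```

Once the `with` block exits, the index is freed and any further `query` call raises an error instead of touching released memory. A wrapper could expose the same idea through an explicit `close` call alone; the scoped form just makes it harder to forget.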