erikbern / ann-benchmarks

Benchmarks of approximate nearest neighbor libraries in Python
http://ann-benchmarks.com
MIT License
4.94k stars 747 forks source link

Is there any plan to compare with Rayuela.jl ?? #131

Open zsz00 opened 5 years ago

zsz00 commented 5 years ago

Is there any plan to compare with Rayuela.jl ??

https://github.com/una-dinosauria/Rayuela.jl This is the code for the paper Julieta Martinez, Shobhit Zakhmi, Holger H. Hoos, James J. Little. LSQ++: Lower running time and higher recall in multi-codebook quantization. In ECCV 2018. [CVF].

Rayuela.jl is a package that implements non-orthogonal multi-codebook quantization methods (MCQ). These methods are useful for fast search of high-dimensional (d=100s-1000s) dense vectors. Rayuela is written in Julia.

erikbern commented 5 years ago

@zsz00 i would love to add it. maybe the authors can take a look. @una-dinosauria ?

maumueller commented 5 years ago

Given that it requires a GPU and doesn't have a Python wrapper, this would be a nice test case for our wrapper protocol https://github.com/erikbern/ann-benchmarks/tree/master/protocol :-)

@una-dinosauria I would be happy to help with that.

una-dinosauria commented 5 years ago

Hello everyone!

I'm super glad to see that you'd like to give my code a try. That said, I think it would be a bit hard for it to achieve competitive performance in this benchmark, as I have not implemented an IVF or any other shortlisting data structure -- In our ECCV18 paper, we assumed exhaustive search, as we compared only the encoding quality.

There would be two ways to get around this:

  1. Implement a quick and dirty IVF into Rayuela (apparently someone implemented this in Julia already: https://github.com/una-dinosauria/Rayuela.jl/issues/37), or
  2. Port our method into a library that already has shortlisting structures, such as FAISS (IMO, preferrable, since IIUC they have implemented more recent structures too). Rayuela has a C++ version of the code update method, and the codebook update should be easy to port as it only calls BLAS a few times.

Let me know if anyone is interested in pursuing this, and I'd be happy to provide more pointers.