getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

maximum inner product search (MIPS) #175

Chen-Cai-OSU closed 2 years ago

Chen-Cai-OSU commented 3 years ago

Hello,

Thank you very much for the amazing work. I really like the extensive experiments in your paper. I was wondering whether you have ever looked at MIPS? https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html

It is also a very important problem, and I am curious how the performance compares to other solutions such as scann and faiss. I am interested in both brute-force and non-brute-force approaches, just like for KNN in your paper.

I also want to confirm one thing with you:

Thank you!

jeanfeydy commented 3 years ago

Hi @Chen-Cai-OSU ,

Thanks for your interest in the library! Indeed, this is a problem that we have worked on quite a bit: in our KNN benchmarks, we refer to MIPS as "KNN search with a cosine similarity". (Note that we tend to normalize vectors in these benchmarks, but this has no influence on run times for brute-force implementations.)
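To make the link between MIPS and cosine-similarity KNN concrete, here is a minimal brute-force sketch in plain NumPy (not KeOps, and not the benchmark code mentioned above). The helpers `mips_bruteforce` and `normalize` are hypothetical names introduced for illustration. Once rows are scaled to unit norm, the inner product equals the cosine similarity, so the same routine covers both cases.

```python
import numpy as np


def mips_bruteforce(queries, database, k):
    """Hypothetical helper: top-k inner products for each query row.

    Returns (indices, scores), both of shape (n_queries, k),
    sorted by decreasing inner product. Requires k <= len(database).
    """
    scores = queries @ database.T  # dense (n_queries, n_database) matrix
    # argpartition isolates the k largest scores per row, in no particular order
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top, axis=1)
    # sort only those k candidates by decreasing score
    order = np.argsort(-top_scores, axis=1)
    return (
        np.take_along_axis(top, order, axis=1),
        np.take_along_axis(top_scores, order, axis=1),
    )


def normalize(x):
    """Hypothetical helper: scale each row to unit Euclidean norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Synthetic data, for illustration only
rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 64)).astype(np.float32)
queries = rng.standard_normal((5, 64)).astype(np.float32)

# Plain MIPS on the raw vectors
idx, scores = mips_bruteforce(queries, database, k=3)

# Same routine on unit-norm vectors: the scores are now cosine similarities
cos_idx, cos_scores = mips_bruteforce(normalize(queries), normalize(database), k=3)
```

Normalization changes which neighbors are returned, but not the amount of work: both calls build the same dense score matrix, which matches the remark that normalization does not affect brute-force run times.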

To answer your questions:

What do you think? Best regards, Jean

Chen-Cai-OSU commented 3 years ago

Thank you so much! KeOps looks amazing, and your answer clears up some of my doubts. I will definitely give it a try.

My dataset can be quite large: 100M samples with D=64. I will start with a subsample and then move to the full dataset. I will use this thread to keep you updated.
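For a sense of scale, the sketch below estimates the raw storage for 100M vectors with D=64, assuming float32 (the dtype is an assumption, not stated in the thread), and shows one way to run a brute-force search over the database in chunks so that the full score matrix is never materialized. It uses plain NumPy rather than KeOps, and `dense_bytes` and `mips_chunked` are hypothetical helpers.

```python
import numpy as np


def dense_bytes(n_rows, dim, itemsize=4):
    """Bytes needed to store an (n_rows, dim) dense array (float32 by default)."""
    return n_rows * dim * itemsize


def mips_chunked(queries, database, k, chunk_size):
    """Hypothetical helper: brute-force top-k MIPS, one database chunk at a time.

    Only a (n_queries, chunk_size) block of scores is held in memory.
    """
    n_q = queries.shape[0]
    best_scores = np.full((n_q, k), -np.inf, dtype=np.float32)
    best_idx = np.full((n_q, k), -1, dtype=np.int64)
    for start in range(0, database.shape[0], chunk_size):
        chunk = database[start:start + chunk_size]
        scores = queries @ chunk.T
        # global row indices of the current chunk, repeated for every query
        ids = np.broadcast_to(
            np.arange(start, start + chunk.shape[0]), scores.shape
        )
        # merge the running top-k with the new candidates and keep the best k
        cand_scores = np.concatenate([best_scores, scores], axis=1)
        cand_idx = np.concatenate([best_idx, ids], axis=1)
        order = np.argsort(-cand_scores, axis=1)[:, :k]
        best_scores = np.take_along_axis(cand_scores, order, axis=1)
        best_idx = np.take_along_axis(cand_idx, order, axis=1)
    return best_idx, best_scores


# 100M samples, D=64, float32: 100e6 * 64 * 4 bytes = 25.6e9 bytes
full_size = dense_bytes(100_000_000, 64)
print(full_size / 1e9, "GB")

# Small synthetic subsample, for illustration only
rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 64)).astype(np.float32)
queries = rng.standard_normal((4, 64)).astype(np.float32)
chunk_idx, chunk_scores = mips_chunked(queries, database, k=3, chunk_size=128)
```

Starting on a subsample, as planned above, only changes the `database` array passed in; the chunk size trades memory for the number of passes through the loop.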

jeanfeydy commented 3 years ago

I see! Good luck then, and please let us know about your experience with KeOps on such a large-scale problem :-)

Best regards, Jean