milvus-io / knowhere

Knowhere is an open-source vector search engine, integrating FAISS, HNSW, etc.
Apache License 2.0
207 stars 83 forks source link

Adopt new strategy for faiss IVF range search #1016

Closed cydrain closed 1 year ago

mergify[bot] commented 1 year ago

@cydrain Please associate the related issue to the body of your Pull Request. (eg. “issue: #”)

cydrain commented 1 year ago

/kind improvement

PwzXxm commented 1 year ago

This PR removes IVF Range Search nprobe parameter and searches until no results are found.

My concern is that the radius might result in many buckets only having a few or even one point within the radius. Thus the performance will have a vast degradation. Is there any performance report?

PwzXxm commented 1 year ago

Checked the performance report internally. I'm not sure whether the performance gain is from pulling up the bitset check or the algorithm itself.

The UX here changes, the user can no longer adjust their QPS/Recall at the fixed radius. Is this the way we want to go down? @liliu-z

liliu-z commented 1 year ago

Checked the performance report internally. I'm not sure whether the performance gain is from pulling up the bitset check or the algorithm itself.

The UX here changes, the user can no longer adjust their QPS/Recall at the fixed radius. Is this the way we want to go down? @liliu-z

  1. Growing search need range search without any param.
  2. We need align the API definition for all Knowhere indexes. Also, for range search, users can not balance recall/QPS for DiskANN/HNSW's implementation.

If necessary, what we should do it to expose another set of APIs with definition that allows users to balance recall/QPS instead of mixing them.

liliu-z commented 1 year ago

This PR removes IVF Range Search nprobe parameter and searches until no results are found.

My concern is that the radius might result in many buckets only having a few or even one point within the radius. Thus the performance will have a vast degradation. Is there any performance report?

https://zilliverse.feishu.cn/docx/JMVMd91KbodZKqxs2TrcIWywn5F BTW, I think we need to align the definition of interface first. Then we can see how to balance recall/QPS by modifying the tolerance.

sre-ci-robot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cydrain, liliu-z

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/milvus-io/knowhere/blob/main/OWNERS)~~ [cydrain,liliu-z] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment