yahoojapan / NGT

Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Apache License 2.0

Python bindings for QG/QBG #142

Open siddhsql opened 1 year ago

siddhsql commented 1 year ago

Hello - I came across this repo yesterday. Nice work! Reading the documentation, I understand NGT is deprecated in favor of the higher-performing QG/QBG libraries, but Python support for them is limited to search only, so Python cannot be used to build and save the index. Do you have a plan for adding more Python support? When can we expect to see it? Thanks.

masajiro commented 1 year ago

Hello, I don't think that NGT can be replaced by QG/QBG because they have different pros and cons. In any case, I am going to implement Python functions to build QG/QBG indices in the future. I can't say for sure when.

siddhsql commented 1 year ago

Hi Masajiro, re: "I don't think that NGT can be replaced by QG/QBG because they have different pros and cons." The README file says: "The command-line interface ngtq and ngtqg are now obsolete by replacing qbg https://github.com/yahoojapan/NGT/blob/main/bin/qbg/README.md. (v2.0.0)"

Also, the README says: "QBG https://github.com/yahoojapan/NGT/blob/main/bin/qbg/README.md can handle billions of objects". Does this mean it handles cases where the entire dataset does not fit in memory?

and README says (for QBG):

Does this mean QBG does not support the cosine distance measure?

masajiro commented 1 year ago

To be clear, in my last comment, NGT means neither NGTQ nor NGTQG, but the basic graph-based index.

QBG keeps the quantized dataset in memory rather than the entire dataset, and supports only L2 distance. However, I am going to add an option to place the entire dataset in memory, as well as support for cosine similarity, in the near future. The in-memory option will be usable only when there is sufficient memory.
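
A general note on the L2-only limitation: for unit-length vectors, squared L2 distance equals 2 - 2·cos(a, b), so ranking neighbors by L2 distance on normalized vectors gives the same order as ranking by cosine similarity on the originals. The sketch below illustrates this in plain Python with exact (brute-force) distances. It does not call any NGT or QBG API, all helper names are hypothetical, and it does not account for any effect quantization may have on results in an approximate index.

```python
import math


def normalize(v):
    """Scale v to unit L2 norm (hypothetical helper, not part of NGT)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def squared_l2(a, b):
    """Squared Euclidean distance between a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_cosine(query, data):
    """Indices of data sorted by decreasing cosine similarity to query."""
    return sorted(range(len(data)),
                  key=lambda i: -cosine_similarity(query, data[i]))


def rank_by_l2_on_normalized(query, data):
    """Indices of data sorted by increasing L2 distance after normalization."""
    q = normalize(query)
    unit = [normalize(v) for v in data]
    return sorted(range(len(data)), key=lambda i: squared_l2(q, unit[i]))


# Toy data, made up for illustration only.
data = [[1.0, 0.0], [3.0, 3.0], [0.0, 2.0], [-1.0, 0.5]]
query = [2.0, 1.0]

cosine_order = rank_by_cosine(query, data)
l2_order = rank_by_l2_on_normalized(query, data)
print(cosine_order, l2_order)
```

Under this assumption, normalizing both the stored vectors and the queries before handing them to an L2-only index is a common way to approximate cosine search until native support is available.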