unum-cloud / usearch

Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
https://unum-cloud.github.io/usearch/
Apache License 2.0
2.15k stars, 130 forks

Feature: Update API for automatic reindexing #323

Open kernelsoe opened 9 months ago

kernelsoe commented 9 months ago

Describe what you are looking for

There are already add and remove APIs, but for frequently updated data such as a user's notes, it would be nice to have an update API. 🙏

Is your feature request specific to a certain interface?

C++ implementation

Contact Details

kernelsoe@gmail.com

ashvardanian commented 9 months ago

Thank you @kernelsoe! Such an API indeed makes sense. We can implement it at least slightly more efficiently than the remove & add combo by reusing the same memory buffers under the hood. In general, however, remove is discouraged, whether standalone or as part of another operation.

How many vectors do you generally deal with?

kernelsoe commented 9 months ago

@ashvardanian For my use case, I load a few hundred user notes from sqlite and add them to usearch on app startup, with no remove operations. Then, as the user edits one or more notes, I reindex them by removing each one and adding it again with the same key.

Doing this combined operation multiple times in one function call would sometimes throw an error, so I did something like the below for now 😄:

for (const payload of payloads) {
  await remove()...
  await timeout(300)
}

for (const payload of payloads) {
  await add()...
  await timeout(300)
}

Maybe I should use a batch operation?
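A minimal sketch of the same remove-then-add reindexing without the fixed timeouts, under stated assumptions: `FakeIndex` is a hypothetical in-memory stand-in and not the usearch API, and `update` and `updateAll` are illustrative helper names. It only shows the idea of awaiting each remove/add pair before starting the next one, so the pairs never overlap.

```javascript
// Hypothetical in-memory stand-in for a vector index; NOT the usearch API.
class FakeIndex {
  constructor() {
    this.vectors = new Map();
  }

  // Store a vector under a key; reject duplicates, as a real index might.
  async add(key, vector) {
    if (this.vectors.has(key)) {
      throw new Error(`duplicate key ${key}`);
    }
    this.vectors.set(key, Float32Array.from(vector));
  }

  // Drop the vector stored under a key; resolves to true if it existed.
  async remove(key) {
    return this.vectors.delete(key);
  }
}

// Replace the vector under `key`: remove the old entry, then add the new one.
async function update(index, key, vector) {
  await index.remove(key);
  await index.add(key, vector);
}

// Apply many updates strictly one after another, so that no
// remove/add pair starts before the previous pair has finished.
async function updateAll(index, payloads) {
  for (const { key, vector } of payloads) {
    await update(index, key, vector);
  }
}
```

With a real index, whether this ordering alone avoids the errors described above depends on how the binding handles removals, which this sketch does not model.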

sroussey commented 9 months ago

Aside: There is a sqlite plugin, sqlite-vss, that adds vector search, but it unfortunately uses FAISS Flat rather than HNSW. In addition, updates are not supported, so it seems the issue is elsewhere. I think pg-vector rolled their own.

While I wish this were inside sqlite, I am looking forward to using it side by side like you are.