Stevenic / vectra

Vectra is a local vector database for Node.js with features similar to Pinecone, but built using local files.
MIT License
321 stars 29 forks

[Question] Limiting index size #3

Closed tlaanemaa closed 1 year ago

tlaanemaa commented 1 year ago

Hello,

First of all, I want to congratulate you on the fantastic work you've done with this package! 🎉

I'm considering using the vectra package as a backend for a project running on Node, and was hoping to get some guidance on managing the index size. My goal is to limit the index size to a predefined number, such as 100,000. I'd like to implement this kind of functionality whenever a new item is added:

items = items.slice(items.length - 100000)

By doing this, I would retain only the last 100,000 values added to the index (ordered by when they were added). Is there a way to achieve this with vectra?

Additionally, I was wondering if you could share some insights on the reasonable limits for the index size, given that it employs a linear search. Do you think an index size of 100,000 would be feasible? How about 1 million?

I appreciate your assistance and look forward to your response!

Best regards,

Stevenic commented 1 year ago

Thanks @tlaanemaa! You should be able to write a function like this to limit the number of items in the index:

async function pruneIndex(index, max_items) {
   // Check index against retention policy
   const stats = await index.getIndexStats();
   if (stats.items <= max_items) {
      return;
   }

   // Remove oldest items first
   await index.beginUpdate();
   try {
      const items = await index.listItems();
      while (items.length > max_items) {
         const item = items.shift();
         await index.deleteItem(item.id);
      }
      await index.endUpdate();
   } catch (err) {
      await index.cancelUpdate();
      throw err;
   }
}

So a couple of thoughts on index size... Yes, it's a linear search, but it's still going to be faster than calling an external DB. How large the index can get really just depends on how much memory you want to throw at it. If you're looking to store a ton of items in an index, I'd be very selective about what metadata you store in the index itself. You could either use Vectra's ability to store all metadata externally (for 100,000+ items I'd strongly consider that) or just avoid storing anything large as metadata. If indexing documents, for example, consider storing the offsets for where a chunk starts and stops instead of the chunk text itself. You can then read the chunk in from the file using those offsets only when it's needed.

Another technique would be to organize your index into namespaces. Vectra doesn't have a direct namespace concept, but you can easily mimic namespaces by creating a separate folder and LocalIndex for each namespace. You can then add logic to only keep the last 5 namespaces loaded into memory at any given point...

With Vectra you actually have all the raw pieces needed to build a fairly large and capable DB. Some assembly required, though... You'll probably get better performance on the metadata-filtering side of things using a real DB, but for the core vector search task it should work and scale as well as anything else...

I'll also add that the early versions of Microsoft Exchange were just a bunch of files on disk so don't underestimate what you can build using just a bunch of files and folders.

tlaanemaa commented 1 year ago

Nice, thanks. Yeah, I think keeping the documents separately is the way to go. I'm looking to use this as an "infinite" memory for a chatbot, though nothing is infinite, so I needed some sort of eviction logic. I'll give your code a go. Have you considered adding some sort of eviction logic to the library itself? I think first-in-first-out and least-recently-used would be two very good options, especially for chatbots with rolling memory.
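For the LRU case, a small variant of the pruneIndex function above could work. This is just a sketch: it assumes the application stamps a `lastAccessed` timestamp into each item's metadata whenever a query returns that item, which is not something Vectra does for you:

```javascript
// Hypothetical LRU pruning: instead of deleting in insertion order,
// delete the items whose metadata.lastAccessed timestamp is oldest.
// Assumes the app updates `lastAccessed` whenever an item is used.
async function pruneIndexLru(index, maxItems) {
   const stats = await index.getIndexStats();
   if (stats.items <= maxItems) {
      return;
   }
   await index.beginUpdate();
   try {
      const items = await index.listItems();
      // Sort so the least recently accessed items come first
      items.sort((a, b) => a.metadata.lastAccessed - b.metadata.lastAccessed);
      for (const item of items.slice(0, items.length - maxItems)) {
         await index.deleteItem(item.id);
      }
      await index.endUpdate();
   } catch (err) {
      await index.cancelUpdate();
      throw err;
   }
}
```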

Btw how do you keep metadata separately with Vectra? I didn't manage to find anything in the readme.

Thanks!