Stevenic / vectra

Vectra is a local vector database for Node.js with features similar to pinecone but built using local files.
MIT License

Does queryItems work well for embeddings of all dimension sizes? #47

Open OnlinePage opened 3 months ago

OnlinePage commented 3 months ago

Hi, first of all a big thanks for Vectra 🥺!

For context: I tried Vectra with general embeddings from an OpenAI embedding model, text-embedding-3-small, whose dimension is 1536, and the query results were really good!!

Then I tried switching to another embedding model with dimension 768, and this time the query results weren't of good quality.

As I see it, queryItems uses cosine similarity. So does that affect the results 🤔? Or is there a better way to get good results from embeddings of varying dimensions?

Stevenic commented 2 months ago

Thanks! There are a ton of alternatives to Vectra these days but it's still pretty good at what it does.

As for your question, it's likely going to depend on the quality of the embeddings used... The ability to specify the dimensions of your embeddings is a feature I recently added, so I haven't used it a lot yet. I'm using it with nomic-embed-text-v1.5 via llama.cpp to get my embedding size down to 128 dimensions, and the results seem OK from my testing. But my testing has been very ad hoc...
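Shrinking an embedding like that usually means truncating the vector to its first N dimensions and re-normalizing to unit length, which is the Matryoshka-style approach nomic-embed-text-v1.5 supports. A minimal sketch of that truncation step (the helper name is mine, not part of Vectra's API):

```typescript
// Truncate a Matryoshka-style embedding to `dims` dimensions and
// re-normalize to unit length so cosine similarity still behaves.
// (Illustrative helper, not part of the Vectra API.)
function truncateEmbedding(vector: number[], dims: number): number[] {
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? head : head.map((v) => v / norm);
}
```

Note this only works well with models trained for it; naively truncating a regular embedding model's output tends to hurt quality.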

Cosine similarity is the algorithm used to do the matching and is independent of the actual embeddings used. Your result quality is going to be driven largely by the embedding model you use and the number of dimensions you ask for. There's a feature I could add called re-ranking, which would help improve results across the board, but it would involve adding a keyword index into the mix, which both complicates things and somewhat goes against the spirit of Vectra (simple, lightning fast, and free).
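For reference, cosine similarity itself works the same way at any dimensionality, as long as the query and stored vectors have the same length. A minimal sketch (not Vectra's internal code):

```typescript
// Cosine similarity: dot product divided by the product of the
// vectors' magnitudes. Dimension-agnostic, but both vectors must
// have the same number of dimensions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

This is why mixing models is the thing to avoid: a 768-dimension vector from one model can't be meaningfully compared against vectors from a 1536-dimension model, and even at matching sizes the scores only make sense within a single model's embedding space.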