Support inner product distance

timescale / pgvectorscale

A complement to pgvector for high performance, cost efficient vector search on large workloads.

PostgreSQL License

Open JHawk0224 opened 5 months ago

JHawk0224 commented 5 months ago

Right now only cosine is supported, but it would be great to have support for inner product (<#>) as well!

cevian commented 5 months ago

Please vote on this issue if there is interest. We are also happy to get PRs. Otherwise, we'll prioritize this on our roadmap too.

cevian commented 5 months ago

@JHawk0224 It would also be useful to know which models need this, i.e. why this is important for your use case.

irowberryFS commented 5 months ago

All OpenAI embedding models normalize their embeddings to unit length, so inner product gives the same ranking as cosine and (as far as I know) is faster, since it skips the norm computation.
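A minimal sketch of the point about normalized embeddings, using made-up 3-dimensional vectors rather than real model output. The helper names (`dot`, `normalize`, `cosine_similarity`, `rank`) are hypothetical and are not part of pgvector or pgvectorscale. It only illustrates that, for unit-length vectors, the inner product equals the cosine similarity, so the two produce the same ordering.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine similarity, computing both norms explicitly."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def rank(query, docs, score):
    """Return document names ordered from most to least similar."""
    return sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)

# Toy vectors, made up for illustration and normalized up front,
# the way a model that emits unit-length embeddings would return them.
query = normalize([0.2, 0.9, 0.4])
docs = {
    "doc_a": normalize([0.1, 0.8, 0.5]),
    "doc_b": normalize([0.9, 0.1, 0.0]),
    "doc_c": normalize([0.3, 0.7, 0.6]),
}

by_inner_product = rank(query, docs, dot)
by_cosine = rank(query, docs, cosine_similarity)

# With unit-length inputs both norms are 1, so the division in
# cosine_similarity changes nothing and the rankings agree.
print(by_inner_product == by_cosine)
```

Note that pgvector's `<#>` operator returns the negative inner product, so an ascending `ORDER BY` with `<#>` corresponds to the descending similarity order used here.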

cho-thinkfree-com commented 2 months ago

BGE-M3 produces a sparse vector that assigns a weight to each token. Taking the inner product of two such sparse vectors makes it easy to find out which tokens they have in common.
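A rough sketch of why inner product suits sparse token weights. The dict-of-weights representation, the token strings, and the weights are assumptions made for illustration and are not actual model output. The functions `sparse_inner_product` and `shared_tokens` are hypothetical helpers.

```python
def sparse_inner_product(a, b):
    """Inner product of two sparse vectors stored as {token: weight} dicts.

    Only tokens present in both vectors contribute, so the score is
    non-zero exactly when the two texts share at least one weighted token.
    """
    # Iterate over the smaller dict to keep the work proportional to it.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[token] for token, weight in a.items() if token in b)

def shared_tokens(a, b):
    """Tokens that contribute to the inner product."""
    return sorted(a.keys() & b.keys())

# Made-up weights, chosen as exact binary fractions.
query = {"vector": 0.5, "search": 0.25}
document = {"vector": 0.5, "postgres": 0.75}

print(sparse_inner_product(query, document))
print(shared_tokens(query, document))
```

Cosine distance would divide this score by both vector norms, which changes the magnitudes and can change the ranking when the sparse vectors are not normalized.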