timescale / pgvectorscale

A complement to pgvector for high performance, cost efficient vector search on large workloads.
PostgreSQL License

Feature Request: Access underlying clusters/groups #103

Open irowberryFS opened 2 weeks ago

irowberryFS commented 2 weeks ago

If I understand the idea behind DiskANN correctly (I may be completely misunderstanding it), building the index performs clustering for free, as with HNSW. It would be an amazing feature to be able to get each vector's "cluster". This would be really useful for entity resolution, de-duplication, and blocking, because it would avoid having to query for every point in the database.

oliver-kriska commented 1 week ago

< 2000, so it has to be 1999 or less

cevian commented 1 week ago

Unfortunately, DiskANN doesn't do clustering, so there are no clusters for us to expose. However, we'll consider adding clustering functions in the future.