tensorchord / pgvecto.rs

Scalable, Low-latency and Hybrid-enabled Vector Search in Postgres. Revolutionize Vector Search, not Database.
https://docs.pgvecto.rs/getting-started/overview.html
Apache License 2.0
1.53k stars 60 forks

feat: ScaNN index #20

Open gaocegege opened 1 year ago

karajan1001 commented 1 year ago

I'll take this.

VoVAllen commented 1 year ago

@karajan1001 ScaNN might be hard to implement, since the original code is tightly coupled with TensorFlow. I would suggest DiskANN instead: it is designed for disk-based workloads, which suits Postgres, and it has also been adopted by Milvus.

karajan1001 commented 1 year ago

Yes, I looked into their paper and source code and noticed that it's tightly coupled with TensorFlow, so it's hard to say how much of the performance comes from the algorithm itself and how much comes from the TensorFlow optimizations. I was just about to ask whether implementing ScaNN without the TensorFlow optimizations would be a good choice.

gaocegege commented 2 months ago

https://services.google.com/fh/files/misc/scann_for_alloydb_whitepaper.pdf

Here is a whitepaper on ScaNN in AlloyDB.