zilliztech / VectorDBBench

A Benchmark Tool for VectorDB
MIT License

Optimize pgvector test for semi-recent enhancements #319

Closed jkatz closed 1 month ago

jkatz commented 1 month ago

This commit adds several changes to the pgvector test to create a more representative test environment based on both recent and older changes to pgvector. Notable changes include support for testing parallel index building parameters, loading data with the recommended binary loading method, and other changes that better emulate what a typical pgvector user would do.
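For readers unfamiliar with the parallel index building parameters, here is a minimal sketch of the session settings involved. It is not the code from this PR. It assumes an HNSW index with L2 distance, and the helper name `hnsw_build_statements`, the table and column names, and the default values are hypothetical. `max_parallel_maintenance_workers` and `maintenance_work_mem` are standard PostgreSQL settings, and `m` and `ef_construction` are pgvector's HNSW build options.

```python
def hnsw_build_statements(table, column, *, m=16, ef_construction=64,
                          max_parallel_workers=7, maintenance_work_mem="8GB"):
    """Return, in order, the SQL statements for a parallel HNSW index build.

    Hypothetical helper for illustration only; it returns strings and
    does not talk to a database.
    """
    # Identifiers are interpolated directly, so accept only plain names.
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        # More memory lets the build keep the graph in memory for longer.
        f"SET maintenance_work_mem = '{maintenance_work_mem}'",
        # Number of extra workers PostgreSQL may use for the index build.
        f"SET max_parallel_maintenance_workers = {int(max_parallel_workers)}",
        # Assumption: L2 distance; other operator classes work the same way.
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_l2_ops) "
        f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)})",
    ]


if __name__ == "__main__":
    for stmt in hnsw_build_statements("items", "embedding", max_parallel_workers=4):
        print(stmt + ";")
```

A benchmark could vary `max_parallel_workers` between runs and time the `CREATE INDEX` statement to see how build time scales.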

This commit also includes some general cleanups.
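The description above does not show the loading code, so the following is only a sketch of what binary loading means at the byte level, not the PR's implementation. It frames rows as a PostgreSQL `COPY ... (FORMAT BINARY)` stream and encodes each vector in pgvector's binary layout (16-bit dimension count, 16-bit unused field, then big-endian 32-bit floats). The function names and the `(bigint id, vector)` row shape are assumptions made for illustration.

```python
import struct

# Fixed 11-byte signature that opens every COPY BINARY stream.
COPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def encode_vector(values):
    """Encode a sequence of floats in pgvector's binary vector layout."""
    # uint16 dimension count, uint16 unused (0), then big-endian float32 values.
    return struct.pack(f">HH{len(values)}f", len(values), 0, *values)


def copy_binary_stream(rows):
    """Build a COPY BINARY byte stream from (id, vector) pairs.

    Assumes a hypothetical table with a bigint id column followed by a
    vector column.
    """
    out = bytearray(COPY_SIGNATURE)
    out += struct.pack(">ii", 0, 0)  # flags field, header extension length
    for row_id, vector in rows:
        id_bytes = struct.pack(">q", row_id)  # bigint, big-endian
        vec_bytes = encode_vector(vector)
        out += struct.pack(">h", 2)  # number of fields in this tuple
        out += struct.pack(">i", len(id_bytes)) + id_bytes
        out += struct.pack(">i", len(vec_bytes)) + vec_bytes
    out += struct.pack(">h", -1)  # file trailer
    return bytes(out)


if __name__ == "__main__":
    stream = copy_binary_stream([(1, [0.1, 0.2, 0.3]), (2, [0.4, 0.5, 0.6])])
    print(len(stream), "bytes")
```

In practice a driver normally does this framing when `COPY` is run with the binary format; the sketch only illustrates why the binary path avoids parsing each vector from text.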

Co-authored-by: Mark Greenhalgh <greenhal@users.noreply.github.com>
Co-authored-by: Tyler House <tahouse@users.noreply.github.com>

XuanYang-cn commented 1 month ago

/assign @alwayslove2013 /assign

jkatz commented 1 month ago

@XuanYang-cn @alwayslove2013 Please let us know if this PR requires additional work. There are some other changes we'd like to include for testing other configurations of pgvector, but we'd like to baseline it against the flat implementation first. Thanks!

jkatz commented 1 month ago

@alwayslove2013 Thanks for the feedback! This is resolved in the latest push.

Overall, I would suggest moving to psycopg3 (psycopg) as it's now the maintained version of psycopg; however, that change could be made in a separate pull request.

alwayslove2013 commented 1 month ago

@jkatz Thank you so much for your contribution! We greatly appreciate it and are thrilled to receive your pull request. We look forward to collaborating with you and driving the project forward together!

> Overall, I would suggest moving to psycopg3 (psycopg) as it's now the maintained version of psycopg; however, that change could be made in a separate pull request.

jkatz commented 1 month ago

@alwayslove2013 Likewise. I personally appreciate the approach VectorDBBench takes around testing concurrency, which resembles how users interact with databases.

I've pushed a fix to the latest patch to handle the merge conflict that remained (I'm still baffled as to how that got in, but I'll triple-check next time).

sre-ci-robot commented 1 month ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alwayslove2013, jkatz

To complete the pull request process, please assign xuanyang-cn after the PR has been reviewed.

You can assign the PR to them by writing `/assign @xuanyang-cn` in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/zilliztech/VectorDBBench/blob/main/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment

Approvers can cancel approval by writing `/approve cancel` in a comment

alwayslove2013 commented 1 month ago

@jkatz I would like to express my sincere gratitude for your support. Our primary goal has been to ensure that the test data reflects the performance characteristics of real-world usage scenarios as accurately as possible.

> ... testing concurrency, which resembles how users interact with databases.

If you have any suggestions or innovative ideas for VDBBench, we would be more than happy to discuss them with you. Your input is crucial for us to enhance the functionality and user experience of the tool.