zilliztech / VectorDBBench

A Benchmark Tool for VectorDB
MIT License

Testbed Configuration Description #245

rodrigonascimento closed this issue 2 months ago

rodrigonascimento commented 7 months ago

Hi,

Looking at the published results, there is a column named "Databases with different hardware resources". Is there a description of the hardware/software resources used to achieve the results?

For instance, what does "Milvus-16c64g-hnsw" mean? I'm guessing 16c = 16 cores and 64g = 64 GiB RAM. Is it a standalone deployment or a cluster?

Thanks,

--Rodrigo

alwayslove2013 commented 7 months ago

@rodrigonascimento All tested Milvus instances are standalone.

rodrigonascimento commented 5 months ago

Thanks, @alwayslove2013! Do you know if there is a description of the hardware used by each database? This is important information when comparing the performance of two or more systems.

alwayslove2013 commented 5 months ago

@rodrigonascimento The hardware is well defined for open-source vector databases: Milvus-2c8g, for example, means 2 vCPUs and 8 GB of memory. However, most cloud databases do not publish explicit hardware descriptions and instead use their own "custom notation". For example, Zillizcloud uses CUs (compute units), and Pinecone uses pod type and size.
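As an illustration of the naming convention described above, here is a minimal sketch that splits a result label into its parts. The pattern `<database>-<N>c<M>g[-<index>]` is inferred from the examples in this thread (`Milvus-16c64g-hnsw`, `Milvus-2c8g`); the names `HardwareLabel` and `parse_label` are hypothetical and are not part of VectorDBBench.

```python
import re
from typing import NamedTuple, Optional


class HardwareLabel(NamedTuple):
    database: str
    vcpus: int
    memory_gb: int
    index: Optional[str]  # e.g. "hnsw"; None when the label has no index suffix


# Assumed label shape, inferred from the thread: <database>-<N>c<M>g[-<index>]
_LABEL_RE = re.compile(
    r"^(?P<db>.+?)-(?P<cpu>\d+)c(?P<mem>\d+)g(?:-(?P<index>.+))?$"
)


def parse_label(label: str) -> Optional[HardwareLabel]:
    """Return the parsed label, or None for vendor-specific notations."""
    m = _LABEL_RE.match(label)
    if m is None:
        return None
    return HardwareLabel(m["db"], int(m["cpu"]), int(m["mem"]), m["index"])


print(parse_label("Milvus-16c64g-hnsw"))
print(parse_label("zillizcloud-8cu-perf"))  # vendor notation, not parseable
```

Labels that use a vendor's own units (CUs, pod types) do not match the pattern, so the function returns `None` for them rather than guessing at the underlying hardware.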

alwayslove2013 commented 5 months ago

@rodrigonascimento It is not appropriate to compare performance directly when hardware resources cannot be aligned. However, cloud databases do publish prices, so we can compare them by price/performance ratio, such as QP$ (QPS per dollar).
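A minimal sketch of a price/performance calculation follows. The thread does not spell out the exact formula behind QP$, so this assumes an hourly instance price and reports how many queries one dollar buys at a sustained QPS. The function name and the sample numbers are hypothetical, not published results.

```python
def queries_per_dollar(qps: float, price_per_hour: float) -> float:
    """Queries served per dollar, assuming a sustained QPS and an hourly price."""
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return qps * 3600 / price_per_hour


# Hypothetical numbers for illustration only.
print(queries_per_dollar(qps=500, price_per_hour=0.5))
```

Normalizing by price gives a common basis for two services whose hardware cannot be aligned, provided both prices cover the same billing period.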

rodrigonascimento commented 5 months ago

@alwayslove2013, you're right. I didn't express myself clearly. I've been running the benchmark on my own infrastructure and getting results that do not align with the published ones. I'm asking about the hardware spec because I want to know whether my servers have a similar configuration to the ones used for the published results.

alwayslove2013 commented 5 months ago

@XuanYang-cn Could you please provide information about the specific hardware that was used to deploy and test Milvus?

baiwfg2 commented 1 month ago

@alwayslove2013 Hi. After looking at the results, I see there are 2c8g, 4c16g, and so on. Precisely speaking, we can't compare Milvus-16c64g-hnsw with PgVector-2c8g, right? It's an unfair comparison.

Another thing: why is Milvus the only one tested at 16c64g? Can't other databases be tested on such hardware? This may make people think the results are biased.

alwayslove2013 commented 1 month ago

@baiwfg2 Absolutely, it's unfair.

"Precisely speaking, we can't compare Milvus-16c64g-hnsw with PgVector-2c8g, right? It's an unfair comparison."

I'd like to share our thoughts on how we choose the test instance type.

First, most cloud database vendors define their own units for commercial reasons, such as Pinecone's p1.x1 / s1.x4 and Zillizcloud's 1cu-perf / 4cu-cap, and vendors such as Weaviate do not disclose the specific machine resources.

So when we choose a test instance, we can only go by the dataset: we select the "minimum" instance that can accommodate it (the "minimum" is our own estimate) so that the computing resources are roughly aligned. For example, for the Cohere 10M * 768dim test, the chosen instances are milvus-16c64g, zillizcloud-8cu-perf / 2cu-cap, pinecone p2.x1.8node, etc.
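The "minimum instance that can accommodate the dataset" idea can be sketched roughly as below. This is not the procedure actually used for the published results: it assumes float32 vectors, treats the "g" in each label as GiB, and applies a hypothetical 1.5x overhead factor for the index and runtime. The function names and the instance table are illustrative only.

```python
from typing import Dict, Optional

GIB = 1024 ** 3


def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Size of the raw vectors alone (float32 assumed: 4 bytes per value)."""
    return num_vectors * dim * bytes_per_value


def smallest_fitting_instance(
    dataset_bytes: int,
    instances: Dict[str, int],  # label -> memory in GiB
    overhead: float = 1.5,      # hypothetical allowance for index and runtime
) -> Optional[str]:
    """Return the smallest-memory instance that fits, or None if none does."""
    needed_gib = dataset_bytes * overhead / GIB
    fitting = [(mem, name) for name, mem in instances.items() if mem >= needed_gib]
    return min(fitting)[1] if fitting else None


# Instance sizes mentioned in this thread.
milvus_sizes = {"milvus-2c8g": 8, "milvus-4c16g": 16, "milvus-16c64g": 64}

cohere_10m = raw_vector_bytes(10_000_000, 768)
print(cohere_10m / GIB)  # raw size in GiB, before any index overhead
print(smallest_fitting_instance(cohere_10m, milvus_sizes))
```

Under these assumptions the Cohere 10M * 768dim dataset needs about 43 GiB, which rules out the 8 GiB and 16 GiB instances. A different overhead factor or index type would change the estimate.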

As for why there are no 16c64g tests for pgvector: last year, when we tested Cohere 1M on a 2c8g pgvector instance, the performance was not satisfactory, so we did not go on to test larger configurations.

Since then, pgvector has been updated substantially, and we would like to thank the community developers for contributing the latest pgvector test client, which will help us test pgvector's performance more accurately.

You can use our latest vdbbench to conduct your own tests. Additionally, we plan to restart the pgvector testing in the near future, but this will take some time. Please stay tuned for updates.