irthomasthomas / undecidability

Vector Database Benchmarks - Qdrant #738

Open irthomasthomas opened 2 months ago

irthomasthomas commented 2 months ago

DESCRIPTION:
Benchmarking Vector Databases

At Qdrant, performance is the top priority. We always make sure that we use system resources efficiently so that you get the fastest and most accurate results at the lowest cloud cost. All of our decisions, from choosing Rust, I/O optimisations, serverless support, and binary quantization to our fastembed library, are based on this principle. In this article, we compare how Qdrant performs against other vector search engines.

Here are the principles we followed while designing these benchmarks:

Scenarios we tested

Some of our experiment design decisions are described in the F.A.Q. section. Reach out to us on our Discord channel if you want to discuss anything related to Qdrant or these benchmarks.

Single node benchmarks

We benchmarked several vector databases, each in various configurations, on different datasets to check how the results vary. The datasets differ in vector dimensionality and in the distance function used. We also tried to capture the differences to expect from different configuration parameters, both for the engine itself and for the search operation.
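As an illustration of how a distance function and a precision figure relate, here is a minimal sketch. It assumes that "angular" in a dataset name means ranking by cosine similarity and that precision is the share of the exact top-k neighbours that an approximate search returns. Both are assumptions, not a description of the benchmark's actual harness, and the helper names `exact_top_k` and `precision_at_k` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    # Assumes both vectors are non-zero; a zero vector would divide by zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, vectors, k):
    # Brute-force ground truth: score every vector, keep the k most similar.
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(query, vectors[i]),
        reverse=True,
    )
    return ranked[:k]

def precision_at_k(approx_ids, exact_ids):
    # Fraction of the true neighbours that the approximate result contains.
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Toy data (hypothetical): four 2-dimensional vectors and one query.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.05]

truth = exact_top_k(query, vectors, k=2)
approx = [0, 2]  # pretend an approximate index returned these ids
print(truth, precision_at_k(approx, truth))
```

In this toy case the approximate result shares one of the two true neighbours, so the precision is 0.5. A real run would average this value over many queries.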

Updated: January 2024

| Engine | Setup | Dataset | Upload Time (min) | Upload + Index Time (min) | P95 (ms) | RPS | P99 (ms) | Latency (ms) | Precision |
|---|---|---|---|---|---|---|---|---|---|
| qdrant | qdrant-sq-rps-m-32-ef-256 | deep-image-96-angular | 4.94 | 43.81 | 64.42 | 1608.64 | 124.57 | 60.51 | 0.92 |
| elasticsearch | elasticsearch-m-32-ef-128 | deep-image-96-angular | 42.66 | 324.79 | 100.069 | 1151.34 | 115.11 | 84.58 | 0.93 |
| redis | redis-m-32-ef-512 | deep-image-96-angular | 526.56 | 526.56 | 124.26 | 834.080 | 128.35 | 115.22 | 0.92 |
| weaviate | weaviate-m-16-ef-128 | deep-image-96-angular | 70.81 | 70.81 | 545.95 | 640.37 | 922.11 | 154.72 | 0.94 |
| milvus | milvus-m-32-ef-128 | deep-image-96-angular | 4.53 | 37.45 | 212.64 | 580.72 | 237.94 | 168.82 | 0.98 |
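To compare the rows above, a short sketch like the following can rank the engines by throughput and express each one relative to a baseline. The values are copied from the table. The helper names `rank_by_rps` and `relative_throughput` are hypothetical, and the comparison ignores that the rows were measured at different precision levels (0.92 to 0.98), so the ratios are only a rough guide.

```python
# (engine, RPS, mean latency in ms, precision), copied from the table above
results = [
    ("qdrant", 1608.64, 60.51, 0.92),
    ("elasticsearch", 1151.34, 84.58, 0.93),
    ("redis", 834.080, 115.22, 0.92),
    ("weaviate", 640.37, 154.72, 0.94),
    ("milvus", 580.72, 168.82, 0.98),
]

def rank_by_rps(rows):
    # Highest requests-per-second first.
    return sorted(rows, key=lambda row: row[1], reverse=True)

def relative_throughput(rows, baseline):
    # RPS of every engine divided by the RPS of the chosen baseline engine.
    base_rps = next(rps for name, rps, _, _ in rows if name == baseline)
    return {name: rps / base_rps for name, rps, _, _ in rows}

ranking = rank_by_rps(results)
ratios = relative_throughput(results, baseline="qdrant")

for name, rps, latency_ms, precision in ranking:
    print(f"{name:<14} rps={rps:>8.2f}  latency={latency_ms:>7.2f} ms  "
          f"precision={precision:.2f}  vs qdrant={ratios[name]:.2f}x")
```

A fairer comparison would group results by a minimum precision threshold before ranking, since an engine tuned for 0.98 precision is doing more work per query than one tuned for 0.92.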

Download raw data: here

Observations

Most of the engines have improved since our last run. Both life and software have trade-offs, but some engines clearly do better:

Suggested labels

{'label-name': 'Database-Performance', 'label-description': 'Focuses on benchmarking and comparing the performance of different vector databases.', 'confidence': 51.15}

irthomasthomas commented 2 months ago

Related content

304: similarity score 0.86
625: similarity score 0.86
641: similarity score 0.86
456: similarity score 0.85
74: similarity score 0.85
386: similarity score 0.85