At Qdrant, performance is the top priority. We always make sure to use system resources efficiently so you get the fastest and most accurate results at the lowest cloud cost. All of our decisions, from choosing Rust, to I/O optimisations, serverless support, binary quantization, and our FastEmbed library, are guided by this principle. In this article, we compare how Qdrant performs against other vector search engines.
Here are the principles we followed while designing these benchmarks:
We do comparative benchmarks, which means we focus on relative numbers rather than absolute numbers.
We use affordable hardware, so that you can reproduce the results easily.
We run benchmarks on the same exact machines to avoid any possible hardware bias.
All the benchmarks are open-sourced, so you can contribute and improve them.
Scenarios we tested
Some of our experiment design decisions are described in the F.A.Q. section. Reach out to us on our Discord channel if you want to discuss anything related to Qdrant or these benchmarks.
Single node benchmarks
We benchmarked several vector databases in various configurations on different datasets to see how the results vary. The datasets differ in vector dimensionality as well as in the distance function used. We also tried to capture the differences you can expect from different configuration parameters, both for the engine itself and for the search operation separately.
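The two headline numbers in these comparisons, precision and RPS, are simple to define. This is not the actual open-sourced benchmark harness; it is a minimal stdlib Python sketch where `search_fn` is a hypothetical stand-in for any engine's query call:

```python
import time

def precision_at_k(exact_ids, engine_ids, k=10):
    """Fraction of the exact top-k neighbours that the engine also returned."""
    return len(set(exact_ids[:k]) & set(engine_ids[:k])) / k

def measure_rps(search_fn, queries):
    """Requests per second for a single client: queries over wall-clock time."""
    start = time.perf_counter()
    for q in queries:
        search_fn(q)
    return len(queries) / (time.perf_counter() - start)

# Toy example: exact search returns ids 0..9, the engine misses two of them.
exact = list(range(10))
approx = list(range(8)) + [100, 101]
print(precision_at_k(exact, approx))  # 0.8
```

The precision threshold mentioned below is exactly this kind of measure: results are only compared at configurations that reach the same precision against exact (brute-force) search.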
Observations

Most of the engines have improved since our last run. Both life and software have trade-offs, but some clearly do better:
Qdrant achieves the highest RPS and lowest latencies in almost all scenarios, no matter the precision threshold and the metric we choose. It has also shown a 4x RPS gain on one of the datasets.
Elasticsearch has become considerably faster in many cases, but it is very slow in terms of indexing time. It can be 10x slower when storing 10M+ vectors of 96 dimensions! (32 mins vs. 5.5 hrs)
Milvus is the fastest when it comes to indexing time and maintains good precision. However, it is not on par with the others in terms of RPS or latency when you have higher-dimensional embeddings or a larger number of vectors.
Redis is able to achieve good RPS, but mostly at lower precision. It also achieved low latency with a single thread; however, its latency climbs quickly with more parallel requests. Part of this speed gain comes from its custom protocol.
Weaviate has improved the least since our last run. Because of the relative improvements in the other engines, it has become one of the slowest in terms of both RPS and latency.
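The single-thread versus parallel-request latency effect noted above can be reproduced with a sketch like the one below. This is not the benchmark harness itself; `fake_search` is a hypothetical stand-in with a fixed ~1 ms cost, and swapping in a real client call would show how an engine's tail latency behaves under load:

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def timed_search(search_fn, query):
    """Run one query and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    search_fn(query)
    return time.perf_counter() - start

def latency_under_load(search_fn, queries, parallel):
    """Issue queries from `parallel` concurrent clients; return (p50, p95) latency."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        latencies = list(pool.map(lambda q: timed_search(search_fn, q), queries))
    # statistics.quantiles with n=20 gives 19 cut points; index 18 is the 95th percentile.
    return statistics.median(latencies), statistics.quantiles(latencies, n=20)[18]

def fake_search(q):
    time.sleep(0.001)  # simulated engine with ~1 ms per-request latency

p50, p95 = latency_under_load(fake_search, range(200), parallel=8)
print(p50, p95)
```

Comparing the p95 numbers at parallel=1 versus parallel=100 is how a latency curve like the Redis one described above shows up in practice.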
Updated: January 2024
Download raw data: here