qdrant / vector-db-benchmark

Framework for benchmarking vector search engines
https://qdrant.tech/benchmarks/
Apache License 2.0

Qdrant Internal Server Error on recreate timeout for > 40M vectors: Waiting for Consensus Operation Commit Failed #162

Open filipecosta90 opened 1 month ago

filipecosta90 commented 1 month ago

I hit an unexpected 500 (Internal Server Error) while running the vector-db-benchmark tool against a remote Qdrant setup. It happens when running several experiments with different index definitions back to back (run benchmark -> recreate collection -> run benchmark -> ...) on a collection holding more than 40M vectors.

The raw response content is as follows:

UnexpectedResponse: Unexpected Response: 500 (Internal Server Error)
Raw response content:
b'{"status":{"error":"Service internal error: Waiting for consensus operation commit failed. Timeout set at: 300 seconds"},"time":300.000856421}'
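The body is plain JSON, so its fields can be extracted to check how long the server worked before giving up. Below is a minimal sketch using only the standard library; the helper name `parse_qdrant_error` is hypothetical, and reading `time` as the server-side processing time in seconds is an assumption based on how Qdrant responses are usually structured.

```python
import json

# Raw body copied verbatim from the response above.
RAW = b'{"status":{"error":"Service internal error: Waiting for consensus operation commit failed. Timeout set at: 300 seconds"},"time":300.000856421}'


def parse_qdrant_error(raw: bytes) -> tuple[str, float]:
    """Return (error message, elapsed seconds) from a raw Qdrant error body."""
    body = json.loads(raw)
    return body["status"]["error"], float(body["time"])


message, elapsed = parse_qdrant_error(RAW)
print(f"{elapsed:.1f}s: {message}")
```

Here `elapsed` comes out just above the configured 300 seconds, which suggests the request ran into the timeout passed in `collection_params` rather than failing early.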

Traceback:

│ /root/vector-db-benchmark/engine/base_client/client.py:107 in run_experiment                     │
│                                                                                                  │
│   104 │   │                                                                                      │
│   105 │   │   if not skip_upload:                                                                │
│   106 │   │   │   print("Experiment stage: Configure")                                           │
│ ❱ 107 │   │   │   self.configurator.configure(dataset)                                           │
│   108 │   │   │                                                                                  │
│   109 │   │   │   print("Experiment stage: Upload")                                              │
│   110 │   │   │   upload_stats = self.uploader.upload(                                           │
│                                                                                                  │
│ ╭──────────────────────────────────────── locals ────────────────────────────────────────╮       │
│ │          dataset = <benchmark.dataset.Dataset object at 0x7da84cfdfc10>                │       │
│ │ execution_params = {}                                                                  │       │
│ │ existing_results = []                                                                  │       │
│ │        parallels = []                                                                  │       │
│ │           reader = <dataset_reader.ann_h5_reader.AnnH5Reader object at 0x7da84cfdfee0> │       │
│ │             self = <engine.base_client.client.BaseClient object at 0x7da84cd653c0>     │       │
│ │   skip_if_exists = True                                                                │       │
│ │      skip_search = False                                                               │       │
│ │      skip_upload = False                                                               │       │
│ ╰────────────────────────────────────────────────────────────────────────────────────────╯       │
│                                                                                                  │
│ /root/vector-db-benchmark/engine/base_client/configure.py:22 in configure                        │
│                                                                                                  │
│   19 │                                                                                           │
│   20 │   def configure(self, dataset: Dataset) -> Optional[dict]:                                │
│   21 │   │   self.clean()                                                                        │
│ ❱ 22 │   │   return self.recreate(dataset, self.collection_params) or {}                         │
│   23 │                                                                                           │
│   24 │   def execution_params(self, distance, vector_size) -> dict:                              │
│   25 │   │   return {}                                                                           │
│                                                                                                  │
│ ╭──────────────────────────────────────── locals ─────────────────────────────────────────╮      │
│ │ dataset = <benchmark.dataset.Dataset object at 0x7da84cfdfc10>                          │      │
│ │    self = <engine.clients.qdrant.configure.QdrantConfigurator object at 0x7da84cd65a20> │      │
│ ╰─────────────────────────────────────────────────────────────────────────────────────────╯      │
│                                                                                                  │
│ /root/vector-db-benchmark/engine/clients/qdrant/configure.py:43 in recreate                      │
│                                                                                                  │
│   40 │   │   res = self.client.delete_collection(collection_name=QDRANT_COLLECTION_NAME)         │
│   41 │                                                                                           │
│   42 │   def recreate(self, dataset: Dataset, collection_params):                                │
│ ❱ 43 │   │   self.client.recreate_collection(                                                    │
│   44 │   │   │   collection_name=QDRANT_COLLECTION_NAME,                                         │
│   45 │   │   │   vectors_config=rest.VectorParams(                                               │
│   46 │   │   │   │   size=dataset.config.vector_size,                                            │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │ collection_params = {                                                                        │ │
│ │                     │   'timeout': 300,                                                      │ │
│ │                     │   'optimizers_config': {'memmap_threshold': 25000000},                 │ │
│ │                     │   'hnsw_config': {'m': 16, 'ef_construct': 512}                        │ │
│ │                     }                                                                        │ │
│ │           dataset = <benchmark.dataset.Dataset object at 0x7da84cfdfc10>                     │ │
│ │              self = <engine.clients.qdrant.configure.QdrantConfigurator object at            │ │
│ │                     0x7da84cd65a20>                                                          │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ /usr/local/lib/python3.10/dist-packages/qdrant_client/qdrant_client.py:1824 in                   │
│ recreate_collection                                                                              │
│                                                                                                  │
│   1821 │   │   │   stacklevel=2,                                                                 │
│   1822 │   │   )                                                                                 │
│   1823 │   │                                                                                     │
│ ❱ 1824 │   │   return self._client.recreate_collection(                                          │
│   1825 │   │   │   collection_name=collection_name,                                              │
│   1826 │   │   │   vectors_config=vectors_config,                                                │
│   1827 │   │   │   shard_number=shard_number,                                                    │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │          collection_name = 'benchmark'                                                       │ │
│ │              hnsw_config = {'m': 16, 'ef_construct': 512}                                    │ │
│ │                init_from = None                                                              │ │
│ │                   kwargs = {}                                                                │ │
│ │          on_disk_payload = None                                                              │ │
│ │        optimizers_config = {'memmap_threshold': 25000000}                                    │ │
│ │      quantization_config = None                                                              │ │
│ │       replication_factor = None                                                              │ │
│ │                     self = <qdrant_client.qdrant_client.QdrantClient object at               │ │
│ │                            0x7da84cd65b10>                                                   │ │
│ │             shard_number = None                                                              │ │
│ │          sharding_method = None                                                              │ │
│ │    sparse_vectors_config = None                                                              │ │
│ │                  timeout = 300                                                               │ │
│ │           vectors_config = VectorParams(                                                     │ │
│ │                            │   size=512,                                                     │ │
│ │                            │   distance=<Distance.COSINE: 'Cosine'>,                         │ │
│ │                            │   hnsw_config=None,                                             │ │
│ │                            │   quantization_config=None,                                     │ │
│ │                            │   on_disk=None,                                                 │ │
│ │                            │   datatype=None                                                 │ │
│ │                            )                                                                 │ │
│ │               wal_config = None                                                              │ │
│ │ write_consistency_factor = None                                                              │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ /usr/local/lib/python3.10/dist-packages/qdrant_client/qdrant_remote.py:2288 in                   │
│ recreate_collection                                                                              │
│                                                                                                  │
│   2285 │   │   sharding_method: Optional[types.ShardingMethod] = None,                           │
│   2286 │   │   **kwargs: Any,                                                                    │
│   2287 │   ) -> bool:                                                                            │
│ ❱ 2288 │   │   self.delete_collection(collection_name, timeout=timeout)                          │
│   2289 │   │                                                                                     │
│   2290 │   │   return self.create_collection(                                                    │
│   2291 │   │   │   collection_name=collection_name,                                              │
│                                                                                                  │
│ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
│ │          collection_name = 'benchmark'                                                       │ │
│ │              hnsw_config = {'m': 16, 'ef_construct': 512}                                    │ │
│ │                init_from = None                                                              │ │
│ │                   kwargs = {}                                                                │ │
│ │          on_disk_payload = None                                                              │ │
│ │        optimizers_config = {'memmap_threshold': 25000000}                                    │ │
│ │      quantization_config = None                                                              │ │
│ │       replication_factor = None                                                              │ │
│ │                     self = <qdrant_client.qdrant_remote.QdrantRemote object at               │ │
│ │                            0x7da84cd65210>                                                   │ │
│ │             shard_number = None                                                              │ │
│ │          sharding_method = None                                                              │ │
│ │    sparse_vectors_config = None                                                              │ │
│ │                  timeout = 300                                                               │ │
│ │           vectors_config = VectorParams(                                                     │ │
│ │                            │   size=512,                                                     │ │
│ │                            │   distance=<Distance.COSINE: 'Cosine'>,                         │ │
│ │                            │   hnsw_config=None,                                             │ │
│ │                            │   quantization_config=None,                                     │ │
│ │                            │   on_disk=None,                                                 │ │
│ │                            │   datatype=None                                                 │ │
│ │                            )                                                                 │ │
│ │               wal_config = None                                                              │ │
│ │ write_consistency_factor = None                                                              │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                                  │
│ /usr/local/lib/python3.10/dist-packages/qdrant_client/qdrant_remote.py:2156 in delete_collection │
│                                                                                                  │
│   2153 │   │   │   │   timeout=self._timeout,                                                    │
│   2154 │   │   │   ).result                                                                      │
│   2155 │   │                                                                                     │
│ ❱ 2156 │   │   result: Optional[bool] = self.http.collections_api.delete_collection(             │
│   2157 │   │   │   collection_name, timeout=timeout                                              │
│   2158 │   │   ).result                                                                          │
│   2159 │   │   assert result is not None, "Delete collection returned None"                      │
│                                                                                                  │
│ ╭─────────────────────────────────────── locals ────────────────────────────────────────╮        │
│ │ collection_name = 'benchmark'                                                         │        │
│ │          kwargs = {}                                                                  │        │
│ │            self = <qdrant_client.qdrant_remote.QdrantRemote object at 0x7da84cd65210> │        │
│ │         timeout = 300                                                                 │        │
│ ╰───────────────────────────────────────────────────────────────────────────────────────╯        │
│ 

Expected behaviour:

I expected the timeout to happen, but not for the DB server to report it as a 500 error.
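Until this is resolved, a client-side workaround could be to treat the failed delete as "possibly still in progress" and poll until the collection is gone before recreating it. The sketch below is not part of vector-db-benchmark: `delete_and_wait`, `poll_interval` and `max_wait` are hypothetical names, and it assumes the deletion eventually completes on the server after the 500, which I have not confirmed. It relies only on the `delete_collection` and `collection_exists` methods of `QdrantClient`.

```python
import time


def delete_and_wait(client, name, timeout=300, poll_interval=5.0,
                    max_wait=1800.0, sleep=time.sleep):
    """Request deletion, then poll until the collection no longer exists.

    `client` is any object exposing delete_collection() and
    collection_exists(), e.g. a qdrant_client.QdrantClient.
    Returns True once the collection is gone, False if max_wait is exceeded.
    """
    try:
        client.delete_collection(collection_name=name, timeout=timeout)
    except Exception as exc:
        # Broad on purpose for this sketch: the 500 above surfaces as
        # UnexpectedResponse, but the delete may still finish server-side.
        print(f"delete_collection failed, polling instead: {exc}")

    waited = 0.0
    while client.collection_exists(collection_name=name):
        if waited >= max_wait:
            return False
        sleep(poll_interval)
        waited += poll_interval
    return True
```

In the benchmark this would run in place of the implicit delete inside `recreate_collection`, followed by a plain `create_collection` call once the function returns True.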