qdrant / vector-db-benchmark

Framework for benchmarking vector search engines
https://qdrant.tech/benchmarks/
Apache License 2.0

OpenSearch search run should handle rate-limiting / 429 HTTP errors #142

Open filipecosta90 opened 2 months ago

filipecosta90 commented 2 months ago

Ideally the benchmark should handle 429 (Too Many Requests) responses from OpenSearch and retry or fall back, rather than aborting the experiment.

Sample error at query time:

6038it [01:17, 77.75it/s]
Experiment opensearch-m-16-ef-128 - glove-100-angular interrupted
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/root/vector-db-benchmark/engine/base_client/search.py", line 46, in _search_one
    search_res = cls.search_one(query, top)
  File "/root/vector-db-benchmark/engine/clients/opensearch/search.py", line 52, in search_one
    res = cls.client.search(
  File "/usr/local/lib/python3.8/dist-packages/opensearchpy/client/utils.py", line 181, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/opensearchpy/client/__init__.py", line 1742, in search
    return self.transport.perform_request(
  File "/usr/local/lib/python3.8/dist-packages/opensearchpy/transport.py", line 448, in perform_request
    raise e
  File "/usr/local/lib/python3.8/dist-packages/opensearchpy/transport.py", line 409, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/usr/local/lib/python3.8/dist-packages/opensearchpy/connection/http_urllib3.py", line 290, in perform_request
    self._raise_error(
  File "/usr/local/lib/python3.8/dist-packages/opensearchpy/connection/base.py", line 316, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.TransportError: TransportError(429, '429 Too Many Requests /bench/_search')
"""

Sample error at ingestion time:

opensearchpy.exceptions.TransportError: TransportError(429, '429 Too Many Requests /bench/_bulk')
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/root/vector-db-benchmark/engine/base_client/upload.py", line 89, in _upload_batch
    cls.upload_batch(batch)
  File "/root/vector-db-benchmark/engine/clients/opensearch/upload.py", line 43, in upload_batch
    cls.client.bulk(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/client/utils.py", line 181, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/client/__init__.py", line 462, in bulk
    return self.transport.perform_request(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/transport.py", line 448, in perform_request
    raise e
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/transport.py", line 409, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/connection/http_urllib3.py", line 290, in perform_request
    self._raise_error(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/connection/base.py", line 316, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
opensearchpy.exceptions.TransportError: TransportError(429, '429 Too Many Requests /bench/_bulk')
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/vector-db-benchmark/run.py", line 91, in <module>
    app()
  File "/root/vector-db-benchmark/run.py", line 86, in run
    raise e
  File "/root/vector-db-benchmark/run.py", line 59, in run
    client.run_experiment(
  File "/root/vector-db-benchmark/engine/base_client/client.py", line 108, in run_experiment
    upload_stats = self.uploader.upload(
  File "/root/vector-db-benchmark/engine/base_client/upload.py", line 56, in upload
    latencies = list(
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 873, in next
    raise value
opensearchpy.exceptions.TransportError: TransportError(429, '429 Too Many Requests /bench/_bulk')
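
For illustration, the retry handling requested in this issue could look roughly like the sketch below. It is a minimal, stdlib-only example and not the project's actual fix. `retry_on_429`, its parameters, and the `TooManyRequests` class are hypothetical names introduced here; `TooManyRequests` stands in for the `TransportError` seen in the tracebacks, on the assumption that the real exception exposes the HTTP status as `status_code`.

```python
import random
import time
from functools import wraps


class TooManyRequests(Exception):
    """Hypothetical stand-in for a client error that carries HTTP status 429."""
    status_code = 429


def retry_on_429(max_tries=5, base=1.0, factor=2.0, sleep=time.sleep):
    """Retry the wrapped call while it raises an error whose status_code is 429."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            tries = 0
            while True:
                tries += 1
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Re-raise anything that is not a rate-limit response,
                    # and give up once the retry budget is spent.
                    if getattr(exc, "status_code", None) != 429 or tries >= max_tries:
                        raise
                    # Full jitter: wait a random time up to base * factor**(tries - 1).
                    delay = random.uniform(0, base * factor ** (tries - 1))
                    print(f"Backing off for {delay:.2f} seconds "
                          f"after {tries} tries due to {exc!r}")
                    sleep(delay)
        return wrapper
    return decorator


# Usage: a fake search call that is rate-limited twice before succeeding.
attempts = {"n": 0}


@retry_on_429(max_tries=4, base=0.01)
def fake_search():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TooManyRequests("429 Too Many Requests /bench/_search")
    return {"hits": []}


result = fake_search()
```

The same wrapper could in principle be applied to both the search call and the bulk upload call, since both fail with the same 429 error. Randomizing the delay keeps parallel workers from retrying in lockstep.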

filipecosta90 commented 2 months ago

#139 addresses this.

Here's sample output showing the backoff working and allowing the benchmark to finish:

(...)
7681it [02:57, 70.40it/s]Backing off OpenSearch query for 0.6122897890276054 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.8117441326245718 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.06568945534606019 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.13053787177251708 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 1.8277456945680721 seconds after 3 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.5903717683351071 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.47413666804382415 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 1.6831794552377137 seconds after 3 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.5819287032607211 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 1.6308955940439915 seconds after 3 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.638039971022516 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 1.9676344004968502 seconds after 3 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.9125294977692109 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.8954642193746257 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.5515543395984353 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.6622614886104707 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.5529138754482195 seconds after 3 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 2.0092091141694057 seconds after 3 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.45178998485770316 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 1.508148104467346 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.6139734474785659 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.6482759247107287 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 1.8431078878250393 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.3298877080829401 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
10000it [03:51, 43.13it/s]
Experiment stage: Done
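
Since retries add waiting time to the run, it can be useful to quantify how much backoff a given run incurred. The sketch below parses log lines in the format shown above and totals the reported delays. `summarize_backoff` and `BACKOFF_RE` are hypothetical names introduced here, and the sample text is just the first three backoff lines from the output above.

```python
import re
from collections import Counter

# Matches the delay and the try count in lines such as
# "Backing off OpenSearch query for 0.61 seconds after 1 tries due to ..."
BACKOFF_RE = re.compile(
    r"Backing off OpenSearch query for ([0-9.]+) seconds after (\d+) tries"
)


def summarize_backoff(log_text):
    """Return (number of backoff events, total seconds of delay, events per try count)."""
    matches = BACKOFF_RE.findall(log_text)
    total_delay = sum(float(seconds) for seconds, _ in matches)
    per_try = Counter(int(tries) for _, tries in matches)
    return len(matches), total_delay, per_try


sample = """\
7681it [02:57, 70.40it/s]Backing off OpenSearch query for 0.6122897890276054 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.8117441326245718 seconds after 2 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
Backing off OpenSearch query for 0.06568945534606019 seconds after 1 tries due to TransportError(429, '429 Too Many Requests /bench/_search')
"""

count, total_delay, per_try = summarize_backoff(sample)
print(f"{count} backoff events, {total_delay:.3f} s of delay in total")
```

Note that the total is summed across all parallel workers, so it is an upper bound on the wall-clock time lost and not a direct measurement of it.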