qdrant / qdrant-client

Python client for Qdrant vector search engine
https://qdrant.tech
Apache License 2.0

qdrant_client.http.exceptions.ResponseHandlingException: [Errno 9] Bad file descriptor #614

Open GuintherKovalski opened 5 months ago

GuintherKovalski commented 5 months ago

I'm making parallel requests (~50/sec) to a Qdrant server, against a collection with the following config:

{
  "params": {
    "vectors": {
      "size": 2048,
      "distance": "Cosine",
      "on_disk": true
    },
    "shard_number": 1,
    "replication_factor": 1,
    "write_consistency_factor": 1,
    "on_disk_payload": true
  },
  "hnsw_config": {
    "m": 16,
    "ef_construct": 100,
    "full_scan_threshold": 10000,
    "max_indexing_threads": 0,
    "on_disk": false
  },
  "optimizer_config": {
    "deleted_threshold": 0.2,
    "vacuum_min_vector_number": 1000,
    "default_segment_number": 0,
    "max_segment_size": null,
    "memmap_threshold": null,
    "indexing_threshold": 20000,
    "flush_interval_sec": 5,
    "max_optimization_threads": 1
  },
  "wal_config": {
    "wal_capacity_mb": 32,
    "wal_segments_ahead": 0
  },
  "quantization_config": null
}

Responses are heavily delayed (4-40 seconds), and I sometimes get the following error:

search_result = self.client.search(
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/qdrant_client.py", line 324, in search
    return self._client.search(
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/qdrant_remote.py", line 454, in search
    search_result = self.http.points_api.search_points(
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/http/api/points_api.py", line 1197, in search_points
    return self._build_for_search_points(
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/http/api/points_api.py", line 536, in _build_for_search_points
    return self.api_client.request(
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/http/api_client.py", line 74, in request
    return self.send(request, type_)
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/http/api_client.py", line 91, in send
    response = self.middleware(request, self.send_inner)
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/http/api_client.py", line 200, in __call__
    return call_next(request)
  File "/usr/local/lib/python3.8/dist-packages/qdrant_client/http/api_client.py", line 103, in send_inner
    raise ResponseHandlingException(e)
qdrant_client.http.exceptions.ResponseHandlingException: [Errno 9] Bad file descriptor

I suspect the problem is caused by "on_disk_payload": true and "on_disk": true, but I'm not sure. If it is, is there any way to change these settings without having to recreate and repopulate the collection?
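To see at a glance which storage flags are enabled in a config like the one above, the JSON can be walked for every key containing "on_disk". This is only a minimal sketch: `find_on_disk_flags` is a hypothetical helper, not part of qdrant-client, and the config below is a trimmed copy of the one in the report.

```python
import json


def find_on_disk_flags(config, prefix=""):
    """Recursively collect every key containing 'on_disk', keyed by dotted path."""
    found = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            found.update(find_on_disk_flags(value, path))
        elif "on_disk" in key:
            found[path] = value
    return found


# Trimmed copy of the collection config from the report (relevant fields only).
config_json = """
{
  "params": {
    "vectors": {"size": 2048, "distance": "Cosine", "on_disk": true},
    "on_disk_payload": true
  },
  "hnsw_config": {"m": 16, "on_disk": false}
}
"""

flags = find_on_disk_flags(json.loads(config_json))
for path, enabled in flags.items():
    print(f"{path}: {enabled}")
```

For this config, the vectors and the payload are both stored on disk, while the HNSW index is kept in memory.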

GuintherKovalski commented 5 months ago

It turns out that changing the config to "on_disk_payload": false and "on_disk": false did, unsurprisingly, solve the delay problem. Using the Python client:

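from qdrant_client import QdrantClient, models

# Placeholders (not from the original setup): use your own URL, API key, and collection name.
url = "http://localhost:6333"
api_key = None
collection_name = "collection_name"

client = QdrantClient(url=url, api_key=api_key)

# Keep the payload in RAM instead of on disk.
client.update_collection(
    collection_name=collection_name,
    collection_params=models.CollectionParamsDiff(on_disk_payload=False),
)

# Keep the vectors in RAM; "" addresses the default (unnamed) vector.
client.update_collection(
    collection_name=collection_name,
    vectors_config={"": models.VectorParamsDiff(on_disk=False)},
)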

In any case, I don't think using the "on_disk" configuration should lead to errors, so I'm leaving the issue open.
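To check whether a config change really removes the delay under parallel load, per-request latency can be timed from a thread pool. This is a rough sketch under stated assumptions: `measure_latencies` and `fake_search` are hypothetical names, and `fake_search` is a stand-in that only sleeps, so the snippet runs without a server.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latencies(search_fn, queries, max_workers=8):
    """Call search_fn on each query from a thread pool; return per-call latency in seconds."""

    def timed_call(query):
        start = time.perf_counter()
        search_fn(query)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(timed_call, queries))


# Stand-in for a real search call (hypothetical). Against a real server it could be
# replaced with something like:
#   lambda v: client.search(collection_name=collection_name, query_vector=v, limit=10)
def fake_search(query):
    time.sleep(0.01)


latencies = measure_latencies(fake_search, queries=range(20))
print(f"max: {max(latencies):.3f}s, mean: {sum(latencies) / len(latencies):.3f}s")
```

Running it once before and once after the update gives a direct comparison. The thread pool only approximates the ~50 requests/sec load described in the report.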