qdrant / qdrant-client

Python client for Qdrant vector search engine
https://qdrant.tech
Apache License 2.0

Huge CPU usage of qdrant client when deserializing payload #714

Open GDegrove opened 1 month ago

GDegrove commented 1 month ago

Title: High CPU Usage with Qdrant Async Client for gRPC and REST Endpoints

Description:

We are seeing significant CPU usage spikes when we enable the Qdrant async client for both gRPC and REST endpoints. Profiling with Parca shows that a considerable portion of CPU time is spent deserializing payloads.

This issue is particularly problematic for applications like recommendation engines that rely on Qdrant as a vector database, as it results in a substantial increase in CPU requirements for our application.

Steps to Reproduce:

  1. Enable the Qdrant async client for both gRPC and REST endpoints.
  2. Run the application and perform standard operations.
  3. Monitor CPU usage and analyze with a profiling tool (e.g., Parca).
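
For readers without Parca, step 3 can be approximated locally with the standard-library `cProfile` module. This is only a rough sketch: `make_raw_payloads` and `decode_payloads` are hypothetical stand-ins (plain `json.loads` on made-up payloads), not qdrant-client code, and the payload shape is an assumption rather than anything from our application.

```python
import cProfile
import io
import json
import pstats


def make_raw_payloads(n: int = 2000) -> list[str]:
    """Build n JSON-encoded payloads (hypothetical shape, not taken from the issue)."""
    return [
        json.dumps({"id": i, "title": f"item-{i}", "tags": ["a", "b", "c"]})
        for i in range(n)
    ]


def decode_payloads(raw: list[str]) -> list[dict]:
    """Stand-in for a client-side payload deserialization step."""
    return [json.loads(item) for item in raw]


def profile_decoding(raw: list[str]) -> tuple[list[dict], str]:
    """Run decode_payloads under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    decoded = profiler.runcall(decode_payloads, raw)
    buffer = io.StringIO()
    # Sort by cumulative time so the deserialization entry point is listed first.
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return decoded, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_decoding(make_raw_payloads())
    print(report)
```

In a real reproduction, the function passed to `runcall` would be the code path that issues the Qdrant requests, so that the report shows how much time falls under deserialization.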

Expected Behavior:

Actual Behavior:

Environment:

Proposed Solution:

To mitigate the high CPU usage, we propose utilizing orjson for JSON serialization and deserialization. orjson is known for its performance benefits compared to the standard JSON library in Python. By replacing the current deserialization process with orjson, we anticipate a reduction in CPU overhead.
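
A minimal sketch of the comparison we have in mind, assuming payloads arrive as JSON bytes. The sample payload and the helper names (`decode_stdlib`, `decode_orjson`, `compare`) are hypothetical and not part of qdrant-client; `orjson` is a third-party package, so the sketch falls back to the standard library when it is not installed. Timings depend heavily on payload shape and hardware, so no figures are claimed here.

```python
import json
import timeit

try:
    import orjson  # third-party, optional: pip install orjson
except ImportError:
    orjson = None

# Hypothetical payload, only for illustration.
SAMPLE = {"id": 1, "title": "example", "tags": ["a", "b"], "nested": {"price": 9.5}}


def decode_stdlib(raw: bytes) -> dict:
    """Decode a JSON payload with the standard library."""
    return json.loads(raw)


def decode_orjson(raw: bytes) -> dict:
    """Decode with orjson when available, otherwise fall back to json."""
    if orjson is None:
        return json.loads(raw)
    return orjson.loads(raw)


def compare(raw: bytes, repeats: int = 1000) -> dict[str, float]:
    """Return total seconds spent decoding `raw` `repeats` times per decoder."""
    return {
        "json": timeit.timeit(lambda: decode_stdlib(raw), number=repeats),
        "orjson": timeit.timeit(lambda: decode_orjson(raw), number=repeats),
    }


if __name__ == "__main__":
    payload = json.dumps(SAMPLE).encode()
    for name, seconds in compare(payload).items():
        print(f"{name}: {seconds:.4f}s")
```

Both decoders should return the same Python objects for the same input, so a swap like this would not be expected to change what callers see.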

Additional Context: This CPU overhead is posing a challenge in deploying Qdrant in resource-constrained environments. Any guidance or fixes to address this issue would be greatly appreciated.

joein commented 1 month ago

hi @GDegrove

just to be sure, could you also provide the version of pydantic you're using?

GDegrove commented 1 month ago

> hi @GDegrove
>
> just to be sure, could you also provide the version of pydantic you're using?

We are using pydantic 2; to be exact, version 2.8.2.

joein commented 1 month ago

We'll try to look into it, thanks for pointing it out