This pull request enhances the Qdrant client configuration to address issues observed with large vector counts (> 40M). The key changes include:
Max Optimization Threads:
Added the ability to specify the max_optimization_threads parameter within the post_upload() method, enabling better control over resource utilization during optimization tasks. Looking at the actual usage of Qdrant Cloud instances during ingestion and indexing, I could see that we were not using all of the deployment's vCPUs. Adding the ability to control the number of optimization threads should give us better utilization. By default it follows Qdrant's optimizer config base model ("If null - have no limit and choose dynamically to saturate CPU").
Confirmation that this change is effective, using a 1M vector sample dataset (upload time is essentially unchanged, while total import time drops from about 76.9 to about 36.4 with no thread limit):
"None" (no limit; choose dynamically to saturate CPU):
1000000it [00:26, 38223.74it/s]
Upload time: 26.269824364921078
Total import time: 36.38025348598603
QDRANT_MAX_OPTIMIZATION_THREADS=1 (the current value on master):
1000000it [00:26, 37504.10it/s]
Upload time: 26.770209033973515
Total import time: 76.93960217398126
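For reviewers, here is a minimal sketch of how the QDRANT_MAX_OPTIMIZATION_THREADS setting could be turned into the optional max_optimization_threads value. It is an illustration under stated assumptions, not the code in this PR: the helper names parse_max_optimization_threads and optimizer_settings are hypothetical, and the accepted spellings of "no limit" ("None", "null", empty) are an assumption.

```python
import os
from typing import Mapping, Optional


def parse_max_optimization_threads(raw: Optional[str]) -> Optional[int]:
    """Turn an environment value into an optional thread limit.

    Unset, empty, "None", or "null" is treated as "no limit", which maps to
    the null default quoted above. (Hypothetical helper, not from the PR.)
    """
    if raw is None:
        return None
    text = raw.strip()
    if text == "" or text.lower() in ("none", "null"):
        return None
    value = int(text)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError("max_optimization_threads must not be negative")
    return value


def optimizer_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build the optimizer fields a post_upload() step could send (assumed shape)."""
    raw = env.get("QDRANT_MAX_OPTIMIZATION_THREADS")
    return {"max_optimization_threads": parse_max_optimization_threads(raw)}
```

With qdrant-client, a dictionary like this would typically be passed on through the optimizers config when updating the collection. The exact call site in post_upload() is not shown here and may differ.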
Exponential Backoff:
Implemented an exponential backoff mechanism to handle retries, improving robustness and error handling during the recreate_collection operation (Fixes #162).
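A minimal sketch of the exponential backoff idea, for illustration only. The function name retry_with_backoff, its parameters, and the default values (5 attempts, 1.0 base delay, 30.0 cap) are assumptions and may not match what this PR implements.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), doubling the wait after each failure.

    The delay before retry n (0-based) is min(base_delay * 2**n, max_delay).
    The last failure is re-raised once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(min(base_delay * (2 ** attempt), max_delay))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

Usage would look roughly like retry_with_backoff(lambda: client.recreate_collection(...)), where client is an already configured Qdrant client. The sleep parameter is injectable so the schedule can be checked without actually waiting.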