Open kkpssr opened 1 month ago
/assign @kkpssr /unassign
- Search performance can be affected by a) competition for the underlying resources; b) the growing amount of data from insert/bulk insert; c) whether new segments have been compacted into larger ones; etc.
- The performance FAQ is based on the SIFT dataset, whereas you are running with binary vectors, so the numbers are not directly comparable.
To address your performance issue, could you please provide:
- the full Milvus logs and metric screenshots
- how many CPUs are you using for the query nodes? Are they running exclusively?
- which SDK are you using, and which version?
/assign @kkpssr /unassign
@kkpssr the log you attached shows index tasks still pending, which means some segments have not been indexed yet. For those segments, Milvus resorts to brute-force search on the raw data, which drastically increases query time.
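To see from the client side whether rows are still waiting for an index, something like the following could work. It is a minimal sketch that assumes pymilvus 2.4, an already open connection, and that `utility.index_building_progress` returns a dict with `total_rows` and `indexed_rows`. The helper name `pending_index_rows` and the `progress_fn` parameter are hypothetical, added only so the logic can be tried without a running cluster.

```python
def pending_index_rows(collection_name, index_name="", progress_fn=None):
    """Return how many rows of a collection are not yet covered by an index.

    progress_fn is a hypothetical injection point for testing. By default the
    pymilvus helper is used, which needs an active connection
    (connections.connect(...)) before it is called.
    """
    if progress_fn is None:
        # Imported lazily so the function can be exercised without pymilvus.
        from pymilvus import utility
        progress_fn = utility.index_building_progress

    progress = progress_fn(collection_name, index_name)
    # Assumption: the returned dict carries total_rows and indexed_rows.
    return progress["total_rows"] - progress["indexed_rows"]
```

A non-zero result right after a bulk insert would be consistent with the slow, brute-force searches described above.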
So the bulk-insert operation always returns success immediately, even though the index build has not finished?
@kkpssr you can query the bulk insert task state. In Milvus 2.4, if the task state is Completed, the data of that task has been indexed. Please check https://milvus.io/api-reference/pymilvus/v2.4.x/ORM/utility/get_bulk_insert_state.md
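A minimal polling sketch for that check, assuming pymilvus 2.4 and an open connection. The wrapper `wait_for_bulk_insert`, its `get_state` parameter, and the timeout values are hypothetical. The state names compared against ("Completed", "Failed...") are what I would expect from `BulkInsertState.state_name` and should be verified against your SDK version.

```python
import time


def wait_for_bulk_insert(task_id, get_state=None, timeout_s=600.0, poll_s=2.0):
    """Poll a bulk insert task until it completes, fails, or times out.

    Returns True once the task reports Completed, False on timeout.
    get_state is a hypothetical injection point; by default it is
    pymilvus' utility.get_bulk_insert_state.
    """
    if get_state is None:
        # Imported lazily so the loop can be tested without pymilvus.
        from pymilvus import utility
        get_state = utility.get_bulk_insert_state

    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state(task_id)
        name = state.state_name  # assumed values: "Pending", "Completed", ...
        if name == "Completed":
            return True
        if name.startswith("Failed"):
            raise RuntimeError(f"bulk insert {task_id} failed: {state.failed_reason}")
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Running searches only after this returns True avoids measuring latency while segments are still unindexed.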
And how long does it normally take to build a BIN_IVF index for 50M 256-dim (32-byte) vectors?
It depends on how many index resources you have.
Each index build should take no more than 10 minutes.
you need to check
Birdwatcher can help you get that information.
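As a rough back-of-the-envelope sketch of that reasoning: the raw size follows directly from the vector width, and an upper bound on total build time follows from the segment count and how many builds run in parallel. The function names, the segment count of 8, and the parallelism of 2 below are hypothetical; the real numbers have to come from your cluster (for example via Birdwatcher).

```python
import math


def raw_binary_bytes(num_vectors, dim):
    """Raw size of binary vectors: each dimension is one bit."""
    return num_vectors * (dim // 8)


def build_time_upper_bound_minutes(num_segments, parallel_builds, minutes_per_build=10):
    """Upper bound on total build time if each build takes at most
    minutes_per_build and parallel_builds segments are built at a time."""
    waves = math.ceil(num_segments / parallel_builds)
    return waves * minutes_per_build


# 50M vectors of 256 dims are 32 bytes each, i.e. 1.6 GB of raw data.
print(raw_binary_bytes(50_000_000, 256))
# Hypothetical cluster: 8 segments to index, 2 builds in parallel.
print(build_time_upper_bound_minutes(8, 2))
```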
todo: add SIMD speedup for binary indices
I thought Faiss already supported SIMD for binary indices?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
Is there an existing issue for this?
Environment
Current Behavior
Search performance is very unstable: a search sometimes takes 800 ms or more and sometimes 300 ms on 2M vectors, which is far from the example in the FAQ, which shows only 200 ms with nq=1000 on 50M vectors. My pipeline first searches 400-500 features and then bulk-inserts them into the collections.
Expected Behavior
Reach the performance shown in the FAQ (200 ms with nq=1000 on 50M vectors).
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response