Open ThreadDao opened 2 hours ago
Currently, a stats task runs for each segment before its index is built. Because there are so many segments, the monitoring system shows the number of unissued index tasks slowly increasing, but the real cause is that the stats tasks are completing slowly. They are slow because the index node takes too long (more than 10 seconds) to connect to MinIO/S3, so datacoord concludes that the task assignment has failed. By that point, however, the index node has already cached the task. When datacoord reassigns the task, it does not first clear the cached task on the index node, so the index node reports that the task already exists. This blocks the task's progress, and it also explains why tasks execute quickly after the index node is restarted. Pull request #36371 will fix this.
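To make the failure mode concrete, here is a minimal Python model of the sequence described above. It is only a sketch under stated assumptions: Milvus itself is written in Go, and every name here (`IndexNode`, `create_task`, `drop_task`, `assign`, `TaskAlreadyExists`) is hypothetical rather than taken from the Milvus code base. The slow MinIO/S3 connection is simulated with a counter instead of real I/O, and dropping the cached task before reassigning is just one possible remedy, not necessarily what pull request #36371 does.

```python
class TaskAlreadyExists(Exception):
    """Raised when the (hypothetical) index node already holds the task."""


class IndexNode:
    """Toy index node: caches a task first, then connects to object storage."""

    def __init__(self, slow_connects=1, connect_seconds=12.0):
        self.tasks = set()                      # cached task IDs
        self.slow_connects = slow_connects      # how many connects will be slow
        self.connect_seconds = connect_seconds  # simulated MinIO/S3 connect time

    def create_task(self, task_id, timeout_seconds):
        if task_id in self.tasks:
            raise TaskAlreadyExists(task_id)
        # The task is cached before the storage connection is established.
        self.tasks.add(task_id)
        if self.slow_connects > 0:
            self.slow_connects -= 1
            if self.connect_seconds > timeout_seconds:
                # The caller gives up, but the cache entry above remains.
                raise TimeoutError(f"assigning {task_id} timed out")

    def drop_task(self, task_id):
        self.tasks.discard(task_id)


def assign(node, task_id, timeout_seconds, drop_before_retry, max_attempts=3):
    """Toy datacoord loop: return the attempt that succeeded, or None."""
    for attempt in range(1, max_attempts + 1):
        if attempt > 1 and drop_before_retry:
            # Assumed remedy: clear the stale cache entry before reassigning.
            node.drop_task(task_id)
        try:
            node.create_task(task_id, timeout_seconds)
            return attempt
        except (TimeoutError, TaskAlreadyExists):
            continue
    return None


if __name__ == "__main__":
    stuck = assign(IndexNode(), "stats-1", 10.0, drop_before_retry=False)
    fixed = assign(IndexNode(), "stats-1", 10.0, drop_before_retry=True)
    print("without drop:", stuck)
    print("with drop:", fixed)
```

In this model, reassignment without the drop never succeeds, because every retry hits the stale cache entry, while dropping the entry first lets the second attempt through. Constructing a fresh `IndexNode()` plays the role of restarting the node, which is consistent with tasks running quickly after a restart.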
Is there an existing issue for this?
Environment
Current Behavior
test steps
Expected Behavior
No response
Steps To Reproduce
Milvus Log
Anything else?
No response