milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: syncTimestamp Failed:context canceled #36120

Closed. zhoujiaqi1998 closed this issue 1 week ago.

zhoujiaqi1998 commented 1 week ago

Is there an existing issue for this?

Environment

- Milvus version: 2.2.16
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus 2.2.9
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Search requests frequently fail with the error "syncTimestamp Failed:context canceled". The mixcoord log shows "get shard leaders request received" entries. The proxy logs this warning:

[2024/09/09 08:18:33.492 +00:00] [WARN] [proxy/impl.go:3064] ["Search failed to enqueue"] [traceID=505e2aacba182537] [error="syncTimestamp Failed:context canceled"] [role=proxy] [db=default] [collection=FAQ_ATRUST] [partitions="[]"] [dsl=] [len(PlaceholderGroup)=4108] [OutputFields="[text,answer,knowledge_id,relative_text]"] [search_params="[{\"key\":\"topk\",\"value\":\"100\"},{\"key\":\"metric_type\",\"value\":\"IP\"},{\"key\":\"params\",\"value\":\"{\\"nprobe\\":128}\"},{\"key\":\"round_decimal\",\"value\":\"-1\"},{\"key\":\"offset\",\"value\":\"0\"},{\"key\":\"ignore_growing\",\"value\":\"False\"},{\"key\":\"anns_field\",\"value\":\"embeddings\"}]"] [travel_timestamp=0] [guarantee_timestamp=1]

Expected Behavior

Search successful.

Steps To Reproduce

No response

Milvus Log

mixcoord:

[2024/09/09 09:10:00.098 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073203187]
[2024/09/09 09:10:00.132 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073201504]
[2024/09/09 09:10:00.536 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073197118]
[2024/09/09 09:10:00.726 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073199110]
[2024/09/09 09:10:01.283 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073197118]
[2024/09/09 09:10:01.885 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073197118]
[2024/09/09 09:10:01.992 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073197602]
[2024/09/09 09:10:02.051 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073197400]
[2024/09/09 09:10:02.114 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073195945]
[2024/09/09 09:10:02.151 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073202435]
[2024/09/09 09:10:02.185 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073202373]
[2024/09/09 09:10:02.294 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073203187]
[2024/09/09 09:10:02.350 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073201504]
[2024/09/09 09:10:02.909 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073199110]
[2024/09/09 09:10:02.963 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073198834]
[2024/09/09 09:10:03.017 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073203187]
[2024/09/09 09:10:03.051 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073201504]
[2024/09/09 09:10:08.346 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073197602]
[2024/09/09 09:10:08.348 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073198173]
[2024/09/09 09:10:08.381 +00:00] [INFO] [querycoordv2/services.go:871] ["get shard leaders request received"] [msgID=0] [collectionID=447193161073196027]

Proxy:

[root@master-55378-0 log]# kubectl logs -n gpt-bak-milvus gpt-bak-milvus-cluster-proxy-77ccf5bb74-krltl | grep WARN
[2024/09/09 09:10:02.557 +00:00] [WARN] [proxy/impl.go:3064] ["Search failed to enqueue"] [traceID=6d9832a0f08cd584] [error="syncTimestamp Failed:context canceled"] [role=proxy] [db=default] [collection=FAQ_YUNWAF] [partitions="[]"] [dsl=] [len(PlaceholderGroup)=4108] [OutputFields="[text,answer,knowledge_id,relative_text]"] [search_params="[{\"key\":\"topk\",\"value\":\"100\"},{\"key\":\"metric_type\",\"value\":\"IP\"},{\"key\":\"params\",\"value\":\"{\\"nprobe\\":128}\"},{\"key\":\"round_decimal\",\"value\":\"-1\"},{\"key\":\"offset\",\"value\":\"0\"},{\"key\":\"ignore_growing\",\"value\":\"False\"},{\"key\":\"anns_field\",\"value\":\"embeddings\"}]"] [travel_timestamp=0] [guarantee_timestamp=1]
[2024/09/09 09:10:02.557 +00:00] [WARN] [proxy/impl.go:3064] ["Search failed to enqueue"] [traceID=2253ccebb28f9a2e] [error="syncTimestamp Failed:context canceled"] [role=proxy] [db=default] [collection=FAQ_WAF] [partitions="[]"] [dsl=] [len(PlaceholderGroup)=4108] [OutputFields="[text,answer,knowledge_id,relative_text]"] [search_params="[{\"key\":\"topk\",\"value\":\"100\"},{\"key\":\"metric_type\",\"value\":\"IP\"},{\"key\":\"params\",\"value\":\"{\\"nprobe\\":128}\"},{\"key\":\"round_decimal\",\"value\":\"-1\"},{\"key\":\"offset\",\"value\":\"0\"},{\"key\":\"ignore_growing\",\"value\":\"False\"},{\"key\":\"anns_field\",\"value\":\"embeddings\"}]"] [travel_timestamp=0] [guarantee_timestamp=1]
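For anyone triaging similar output, a small script can show which collections are hit by the failed searches. The following is a rough sketch, not a Milvus tool: the regular expressions assume the bracketed log layout seen in this issue, and the helper names `parse_entries` and `failures_by_collection` are made up for illustration.

```python
import re
from collections import Counter

# Start of one log entry: [timestamp] [LEVEL] [file:line] ["message"]
# (layout assumed from the log lines quoted in this issue)
ENTRY_RE = re.compile(
    r'\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2})\] '
    r'\[(?P<level>[A-Z]+)\] '
    r'\[(?P<source>[^\]]+)\] '
    r'\["(?P<message>[^"]*)"\]'
)

# A few key=value fields that are useful for triage.
FIELD_RE = re.compile(
    r'\[(?P<key>traceID|collectionID|collection|error)=(?P<value>"[^"]*"|[^\]]*)\]'
)


def parse_entries(text):
    """Split a log dump into entries, even when several share one line."""
    matches = list(ENTRY_RE.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        # The fields of an entry run until the next entry header (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entry = m.groupdict()
        for f in FIELD_RE.finditer(text[m.end():end]):
            entry.setdefault(f.group("key"), f.group("value").strip('"'))
        entries.append(entry)
    return entries


def failures_by_collection(text, needle="syncTimestamp Failed"):
    """Count WARN entries whose error contains `needle`, per collection name."""
    return Counter(
        e["collection"]
        for e in parse_entries(text)
        if e["level"] == "WARN" and needle in e.get("error", "") and "collection" in e
    )
```

Feeding it the output of the `kubectl logs ... | grep WARN` command would give a per-collection count of the "syncTimestamp Failed" warnings, which helps to tell whether the failures are tied to one collection or spread across all of them.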

Anything else?

No response

zhoujiaqi1998 commented 1 week ago

Logs of etcd, mixcoord, proxy, and querynode: milvus_log.tar.gz

zhoujiaqi1998 commented 1 week ago

@yanliang567 Please help me.

zhoujiaqi1998 commented 1 week ago

Searches recovered after I increased the timeout.
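The thread does not say which timeout was raised. If it was the client-side one, the change might look roughly like the sketch below. It assumes a pymilvus `Collection` object and uses the `timeout` argument of `Collection.search`; the search parameters are copied from the proxy log, while the function name `search_with_timeout` and the 30-second default are hypothetical.

```python
def search_with_timeout(collection, query_vectors, timeout_s=30.0):
    """Run the search seen in the proxy log with an explicit client timeout.

    `collection` is expected to be a pymilvus Collection (assumption);
    `timeout_s` is in seconds and the default here is only illustrative.
    """
    # Values taken from the search_params field of the proxy log.
    search_params = {"metric_type": "IP", "params": {"nprobe": 128}}
    return collection.search(
        data=query_vectors,
        anns_field="embeddings",
        param=search_params,
        limit=100,  # topk=100 in the log
        output_fields=["text", "answer", "knowledge_id", "relative_text"],
        timeout=timeout_s,  # a deadline that is too short cancels the request
    )
```

If the client deadline expires while the proxy is still waiting on the timestamp sync, the request context is canceled, which would be consistent with the "context canceled" error above. That link is an inference, not something confirmed in this issue.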
