Closed: zhuwenxing closed this issue 1 year ago
@congqixia could you please take a look first? Thanks.
This also reproduces in 2.2.0-20230117-1a819f2d:
[2023-01-17T13:01:02.428Z] <name>: deploy_test_index_type_HNSW_is_compacted_not_compacted_segment_status_only_growing_is_string_indexed_not_string_indexed_replica_number_2_is_deleted_is_deleted_data_size_3000
[2023-01-17T13:01:02.428Z] <partitions>: [{"name": "_default", "collection_name": "deploy_test_index_type_HNSW_is_com...... (api_request.py:31)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:54:49 - DEBUG - ci_test]: (api_request) : [Collection.flush] args: [], kwargs: {'timeout': 120} (api_request.py:56)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:54:53 - DEBUG - ci_test]: (api_response) : None (api_request.py:31)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:54:53 - INFO - ci_test]: inserted 3000 data into collection deploy_test_index_type_HNSW_is_compacted_not_compacted_segment_status_only_growing_is_string_indexed_not_string_indexed_replica_number_2_is_deleted_is_deleted_data_size_3000 (common_func.py:740)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:54:53 - DEBUG - ci_test]: (api_request) : [Collection.insert] args: [ int64 float varchar float_vector
[2023-01-17T13:01:02.428Z] 0 0 0.0 0 [0.06622585469149442, 0.05149480071923069, 0.0...
[2023-01-17T13:01:02.428Z] 1 1 1.0 1 [0.009027094743335583, 0.14389109021241628, 0....
[2023-01-17T13:01:02.428Z] 2 2 2.0 2 [0.003013331986805464, 0.15043796......, kwargs: {'timeout': 120} (api_request.py:56)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:54:53 - DEBUG - ci_test]: (api_response) : (insert count: 3000, delete count: 0, upsert count: 0, timestamp: 438818594807611393, success count: 3000, err count: 0) (api_request.py:31)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:55:01 - INFO - ci_test]: index info: [] (test_action_second_deployment.py:58)
[2023-01-17T13:01:02.428Z] [2023-01-17 12:55:09 - ERROR - pymilvus.decorators]: RPC error: [query], <MilvusException: (code=1, message=fail to query on all shard leaders, err=fail to Query, QueryNode ID = 18, reason=Query 19 failed, reason err err: rpc error: code = Canceled desc = context canceled
[2023-01-17T13:01:02.428Z] , /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace
[2023-01-17T13:01:02.429Z] /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:277 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call
[2023-01-17T13:01:02.429Z] /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:268 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Query
[2023-01-17T13:01:02.429Z] /go/src/github.com/milvus-io/milvus/internal/querynode/shard_cluster.go:1076 github.com/milvus-io/milvus/internal/querynode.(*ShardCluster).Query.func2
[2023-01-17T13:01:02.429Z] /usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit
[2023-01-17T13:01:02.429Z] )>, <Time:{'RPC start': '2023-01-17 12:55:08.788101', 'RPC error': '2023-01-17 12:55:09.117405'}> (decorators.py:108)
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_kafka_for_release_cron/detail/deploy_test_kafka_for_release_cron/228/pipeline
log:
artifacts-kafka-cluster-reinstall-228-server-second-deployment-logs.tar.gz
artifacts-kafka-cluster-reinstall-228-server-first-deployment-logs.tar.gz
artifacts-kafka-cluster-reinstall-228-pytest-logs.tar.gz
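For triage, it can help to know how quickly the query failed. The sketch below computes the elapsed time between the 'RPC start' and 'RPC error' timestamps reported in the pymilvus error line above. The helper name `rpc_elapsed_seconds` is hypothetical and not part of pymilvus; only the two timestamp strings are taken from the log.

```python
from datetime import datetime

# Timestamps copied from the <Time:{...}> field of the pymilvus error line.
RPC_START = "2023-01-17 12:55:08.788101"
RPC_ERROR = "2023-01-17 12:55:09.117405"

# Format used by the timestamps in that field (microsecond precision).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def rpc_elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    started = datetime.strptime(start, TIME_FORMAT)
    ended = datetime.strptime(end, TIME_FORMAT)
    return (ended - started).total_seconds()


if __name__ == "__main__":
    # Prints: 0.329304 s
    print(f"{rpc_elapsed_seconds(RPC_START, RPC_ERROR):.6f} s")
```

The query fails about 0.33 s after it starts. That is far shorter than the 120 s timeout passed to the flush and insert calls in the same log, which hints that the cancellation may not be a client-side timeout, although the log alone does not establish this.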
The error is raised by the query call:
[2023-01-17T13:01:02.427Z]     @retry_on_rpc_failure()
[2023-01-17T13:01:02.427Z]     def query(self, collection_name, expr, output_fields=None, partition_names=None, timeout=None, **kwargs):
[2023-01-17T13:01:02.427Z]         if output_fields is not None and not isinstance(output_fields, (list,)):
[2023-01-17T13:01:02.427Z]             raise ParamError(message="Invalid query format. 'output_fields' must be a list")
[2023-01-17T13:01:02.427Z]         collection_schema = kwargs.get("schema", None)
[2023-01-17T13:01:02.427Z]         if not collection_schema:
[2023-01-17T13:01:02.427Z]             collection_schema = self.describe_collection(collection_name, timeout)
[2023-01-17T13:01:02.427Z]         consistency_level = collection_schema["consistency_level"]
[2023-01-17T13:01:02.427Z]         # overwrite the consistency level defined when user created the collection
[2023-01-17T13:01:02.427Z]         consistency_level = get_consistency_level(kwargs.get("consistency_level", consistency_level))
[2023-01-17T13:01:02.427Z]
[2023-01-17T13:01:02.427Z]         ts_utils.construct_guarantee_ts(consistency_level, collection_name, kwargs)
[2023-01-17T13:01:02.427Z]         request = Prepare.query_request(collection_name, expr, output_fields, partition_names, **kwargs)
[2023-01-17T13:01:02.427Z]
[2023-01-17T13:01:02.427Z]         future = self._stub.Query.future(request, timeout=timeout)
[2023-01-17T13:01:02.427Z]         response = future.result()
[2023-01-17T13:01:02.427Z]         if response.status.error_code == Status.EMPTY_COLLECTION:
[2023-01-17T13:01:02.427Z]             return []
[2023-01-17T13:01:02.427Z]         if response.status.error_code != Status.SUCCESS:
[2023-01-17T13:01:02.427Z] >           raise MilvusException(response.status.error_code, response.status.reason)
[2023-01-17T13:01:02.427Z] E           pymilvus.exceptions.MilvusException: <MilvusException: (code=1, message=fail to query on all shard leaders, err=fail to Query, QueryNode ID = 18, reason=Query 19 failed, reason err err: rpc error: code = Canceled desc = context canceled
[2023-01-17T13:01:02.427Z] E           , /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace
[2023-01-17T13:01:02.427Z] E           /go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:277 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call
[2023-01-17T13:01:02.427Z] E           /go/src/github.com/milvus-io/milvus/internal/distributed/querynode/client/client.go:268 github.com/milvus-io/milvus/internal/distributed/querynode/client.(*Client).Query
[2023-01-17T13:01:02.427Z] E           /go/src/github.com/milvus-io/milvus/internal/querynode/shard_cluster.go:1076 github.com/milvus-io/milvus/internal/querynode.(*ShardCluster).Query.func2
[2023-01-17T13:01:02.427Z] E           /usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit
[2023-01-17T13:01:02.427Z] E           )>
[2023-01-17T13:01:02.427Z]
[2023-01-17T13:01:02.427Z] /usr/local/lib/python3.8/dist-packages/pymilvus/client/grpc_handler.py:885: MilvusException
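When the same failure shows up across several runs, grouping occurrences by QueryNode ID and gRPC status code can make comparison easier. The following is a minimal sketch that extracts those fields from the exception message text shown in the traceback above. The names `QueryFailure` and `parse_query_failure` are hypothetical and not part of pymilvus, and the regular expressions assume the message keeps the exact wording seen in this log.

```python
import re
from typing import NamedTuple, Optional


class QueryFailure(NamedTuple):
    """Fields pulled out of a 'fail to query on all shard leaders' message."""
    query_node_id: Optional[int]
    grpc_code: Optional[str]
    grpc_desc: Optional[str]


# Patterns assume the wording seen in this issue's log; other Milvus
# versions may phrase the message differently.
_NODE_RE = re.compile(r"QueryNode ID = (\d+)")
_GRPC_RE = re.compile(r"rpc error: code = (\w+) desc = ([^\n,]+)")


def parse_query_failure(message: str) -> QueryFailure:
    """Extract the QueryNode ID and gRPC code/description, if present."""
    node = _NODE_RE.search(message)
    grpc = _GRPC_RE.search(message)
    return QueryFailure(
        query_node_id=int(node.group(1)) if node else None,
        grpc_code=grpc.group(1) if grpc else None,
        grpc_desc=grpc.group(2).strip() if grpc else None,
    )


if __name__ == "__main__":
    msg = (
        "fail to query on all shard leaders, err=fail to Query, "
        "QueryNode ID = 18, reason=Query 19 failed, reason err err: "
        "rpc error: code = Canceled desc = context canceled\n"
        ", /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51"
    )
    print(parse_query_failure(msg))
```

For the message in this issue, the helper reports QueryNode 18 with gRPC code `Canceled` and description `context canceled`; fields that are missing from a message come back as `None` rather than raising.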
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
Is there an existing issue for this?
Environment
Current Behavior
Expected Behavior
All test cases pass.
Steps To Reproduce
No response
Milvus Log
deploy task: reinstall
old image tag: v2.2.0
new image tag: master-20230111-9f809599
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_kafka_cron/detail/deploy_test_kafka_cron/202/pipeline
log:
artifacts-kafka-cluster-reinstall-202-server-second-deployment-logs.tar.gz
artifacts-kafka-cluster-reinstall-202-server-first-deployment-logs.tar.gz
artifacts-kafka-cluster-reinstall-202-pytest-logs.tar.gz
Anything else?
No response