Open NicoYuan1986 opened 2 months ago
Environment
- Milvus version: master-20240823-e8e3544a-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
collection.is_empty sometimes returns a wrong result: it reports the collection as empty right after an insert and flush.
[2024-08-24T13:52:14.275Z] ___ TestCollectionSearch.test_search_HNSW_index_with_min_ef[False-10-512-4] ____
...
[2024-08-24T13:52:14.276Z] collection_w = <base.collection_wrapper.ApiCollectionWrapper object at 0x7fb648013bb0>
[2024-08-24T13:52:14.276Z]
[2024-08-24T13:52:14.276Z]     def init_collection_general(self, prefix="test", insert_data=False, nb=ct.default_nb,
[2024-08-24T13:52:14.276Z]                                 partition_num=0, is_binary=False, is_all_data_type=False,
[2024-08-24T13:52:14.276Z]                                 auto_id=False, dim=ct.default_dim, is_index=True,
[2024-08-24T13:52:14.276Z]                                 primary_field=ct.default_int64_field_name, is_flush=True, name=None,
[2024-08-24T13:52:14.276Z]                                 enable_dynamic_field=False, with_json=True, random_primary_key=False,
[2024-08-24T13:52:14.276Z]                                 multiple_dim_array=[], is_partition_key=None, vector_data_type="FLOAT_VECTOR",
[2024-08-24T13:52:14.276Z]                                 **kwargs):
[2024-08-24T13:52:14.276Z]         """
[2024-08-24T13:52:14.276Z]         target: create specified collections
[2024-08-24T13:52:14.276Z]         method: 1. create collections (binary/non-binary, default/all data type, auto_id or not)
[2024-08-24T13:52:14.276Z]                 2. create partitions if specified
[2024-08-24T13:52:14.276Z]                 3. insert specified (binary/non-binary, default/all data type) data
[2024-08-24T13:52:14.276Z]                    into each partition if any
[2024-08-24T13:52:14.276Z]                 4. not load if specifying is_index as True
[2024-08-24T13:52:14.276Z]         expected: return collection and raw data, insert ids
[2024-08-24T13:52:14.276Z]         """
[2024-08-24T13:52:14.276Z]         log.info("Test case of search interface: initialize before test case")
[2024-08-24T13:52:14.276Z]         if not self.connection_wrap.has_connection(alias=DefaultConfig.DEFAULT_USING)[0]:
[2024-08-24T13:52:14.276Z]             self._connect()
[2024-08-24T13:52:14.276Z]         collection_name = cf.gen_unique_str(prefix)
[2024-08-24T13:52:14.276Z]         if name is not None:
[2024-08-24T13:52:14.276Z]             collection_name = name
[2024-08-24T13:52:14.276Z]         vectors = []
[2024-08-24T13:52:14.276Z]         binary_raw_vectors = []
[2024-08-24T13:52:14.276Z]         insert_ids = []
[2024-08-24T13:52:14.276Z]         time_stamp = 0
[2024-08-24T13:52:14.276Z]         # 1 create collection
[2024-08-24T13:52:14.276Z]         default_schema = cf.gen_default_collection_schema(auto_id=auto_id, dim=dim, primary_field=primary_field,
[2024-08-24T13:52:14.276Z]                                                           enable_dynamic_field=enable_dynamic_field,
[2024-08-24T13:52:14.276Z]                                                           with_json=with_json, multiple_dim_array=multiple_dim_array,
[2024-08-24T13:52:14.276Z]                                                           is_partition_key=is_partition_key,
[2024-08-24T13:52:14.276Z]                                                           vector_data_type=vector_data_type)
[2024-08-24T13:52:14.276Z]         if is_binary:
[2024-08-24T13:52:14.276Z]             default_schema = cf.gen_default_binary_collection_schema(auto_id=auto_id, dim=dim,
[2024-08-24T13:52:14.276Z]                                                                      primary_field=primary_field)
[2024-08-24T13:52:14.276Z]         if vector_data_type == ct.sparse_vector:
[2024-08-24T13:52:14.276Z]             default_schema = cf.gen_default_sparse_schema(auto_id=auto_id, primary_field=primary_field,
[2024-08-24T13:52:14.276Z]                                                           enable_dynamic_field=enable_dynamic_field,
[2024-08-24T13:52:14.276Z]                                                           with_json=with_json,
[2024-08-24T13:52:14.276Z]                                                           multiple_dim_array=multiple_dim_array)
[2024-08-24T13:52:14.276Z]         if is_all_data_type:
[2024-08-24T13:52:14.276Z]             default_schema = cf.gen_collection_schema_all_datatype(auto_id=auto_id, dim=dim,
[2024-08-24T13:52:14.276Z]                                                                    primary_field=primary_field,
[2024-08-24T13:52:14.276Z]                                                                    enable_dynamic_field=enable_dynamic_field,
[2024-08-24T13:52:14.276Z]                                                                    with_json=with_json,
[2024-08-24T13:52:14.276Z]                                                                    multiple_dim_array=multiple_dim_array)
[2024-08-24T13:52:14.276Z]         log.info("init_collection_general: collection creation")
[2024-08-24T13:52:14.276Z]         collection_w = self.init_collection_wrap(name=collection_name, schema=default_schema, **kwargs)
[2024-08-24T13:52:14.276Z]         vector_name_list = cf.extract_vector_field_name_list(collection_w)
[2024-08-24T13:52:14.276Z]         # 2 add extra partitions if specified (default is 1 partition named "_default")
[2024-08-24T13:52:14.276Z]         if partition_num > 0:
[2024-08-24T13:52:14.276Z] [get_env_variable] failed to get environment variables : 'CI_LOG_PATH', use default path : /tmp/ci_logs
[2024-08-24T13:52:14.276Z] [create_path] folder(/tmp/ci_logs) is not exist.
[2024-08-24T13:52:14.276Z] [create_path] create path now...
[2024-08-24T13:52:14.276Z]             cf.gen_partitions(collection_w, partition_num)
[2024-08-24T13:52:14.276Z]         # 3 insert data if specified
[2024-08-24T13:52:14.276Z]         if insert_data:
[2024-08-24T13:52:14.276Z]             collection_w, vectors, binary_raw_vectors, insert_ids, time_stamp = \
[2024-08-24T13:52:14.276Z]                 cf.insert_data(collection_w, nb, is_binary, is_all_data_type, auto_id=auto_id,
[2024-08-24T13:52:14.276Z]                                dim=dim, enable_dynamic_field=enable_dynamic_field, with_json=with_json,
[2024-08-24T13:52:14.276Z]                                random_primary_key=random_primary_key, multiple_dim_array=multiple_dim_array,
[2024-08-24T13:52:14.276Z]                                primary_field=primary_field, vector_data_type=vector_data_type)
[2024-08-24T13:52:14.276Z]             if is_flush:
[2024-08-24T13:52:14.276Z] >               assert collection_w.is_empty is False
[2024-08-24T13:52:14.276Z] E               AssertionError
[2024-08-24T13:52:14.276Z]
[2024-08-24T13:52:14.276Z] ../base/client_base.py:290: AssertionError
[2024-08-24T13:52:14.276Z] ------------------------------ Captured log call -------------------------------
...
[2024-08-24T13:52:14.277Z] [2024-08-24 13:30:53 - DEBUG - ci_test]: (api_request) : [Collection.insert] args: [[{'float': 2500.0, 'varchar': '2500', 'json_field': {'number': 2500, 'float': 2500.0}, 'float_vector': [0.2555832088574461, 0.18711933125594346, 0.1027027432418517, 0.25024003965536257, 0.05308787333474047, 0.20607145507900107, 0.3754486942435134, 0.2404584083886309, 0.3446717130805288, 0.194732842......, kwargs: {'timeout': 180} (api_request.py:62)
[2024-08-24T13:52:14.277Z] [2024-08-24 13:30:53 - DEBUG - ci_test]: (api_response) : (insert count: 2500, delete count: 0, upsert count: 0, timestamp: 452068967420788737, success count: 2500, err count: 0 (api_request.py:37)
[2024-08-24T13:52:14.277Z] [2024-08-24 13:30:53 - INFO - ci_test]: inserted 2500 data into collection search_collection_fwwxQieH (common_func.py:1841)
[2024-08-24T13:52:14.277Z] [2024-08-24 13:30:53 - DEBUG - ci_test]: (api_request) : [Collection.flush] args: [], kwargs: {'timeout': 180} (api_request.py:62)
[2024-08-24T13:52:14.277Z] [2024-08-24 13:30:56 - DEBUG - ci_test]: (api_response) : None (api_request.py:37)
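The captured log shows the sequence insert (2500 rows), flush, then the failing `is_empty` check. One way to tell a short-lived delay from a persistently wrong result might be to poll `is_empty` for a bounded time instead of asserting once. The helper below is only a diagnostic sketch, not part of the test framework: `wait_until_not_empty` and `FakeCollection` are hypothetical names introduced here, and the helper assumes nothing about the collection except that it exposes an `is_empty` attribute.

```python
import time


def wait_until_not_empty(collection, timeout=10.0, interval=0.5):
    """Poll `collection.is_empty` until it is False or `timeout` seconds pass.

    Returns the seconds waited once the collection reports data, or None if
    it still reports empty when the timeout expires.
    """
    start = time.monotonic()
    while True:
        if not collection.is_empty:
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


class FakeCollection:
    """Hypothetical stand-in: reports empty for the first `empty_reads` reads."""

    def __init__(self, empty_reads):
        self._empty_reads = empty_reads

    @property
    def is_empty(self):
        if self._empty_reads > 0:
            self._empty_reads -= 1
            return True
        return False


if __name__ == "__main__":
    # Simulates a collection that reports empty twice before reporting data.
    waited = wait_until_not_empty(FakeCollection(empty_reads=2), timeout=1.0, interval=0.01)
    print("recovered" if waited is not None else "still empty after timeout")
```

Against a real collection, the same call could be placed after `insert` and `flush`. A `None` return would suggest the empty result persists, while a positive wait would suggest it is transient. Neither outcome by itself identifies the root cause.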
Expected Behavior
collection.is_empty is False
Steps To Reproduce
No response
Milvus Log
link: https://qa-jenkins.milvus.io/blue/organizations/jenkins/E2E%20Test/detail/E2E%20Test/825/pipeline/
log: artifacts-e2e-test-825-server-logs.tar.gz
/assign @longjiquan
/unassign
@NicoYuan1986 can we still reproduce it?