milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: No error is returned when searching an empty collection with a vector that has a different schema dim #33637

Open ThreadDao opened 5 months ago

ThreadDao commented 5 months ago

Is there an existing issue for this?

Environment

- Milvus version: 2.4 and master
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus 2.5.0rc31
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

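  1. Create a collection with a binary vector field of dim 128.
  2. Create an index and load the collection (no data is inserted, so the collection is empty).
  3. Search with a binary vector of dim 512. No error is returned; the search succeeds with an empty result.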

    @pytest.mark.tags(CaseLabel.L2)
    def test_debug(self):
        """
        target: test search binary with dim not match
        method: search with binary vector whose dim is not equal to schema dim
        expected: raise exception
        """
        c_name = cf.gen_unique_str(prefix)
        collection_w = self.init_collection_wrap(
            name=c_name, schema=default_binary_schema)
        print(collection_w.schema)
    
        # data = cf.gen_default_binary_list_data(nb=10, dim=128)[0]
        # collection_w.insert(data=data)
        # collection_w.flush()
    
        _index = {"index_type": "BIN_IVF_FLAT", "metric_type": "JACCARD", "params": {"nlist": 128}}
        collection_w.create_index(ct.default_binary_vec_field_name, index_params=_index)
        collection_w.load()
    
        search_vectors = cf.gen_binary_vectors(1, dim=512)[1]
        print(len(search_vectors[0]))
        search_params = {"metric_type": "JACCARD", "params": {"nprobe": 32}}
        res1 = collection_w.search(data=search_vectors, anns_field=ct.default_binary_vec_field_name,
                                   param=search_params, limit=10)[0]
  4. log
    <schema>: {'auto_id': False, 'description': '', 'fields': [{'name': 'int64', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float', 'description': '', 'type': <DataType.FLOAT: 10>}, ......  (api_request.py:37)
    {'auto_id': False, 'description': '', 'fields': [{'name': 'int64', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float', 'description': '', 'type': <DataType.FLOAT: 10>}, {'name': 'varchar', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 65535}}, {'name': 'binary_vector', 'description': '', 'type': <DataType.BINARY_VECTOR: 100>, 'params': {'dim': 128}}], 'enable_dynamic_field': False}
    [2024-06-05 17:43:04,816 - DEBUG - ci_test]: (api_request)  : [Collection.create_index] args: ['binary_vector', {'index_type': 'BIN_IVF_FLAT', 'metric_type': 'JACCARD', 'params': {'nlist': 128}}, 1200], kwargs: {'index_name': ''} (api_request.py:62)
    [2024-06-05 17:43:05,357 - DEBUG - ci_test]: (api_response) : Status(code=0, message=)  (api_request.py:37)
    [2024-06-05 17:43:05,358 - DEBUG - ci_test]: (api_request)  : [Collection.load] args: [None, 1, 180], kwargs: {} (api_request.py:62)
    [2024-06-05 17:43:07,425 - DEBUG - ci_test]: (api_response) : None  (api_request.py:37)
    64
    [2024-06-05 17:43:07,427 - DEBUG - ci_test]: (api_request)  : [Collection.search] args: [[b'\x03\xfbA\xf6&DR\xf3L4\xdc\xb0\xe3\x11\xbc\xbdB\xf3\x02\xe8\xa8*\x9e\xc0\x0f\xfb^\x90\xcf\x97\xd7ZA\xf5\x93\x87EW\xa2\xdb>\x07f\x15\xb8\xb8\xd0\xa9\x83\xc6!(\x957\xab9\xb1Y>\xb3\xabO\xdd\x1e'], 'binary_vector', {'metric_type': 'JACCARD', 'params': {'nprobe': 32}}, 10, None, None, None, 180, -1], kwargs: {} (api_request.py:62)
    [2024-06-05 17:43:07,671 - DEBUG - ci_test]: (api_response) : data: ['[]'] , cost: 0  (api_request.py:37)
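
Until the server rejects such requests, a client-side guard can catch the mismatch before the search is sent. The sketch below is only an illustration, not part of Milvus or pymilvus: the helper names `binary_vector_dim` and `check_binary_dim` are hypothetical, and it assumes binary vectors are packed 8 dimensions per byte, which matches the log above (a dim 512 vector prints a length of 64).

```python
def binary_vector_dim(vec: bytes) -> int:
    """Return the dimension (number of bits) of a packed binary vector.

    Assumption: 8 dimensions are packed into each byte.
    """
    return len(vec) * 8


def check_binary_dim(vectors, schema_dim: int) -> None:
    """Raise ValueError if any packed binary vector does not match schema_dim.

    Hypothetical helper for illustration; it runs purely on the client side.
    """
    if schema_dim % 8 != 0:
        raise ValueError(f"schema dim {schema_dim} is not a multiple of 8")
    for i, vec in enumerate(vectors):
        actual = binary_vector_dim(vec)
        if actual != schema_dim:
            raise ValueError(
                f"vector {i}: dim {actual} does not match schema dim {schema_dim}"
            )


# 64 zero bytes stand in for the dim 512 search vector from the log.
query = [bytes(64)]
try:
    check_binary_dim(query, schema_dim=128)
except ValueError as exc:
    print(f"rejected before search: {exc}")
```

In the test case above, such a check could run on `search_vectors` just before `collection_w.search(...)`, with the schema dim of 128 read from the collection schema printed in the log.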

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

ThreadDao commented 5 months ago

@binbinlv Please help add relevant test cases using the pymilvus SDK. Thank you~

yanliang567 commented 5 months ago

/assign @czs007
Is this by design for now? Any ideas for improvement?
/unassign