milvus-io / pymilvus

Python SDK for Milvus.
Apache License 2.0

[Bug]: Slightly confusing message when doing a vector similarity search with dimension mismatch. #1357

Open turnham opened 1 year ago

turnham commented 1 year ago

Is there an existing issue for this?

Describe the bug

This is very minor, but since it caused a bit of confusion, I figured I'd open it.

When doing a (non-binary) vector similarity search, if the length of the query vector doesn't match the collection's dimension, the generated exception message repeats the passed-in vector 8 times.

This is happening in code here:

https://github.com/milvus-io/pymilvus/blob/1e3459b4198f931d960a09aff8e16609e515a56d/pymilvus/client/prepare.py#L323

It looks like the *8 is meant for the binary vector use case and not for the non-binary case.
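For anyone else who hits this, here is a minimal sketch of why the message looks the way it does: in Python, multiplying a list by 8 repeats the list instead of scaling a number. The helper dimension_mismatch_message is hypothetical (it is not pymilvus code), and the idea that a binary vector arrives as bytes with 8 bits per element is my assumption about what the *8 was intended for.

```python
# Python's * operator on a list repeats the list; it does not scale its length.
vector = [0.1, 0.2, 0.3]
repeated = vector * 8
print(len(repeated))  # 24 elements: the original 3 values, 8 times over


# Hypothetical helper (not part of pymilvus): report the offending dimension
# rather than echoing the repeated vector.
def dimension_mismatch_message(vector, dimension, is_binary=False):
    # Assumption: a binary vector is passed as bytes, so each element holds 8 bits.
    actual = len(vector) * 8 if is_binary else len(vector)
    return (
        f"The dimension of query entities[{actual}] "
        f"is different from schema [{dimension}]"
    )


print(dimension_mismatch_message(vector, 2))
```

Whether the message should show the vector itself or only its dimension is a design choice for the maintainers; the sketch only illustrates one unambiguous option.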

Expected Behavior

The vector should be shown only once in the error message.

Steps/Code To Reproduce behavior

With Milvus running at localhost, run the following, which sets up a schema with dim=2 but then searches with a vector of dim=3:

from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection

# Connect to the local Milvus instance.
connections.connect("default", host="localhost", port="19530")

# Schema with a 2-dimensional float vector field.
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding_vector", dtype=DataType.FLOAT_VECTOR, dim=2),
]
test_collection = Collection("test", CollectionSchema(fields))

# Search with a 3-dimensional vector to trigger the dimension mismatch.
result = test_collection.search(
    data=[[0.1, 0.2, 0.3]],
    anns_field="embedding_vector",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
)

The output is:

...
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pymilvus/client/prepare.py", line 346, in _prepare_placeholders
    raise ParamError(message=f"The dimension of query entities[{vectors[i]*8}] is different from schema [{dimension}]")
pymilvus.exceptions.ParamError: <ParamError: (code=1, message=The dimension of query entities[[0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3]] is different from schema [2])>

Notice that the passed-in vector is repeated 8 times.
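Until the message is fixed, a client-side check before calling search gives a clearer error. This is only a sketch: check_query_dimensions is a hypothetical helper of my own, it uses plain Python with no pymilvus calls, and it assumes float vectors passed as lists.

```python
def check_query_dimensions(vectors, dimension):
    """Raise ValueError if any query vector's length differs from dimension."""
    for index, vector in enumerate(vectors):
        if len(vector) != dimension:
            raise ValueError(
                f"query vector {index} has dimension {len(vector)}, "
                f"but the schema expects {dimension}"
            )


try:
    # Same mismatch as in the reproduction: a 3-dimensional vector, dim=2 schema.
    check_query_dimensions([[0.1, 0.2, 0.3]], 2)
except ValueError as error:
    print(error)
```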

Environment details

pymilvus==2.2.4
Milvus running standalone via Docker Compose

Anything else?

No response

XuanYang-cn commented 1 year ago

good catch!