Closed Aisuko closed 1 month ago
I experienced the same issue. In MilvusClient.search, the data parameter is annotated as Union[List[list], list], but the lower-level search expects List[List[float]].
So changing data=data to data=[data] should fix this issue.
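A minimal sketch of that wrapping step, assuming the query may arrive either as one flat vector or as a list of vectors. The helper name as_query_batch is hypothetical and not part of PyMilvus; it only normalizes the input into the List[List[float]] shape that the lower-level search declares.

```python
from numbers import Real
from typing import List, Sequence, Union

Vector = Sequence[float]


def as_query_batch(data: Union[Vector, Sequence[Vector]]) -> List[List[float]]:
    """Return `data` as a list of vectors.

    Hypothetical helper (not a PyMilvus API): a flat vector becomes a
    batch of one, and a list of vectors is copied through unchanged.
    """
    if len(data) == 0:
        raise ValueError("data must contain at least one value")
    # A flat vector has numbers as its elements; a batch has sequences.
    if isinstance(data[0], Real):
        return [[float(x) for x in data]]
    return [[float(x) for x in vec] for vec in data]


# A single 384-dimensional embedding becomes a batch with one row.
batch = as_query_batch([0.0] * 384)
print(len(batch), len(batch[0]))
```

The result can then be passed as data=as_query_batch(data), which for a single embedding is equivalent to data=[data].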
MilvusClient.search
def search(
    self,
    collection_name: str,
    data: Union[List[list], list],
    filter: str = "",
    limit: int = 10,
    output_fields: Optional[List[str]] = None,
    search_params: Optional[dict] = None,
    timeout: Optional[float] = None,
    partition_names: Optional[List[str]] = None,
    anns_field: Optional[str] = None,
    **kwargs,
) -> List[List[dict]]:
    """Search for a query vector/vectors.

    In order for the search to process, a collection needs to have been either provided
    at init or data needs to have been inserted.

    Args:
        data (Union[List[list], list]): The vector/vectors to search.
        limit (int, optional): How many results to return per search. Defaults to 10.
        filter(str, optional): A filter to use for the search. Defaults to None.
        output_fields (List[str], optional): List of which field values to return. If None
            specified, only primary fields including distances will be returned.
        search_params (dict, optional): The search params to use for the search.
        timeout (float, optional): Timeout to use, overides the client level assigned at init.
            Defaults to None.

    Raises:
        ValueError: The collection being searched doesnt exist. Need to insert data first.

    Returns:
        List[List[dict]]: A nested list of dicts containing the result data. Embeddings are
            not included in the result data.
    """
    conn = self._get_connection()
    try:
        res = conn.search(
            collection_name,
            data,
            anns_field or "",
            search_params or {},
            expression=filter,
            limit=limit,
            output_fields=output_fields,
            partition_names=partition_names,
            timeout=timeout,
            **kwargs,
        )
    except Exception as ex:
        logger.error("Failed to search collection: %s", collection_name)
        raise ex from ex
Lower-level search (called above as conn.search)
@retry_on_rpc_failure()
def search(
    self,
    collection_name: str,
    data: List[List[float]],
    anns_field: str,
    param: Dict,
    limit: int,
    expression: Optional[str] = None,
    partition_names: Optional[List[str]] = None,
    output_fields: Optional[List[str]] = None,
    round_decimal: int = -1,
    timeout: Optional[float] = None,
    **kwargs,
):
    check_pass_param(
        limit=limit,
        round_decimal=round_decimal,
        anns_field=anns_field,
        search_data=data,
        partition_name_array=partition_names,
        output_fields=output_fields,
        guarantee_timestamp=kwargs.get("guarantee_timestamp", None),
        timeout=timeout,
    )
I tested with [data], and the issue still occurs.
@Aisuko What's the dimension of the vector data? Is it 384?
Hi @XuanYang-cn
Yes, data is our input, and it is a list. See the Parameters section above.
len(data)
384
@Aisuko Got you. And what's the schema of your collection? PyMilvus uses schema to validate data, please print it out and let me know, THX.
Hi @XuanYang-cn, I fixed the issue by replacing data with [data]:
self.client.search(
    collection_name=collection_name,
    data=[data],
    limit=1,
    search_params={'metric_type': 'COSINE', 'params': {}},
    output_fields=["title"],
)
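Since search is annotated as returning List[List[dict]], that is, one inner list per query vector, passing data=[data] means the hits for that single vector sit in the first inner list. A small sketch of unpacking that shape follows. The helper first_query_hits is hypothetical, and the sample result, including its dict keys, is made up for illustration rather than taken from a real Milvus response.

```python
from typing import Dict, List


def first_query_hits(results: List[List[Dict]]) -> List[Dict]:
    """Return the hits for a search issued with exactly one query vector.

    Hypothetical helper (not a PyMilvus API). It only relies on the
    nested-list shape documented in the search docstring.
    """
    if len(results) != 1:
        raise ValueError(f"expected results for 1 query vector, got {len(results)}")
    return results[0]


# Made-up sample with the same nesting as a limit=1 search on one vector.
# The keys below are assumptions for illustration only.
sample_results = [[{"id": 1, "distance": 0.93, "entity": {"title": "example"}}]]

hits = first_query_hits(sample_results)
print(len(hits))
```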
Describe the bug
Version:
Call search func
Parameters
The collection is https://huggingface.co/datasets/aisuko/squad01 and is already loaded into the database.
The data was embedded with all-MiniLM-L6-v2-Q4_K_M-v2 using llamacpp.
The whole error message
Environment details
server_config.yaml: Default settings.