milvus-io / pymilvus

Python SDK for Milvus.
Apache License 2.0

[Enhancement]: schema validation is fragile #2299

Open zhengbuqian opened 1 month ago

zhengbuqian commented 1 month ago

Is there an existing issue for this?

What would you like to be added?

Currently we validate the server schema by comparing it with the user-provided schema: https://github.com/milvus-io/pymilvus/blob/76de0ab1caba89a939a784545e1a61d13ea139a3/pymilvus/orm/collection.py#L134.

With doc-in-doc-out, we introduced tokenizer_params in params. It is a dict in the user input but a JSON string in the server response, so comparing the two directly causes validation to fail.

For now tokenizer_params is simple, so I used a temporary workaround in https://github.com/milvus-io/pymilvus/pull/2298 that converts the JSON string back to a dict. That will likely break once we introduce more configs in tokenizer_params: keys in the JSON may be reordered, and the resulting dict may no longer compare equal.
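
A minimal sketch of the mismatch and of one way to normalize both sides before comparing. The helper normalize_params and the keys inside tokenizer_params ("tokenizer", "lang") are hypothetical and chosen only for illustration; they are not taken from pymilvus or from the linked PR. Parsed dicts compare by content rather than by key order, but this approach does not by itself cover values the server might rewrite or add.

```python
import json
from typing import Any, Dict, Tuple


def normalize_params(
    params: Dict[str, Any],
    json_keys: Tuple[str, ...] = ("tokenizer_params",),
) -> Dict[str, Any]:
    """Return a copy of params with JSON-string values under json_keys parsed.

    Hypothetical helper for illustration; not part of pymilvus.
    """
    normalized = dict(params)
    for key in json_keys:
        value = normalized.get(key)
        if isinstance(value, str):
            # Server side: the value arrives as a JSON string.
            normalized[key] = json.loads(value)
    return normalized


# User input: tokenizer_params is a dict (inner keys are made up).
user_params = {
    "max_length": 100,
    "tokenizer_params": {"tokenizer": "default", "lang": "en"},
}

# Server response: same content as a JSON string, with keys in another order.
server_params = {
    "max_length": 100,
    "tokenizer_params": '{"lang": "en", "tokenizer": "default"}',
}

# Direct comparison fails because a dict is compared with a str.
direct_equal = user_params == server_params

# Comparison after normalization succeeds.
normalized_equal = normalize_params(user_params) == normalize_params(server_params)
```

Here direct_equal is False and normalized_equal is True. The inputs are copied rather than mutated, so the original user and server params stay available for error messages.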

Why is this needed?

No response

Anything else?

No response

zhengbuqian commented 1 month ago

/assign @zhengbuqian /assign @XuanYang-cn

zhengbuqian commented 1 month ago

This should be fixed before the Milvus 2.5 release.

XuanYang-cn commented 1 month ago

@zhengbuqian we could implement an __eq__ method in the schema to customize what needs to be compared and what can be ignored.
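
A rough sketch of what such an __eq__ could look like, under stated assumptions. FieldSketch is a hypothetical stand-in class, not pymilvus's actual schema class, and the ignored key "server_default" is invented purely to show how a server-populated entry might be skipped.

```python
import json
from typing import Any, Dict, Optional


class FieldSketch:
    """Hypothetical stand-in for a schema field; not pymilvus's real class."""

    # Params assumed to be filled in by the server and ignored when comparing.
    IGNORED_PARAMS = frozenset({"server_default"})
    # Params that may arrive either as a dict or as a JSON string.
    JSON_PARAMS = frozenset({"tokenizer_params"})

    def __init__(self, name: str, dtype: str, params: Optional[Dict[str, Any]] = None):
        self.name = name
        self.dtype = dtype
        self.params = dict(params or {})

    def _comparable_params(self) -> Dict[str, Any]:
        """Build the view of params that takes part in equality."""
        comparable = {}
        for key, value in self.params.items():
            if key in self.IGNORED_PARAMS:
                continue
            if key in self.JSON_PARAMS and isinstance(value, str):
                value = json.loads(value)
            comparable[key] = value
        return comparable

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, FieldSketch):
            return NotImplemented
        return (self.name, self.dtype, self._comparable_params()) == (
            other.name,
            other.dtype,
            other._comparable_params(),
        )


user_field = FieldSketch("text", "VARCHAR", {"tokenizer_params": {"tokenizer": "default"}})
server_field = FieldSketch(
    "text",
    "VARCHAR",
    {"tokenizer_params": '{"tokenizer": "default"}', "server_default": "x"},
)
other_field = FieldSketch("text", "VARCHAR", {"tokenizer_params": {"tokenizer": "other"}})

fields_match = user_field == server_field
fields_differ = user_field != other_field
```

Both fields_match and fields_differ are True here. Note that defining __eq__ without __hash__ makes instances unhashable in Python, which matters if schema objects are ever used as dict keys or set members.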