milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0
29.47k stars 2.82k forks source link

[Bug]: If there are some keywords in the column name, the parsing expression will cause an error #35873

Open zhuwenxing opened 2 weeks ago

zhuwenxing commented 2 weeks ago

Is there an existing issue for this?

Environment

- Milvus version:2.4/master
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

schema

        fields = [
            FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
            FieldSchema(name="contains", dtype=DataType.ARRAY, element_type=array_element_data_type, max_capacity=2000, **additional_params),
            FieldSchema(name="contains_any", dtype=DataType.ARRAY, element_type=array_element_data_type,
                        max_capacity=2000, **additional_params),
            FieldSchema(name="contains_all", dtype=DataType.ARRAY, element_type=array_element_data_type,
                        max_capacity=2000, **additional_params),
            FieldSchema(name="equals", dtype=DataType.ARRAY, element_type=array_element_data_type, max_capacity=2000, **additional_params),
            FieldSchema(name="array_length", dtype=DataType.ARRAY, element_type=array_element_data_type,
                        max_capacity=2000, **additional_params),
            FieldSchema(name="array_access", dtype=DataType.ARRAY, element_type=array_element_data_type,
                        max_capacity=2000, **additional_params),
            FieldSchema(name="emb", dtype=DataType.FLOAT_VECTOR, dim=128)
        ]

expr

"array_length(array_length) == 10"

error

failed to create query plan: cannot parse expression: array_length(array_length) == 10, error: line 1:13 mismatched input 'array_length' expecting {Identifier, JSONIdentifier}: invalid parameter)

If field name array_length change to array_length_field, expr "array_length(array_length_field) == 10" can success

Expected Behavior

"array_length(array_length) == 10", the second array_length can be treat as field name.

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

xiaofan-luan commented 2 weeks ago

I think we'd better banned all the keywords

@xiaocai2333 could you help on improving it?

yanliang567 commented 2 weeks ago

/assign @xiaocai2333 /unassign

xiaocai2333 commented 2 weeks ago

Should we also prohibit these keywords as dynamic field?

xiaofan-luan commented 2 weeks ago

Should we also prohibit these keywords as dynamic field?

it's gonna to be hard to check with, are we gonna to parse json on every insertion?

xiaocai2333 commented 1 week ago

Should we also prohibit these keywords as dynamic field?

it's gonna to be hard to check with, are we gonna to parse json on every insertion?

milvus will parse the dynamic json now to determine if the dynamic field name is $meta. So we can prohibit these keywords for display field name and dynamic field name? But this will be an incompatible change.

xiaofan-luan commented 1 week ago

I think we should go forward to do the check

xiaocai2333 commented 5 days ago

after discussion, the dynamic field names can be bypassed through $meta["XXX"], so there is no restriction on dynamic field name. Keep the flexibility of dynamic fields.