milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [null & default] The error message is not clear when set default value to ARRAY or JSON type field #36495

Open binbinlv opened 1 month ago

binbinlv commented 1 month ago

Is there an existing issue for this?

Environment

- Milvus version: master-latest
- Deployment mode(standalone or cluster): both
- MQ type(rocksmq, pulsar or kafka): all
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus-latest
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

The error message is unclear when a default value is set on an ARRAY field:

Default value unsupported data type: 999

Expected Behavior

Something like "Default value unsupported data type: ARRAY".

Steps To Reproduce

from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Connect to Milvus with the default connection settings.
connections.connect()

dim = 3
schema = CollectionSchema(fields=[
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    FieldSchema(name="flat", dtype=DataType.INT64, default_value=100),
    # The list default value on this ARRAY field leads to the unclear error.
    FieldSchema(name="array", dtype=DataType.ARRAY, element_type=DataType.INT64, max_capacity=100, default_value=[100, 100])
])
collection = Collection("test_binbin_123", schema=schema)

Milvus Log

No response

Anything else?

No response

binbinlv commented 1 month ago

The error message for a JSON field is the same:

pymilvus.exceptions.ParamError: <ParamError: (code=1, message=Default value unsupported data type: 999)>

Reproduction script:

from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Connect to Milvus with the default connection settings.
connections.connect()

dim = 3
schema = CollectionSchema(fields=[
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    FieldSchema(name="flat", dtype=DataType.INT64, default_value=100),
    # The dict default value on this JSON field leads to the unclear error.
    FieldSchema(name="json", dtype=DataType.JSON, default_value={"value": 100})
])

collection = Collection("test_binbin_123", schema=schema)

binbinlv commented 1 month ago

I think the error message should be something like "Default value unsupported data type: <its DataType info>".

binbinlv commented 1 month ago

When default_value=100 is set on an ARRAY field, the error message is misleading too. Result:

code=1100, message=default value type mismatches field schema type: invalid parameter[expected=DataType_Int64][actual=not match]

Reproduction script:

from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

# Connect to Milvus with the default connection settings.
connections.connect()

dim = 3
schema = CollectionSchema(fields=[
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    FieldSchema(name="flat", dtype=DataType.INT64, default_value=100),
    # The scalar default value on this ARRAY field leads to the misleading error.
    FieldSchema(name="array", dtype=DataType.ARRAY, element_type=DataType.INT64, max_capacity=100, default_value=100)
])

collection = Collection("test_binbin_123", schema=schema)

binbinlv commented 1 month ago

This issue also exists for vector fields:

(1) FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=3, default_value=10):

code=1100, message=default value type mismatches field schema type: invalid parameter[expected=DataType_Int64][actual=not match]

(2) FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=3, default_value=[1, 2, 3]):

code=1, message=Default value unsupported data type: 999

smellthemoon commented 4 weeks ago

Data type 999 means unknown. Maybe we can also add a type check in pymilvus later. In any case, I have improved the other error messages in #36840.
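As a rough illustration of the client-side improvement discussed here, the sketch below formats the message from the field's declared type name instead of a numeric code. It is not pymilvus code: the `DataType` enum is a stand-in defined only for this example (only `UNKNOWN = 999` comes from this thread; the other values are illustrative), and `unsupported_default_message` is a hypothetical helper.

```python
from enum import IntEnum


class DataType(IntEnum):
    """Stand-in enum for this sketch; not imported from pymilvus."""
    INT64 = 5
    ARRAY = 22
    JSON = 23
    FLOAT_VECTOR = 101
    UNKNOWN = 999  # the code that shows up in the unclear message


def unsupported_default_message(field_dtype: DataType) -> str:
    """Build the error text from the declared field type's name."""
    # Using .name avoids leaking a bare numeric code such as 999.
    return f"Default value unsupported data type: {field_dtype.name}"


print(unsupported_default_message(DataType.ARRAY))
```

The point of the sketch is that the declared field type is already known when the schema is built, so the message does not need to depend on what type was inferred from the default value itself.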

binbinlv commented 3 weeks ago

Fixed errors:

  1. When default_value=100 for an ARRAY field, the new error message is:

    code=1100, message=type not support default_value, type:Array, name:array: invalid parameter

  2. When default_value=100 for a vector field, the new error message is:

    code=1100, message=type not support default_value, type:FloatVector, name:vector: invalid parameter

binbinlv commented 3 weeks ago

Unfixed error:

  1. Setting a default value on an ARRAY, JSON, or vector field still reports:

    code=1, message=Default value unsupported data type: 999

binbinlv commented 3 weeks ago

Maybe the 999 error could instead report "Default value unsupported data type: [ARRAY, JSON, FLOAT_VECTOR, ...]", listing all the field types that do not support a default value?
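A minimal sketch of that suggestion follows, assuming a hypothetical validation helper that runs before the schema is sent to the server. The names `UNSUPPORTED_DEFAULT_TYPES` and `check_default_value` are made up for this example, the type list is illustrative rather than exhaustive, and none of this is existing pymilvus or Milvus code.

```python
# Field types assumed not to accept a default value (illustrative, not exhaustive).
UNSUPPORTED_DEFAULT_TYPES = ("ARRAY", "JSON", "FLOAT_VECTOR")


def check_default_value(field_name: str, type_name: str, default_value) -> None:
    """Raise a descriptive error if a default value is set on an unsupported type."""
    if default_value is None:
        return  # no default value requested, nothing to validate
    if type_name in UNSUPPORTED_DEFAULT_TYPES:
        unsupported = ", ".join(UNSUPPORTED_DEFAULT_TYPES)
        raise ValueError(
            f"Default value unsupported data type: {type_name} "
            f"(field: {field_name}); default values are not supported for: "
            f"[{unsupported}]"
        )


# A scalar field with a default passes; an ARRAY field with a default is rejected.
check_default_value("flat", "INT64", 100)
try:
    check_default_value("array", "ARRAY", [100, 100])
except ValueError as exc:
    print(exc)
```

Naming both the offending field and the full list of unsupported types would let a user fix the schema without looking up what 999 means.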