milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [import v2] When the dimension of the vector field in the imported file is inconsistent with the defined dimension, the error message is very confusing. #35060

Open zhuwenxing opened 1 month ago

zhuwenxing commented 1 month ago

Is there an existing issue for this?

Environment

- Milvus version: 2.4-20240725-2822d872-amd64
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

The error message is error="%s not aligned->FloatVector: importing data failed". This message is not detailed enough, and the %s placeholder is left unformatted, so the actual field name is not displayed.
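For illustration only, here is a minimal Python sketch of one way a literal %s can end up in an error message: the format template is passed along without its arguments ever being applied. This is not Milvus code (Milvus is written in Go), and it is not a claim about the actual cause; the function names and the improved message wording are hypothetical.

```python
# Template with the same shape as the reported message.
TEMPLATE = "%s not aligned"


def unformatted_error(field_name: str) -> str:
    # Hypothetical bug pattern: the template is used as-is, so the
    # field name never replaces the %s placeholder.
    return TEMPLATE + "->FloatVector: importing data failed"


def formatted_error(field_name: str, expected_dim: int, actual_dim: int) -> str:
    # Hypothetical improved message: names the field and both dimensions.
    return (
        "field '%s' not aligned: expected dim %d, got %d: importing data failed"
        % (field_name, expected_dim, actual_dim)
    )


print(unformatted_error("embedding"))
print(formatted_error("embedding", 128, 129))
```

The second message carries what the report asks for: the field name, the expected dimension, and the dimension actually found.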

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

[image: log screenshot, not available]

Anything else?

No response

zhuwenxing commented 1 month ago

/assign @bigsheeper

PTAL

zhuwenxing commented 1 month ago

import pyarrow as pa

EMBEDDING_SIZE = 128

schema = pa.schema([
    ('id', pa.int64()),
    ('text', pa.string()),
    ('embedding', pa.list_(pa.float32(), EMBEDDING_SIZE))
])

data = [
    {'id': 1, 'text': 'Hello', 'embedding': [0.1] * EMBEDDING_SIZE},
    {'id': 2, 'text': 'World', 'embedding': [0.2] * (EMBEDDING_SIZE+1)},
]

table = pa.Table.from_pylist(data, schema=schema)
print(table)

When using pyarrow, if the dimension does not match, the error message is unambiguous:

    table = pa.Table.from_pylist(data, schema=schema)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/table.pxi", line 1984, in pyarrow.lib._Tabular.from_pylist
  File "pyarrow/table.pxi", line 6051, in pyarrow.lib._from_pylist
  File "pyarrow/table.pxi", line 4625, in pyarrow.lib.Table.from_arrays
  File "pyarrow/table.pxi", line 1562, in pyarrow.lib._sanitize_arrays
  File "pyarrow/array.pxi", line 385, in pyarrow.lib.asarray
  File "pyarrow/array.pxi", line 355, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 42, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Length of item not correct: expected 128 but got array of size 129
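Until the import error is clearer, the same kind of check can be run on the client before writing the file. The following is a minimal sketch in plain Python, assuming rows shaped like the example above; the helper name find_dim_mismatches is hypothetical and not part of pyarrow or pymilvus.

```python
def find_dim_mismatches(rows, field, expected_dim):
    """Return (row_index, actual_length) for each row whose vector has the wrong length."""
    return [
        (i, len(row[field]))
        for i, row in enumerate(rows)
        if len(row[field]) != expected_dim
    ]


EMBEDDING_SIZE = 128

data = [
    {'id': 1, 'text': 'Hello', 'embedding': [0.1] * EMBEDDING_SIZE},
    {'id': 2, 'text': 'World', 'embedding': [0.2] * (EMBEDDING_SIZE + 1)},
]

# Report every offending row with the field name and both dimensions.
for i, actual in find_dim_mismatches(data, 'embedding', EMBEDDING_SIZE):
    print(f"row {i}: field 'embedding' expected dim {EMBEDDING_SIZE}, got {actual}")
```

With the sample data this prints `row 1: field 'embedding' expected dim 128, got 129`, which is the level of detail the import error would ideally provide.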

stale[bot] commented 6 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

yanliang567 commented 4 days ago

@zhuwenxing was this fixed?