milvus-io / pymilvus

Python SDK for Milvus.
Apache License 2.0

[Bug]: dynamic field exceeds max length (65536) #2318

Open micuentadecasa opened 3 days ago

micuentadecasa commented 3 days ago

Is there an existing issue for this?

Describe the bug

I'm using Kotaemon, a RAG application that uses Milvus Lite by default. After switching to an external Milvus instance and inserting items, I got this error:

dynamic field exceeds max length (65536)

I have tried reducing the size of the metadata, but the error is still raised. Is there a way to increase the maximum size of the metadata field?

Expected Behavior

No response

Steps/Code To Reproduce behavior

-

Environment details

- Hardware/Software conditions (OS, CPU, GPU, Memory):
- Method of installation (Docker, or from source):
- Milvus version (v0.3.1, or v0.4.0):
- Milvus configuration (Settings you made in `server_config.yaml`):

Anything else?

RPC error: [insert_rows], <MilvusException: (code=1100, message=the length (381827) of dynamic field exceeds max length (65536): invalid parameter[expected=valid length dynamic field][actual=length exceeds max length])>, <Time:{'RPC start': '2024-10-29 07:36:26.031426', 'RPC error': '2024-10-29 07:36:26.449905'}>
Exception in thread Thread-4 (<lambda>):
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 399, in <lambda>
    target=lambda: list(insert_chunks_to_vectorstore())
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 387, in insert_chunks_to_vectorstore
    self.handle_chunks_vectorstore(chunks, file_id)
  File "/app/libs/ktem/ktem/index/file/pipelines.py", line 429, in handle_chunks_vectorstore
    self.vector_indexing.add_to_vectorstore(chunks)
  File "/app/libs/kotaemon/kotaemon/indices/vectorindex.py", line 139, in add_to_vectorstore
    self.vector_store.add(
  File "/app/libs/kotaemon/kotaemon/storages/vectorstores/milvus.py", line 106, in add
    return super().add(embeddings=embeddings, metadatas=metadatas, ids=ids)
  File "/app/libs/kotaemon/kotaemon/storages/vectorstores/base.py", line 135, in add
    return self._client.add(nodes=nodes)
  File "/usr/local/lib/python3.10/site-packages/llama_index/vector_stores/milvus/base.py", line 364, in add
    self._collection.insert(insert_batch)
  File "/usr/local/lib/python3.10/site-packages/pymilvus/orm/collection.py", line 507, in insert
    return conn.insert_rows(
  File "/usr/local/lib/python3.10/site-packages/pymilvus/decorators.py", line 148, in handler
    raise e from e
  File "/usr/local/lib/python3.10/site-packages/pymilvus/decorators.py", line 144, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymilvus/decorators.py", line 183, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymilvus/decorators.py", line 123, in handler
    raise e from e
  File "/usr/local/lib/python3.10/site-packages/pymilvus/decorators.py", line 87, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/pymilvus/client/grpc_handler.py", line 496, in insert_rows
    check_status(resp.status)
  File "/usr/local/lib/python3.10/site-packages/pymilvus/client/utils.py", line 63, in check_status
    raise MilvusException(status.code, status.reason, status.error_code)
pymilvus.exceptions.MilvusException: <MilvusException: (code=1100, message=the length (381827) of dynamic field exceeds max length (65536): invalid parameter[expected=valid length dynamic field][actual=length exceeds max length])>

XuanYang-cn commented 3 days ago

@micuentadecasa What does your schema look like? One option may be a fixed schema with INT64, VARCHAR, ARRAY and BOOL fields that store part of the data currently going into the dynamic field.
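To illustrate the suggestion above, here is a minimal sketch that moves selected metadata keys out of the dynamic field and into fixed fields before insertion. The field names in `FIXED_FIELDS` are hypothetical and not taken from Kotaemon or llama_index; the 65536 limit comes from the error message, and treating it as a byte count of a JSON serialization is an assumption. The fixed fields would still have to be declared in the collection schema, and each has its own limit (the VARCHAR primary key in this thread uses `max_length` 65535), so a single very large value would not fit there either.

```python
import json

# Limit taken from the error message; assumed here to be a byte count.
MAX_DYNAMIC_FIELD_BYTES = 65536

# Hypothetical keys that would be declared as fixed schema fields
# (e.g. VARCHAR, INT64, ARRAY, BOOL) instead of living in the dynamic field.
FIXED_FIELDS = ("file_name", "page_number", "tags", "is_table")


def split_metadata(metadata, fixed_fields=FIXED_FIELDS):
    """Return (fixed, dynamic): keys bound for fixed fields and the remainder."""
    fixed = {k: metadata[k] for k in fixed_fields if k in metadata}
    dynamic = {k: v for k, v in metadata.items() if k not in fixed}
    return fixed, dynamic


def fits_dynamic_field(dynamic, limit=MAX_DYNAMIC_FIELD_BYTES):
    """Approximate check: is the JSON-encoded remainder within the limit?"""
    size = len(json.dumps(dynamic, ensure_ascii=False).encode("utf-8"))
    return size <= limit


fixed, dynamic = split_metadata(
    {"file_name": "a.pdf", "page_number": 3, "note": "short"}
)
```

If `fits_dynamic_field(dynamic)` is still false after the split, the remaining keys themselves are too large, and they would need to be trimmed or stored outside Milvus.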

micuentadecasa commented 3 days ago

@XuanYang-cn This is my schema:

{
  'collection_name': 'llamacollection',
  'auto_id': False,
  'num_shards': 0,
  'description': '',
  'fields': [
    {'field_id': 100, 'name': 'id', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 65535}, 'is_primary': True},
    {'field_id': 101, 'name': 'embedding', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 1536}}
  ],
  'aliases': [],
  'collection_id': 0,
  'consistency_level': 0,
  'properties': {},
  'num_partitions': 0,
  'enable_dynamic_field': True
}
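Since this schema declares only `id` and `embedding`, every other key in an inserted row ends up in the dynamic field. The sketch below is one way to find which rows would exceed the limit before calling insert. It assumes the dynamic field is serialized roughly like JSON and that the 65536 limit from the error message is a byte count; Milvus's actual serialization may differ, so the sizes are approximate. The helper names are hypothetical.

```python
import json

# Limit taken from the error message; assumed here to be a byte count.
MAX_DYNAMIC_FIELD_BYTES = 65536

# Fields declared in the schema above; all other keys are dynamic.
SCHEMA_FIELDS = {"id", "embedding"}


def dynamic_field_size(row, schema_fields=SCHEMA_FIELDS):
    """Approximate serialized size (bytes) of the keys not in the schema."""
    extra = {k: v for k, v in row.items() if k not in schema_fields}
    return len(json.dumps(extra, ensure_ascii=False).encode("utf-8"))


def oversized_rows(rows, limit=MAX_DYNAMIC_FIELD_BYTES):
    """Return (index, size) for each row whose dynamic data exceeds the limit."""
    result = []
    for index, row in enumerate(rows):
        size = dynamic_field_size(row)
        if size > limit:
            result.append((index, size))
    return result


# Illustrative rows only; the key "text" is a made-up example.
rows = [
    {"id": "a", "embedding": [0.1], "text": "x" * 100},
    {"id": "b", "embedding": [0.1], "text": "x" * 400000},
]
flagged = oversized_rows(rows)
```

Printing the largest keys of each flagged row should show which part of the metadata accounts for the 381827 length reported in the error.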