milvus-io / pymilvus

Python SDK for Milvus.
Apache License 2.0

[Bug]: CollectionSchema does not support num_partitions when creating schemas with partition_key_field #2110

Open juanandreas opened 3 months ago

juanandreas commented 3 months ago

Describe the bug

According to this documentation: https://milvus.io/docs/use-partition-key.md, we can define num_partitions as an argument when creating a schema.

schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
    # highlight-next-line
    partition_key_field="color",
    num_partitions=64 # Number of partitions. Defaults to 64.
)

I tried to assign a different value to num_partitions, but I still see 64 partitions when I run

client.list_partitions()

Looking at the source code for CollectionSchema

class CollectionSchema:
    def __init__(self, fields: List, description: str = "", **kwargs):
        self._kwargs = copy.deepcopy(kwargs)
        self._fields = []
        self._description = description
        # if "enable_dynamic_field" is not in kwargs, we keep None here
        self._enable_dynamic_field = self._kwargs.get("enable_dynamic_field", None)
        self._primary_field = None
        self._partition_key_field = None
        self._clustering_key_field = None

There is no field for num_partitions. Is this a bug?
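
To illustrate why the call in the report does not fail, here is a minimal sketch of the pattern visible in the constructor excerpt above. SchemaSketch and used_settings are hypothetical stand-ins written for this illustration, not the pymilvus class. The sketch assumes only what the excerpt shows: extra keyword arguments are copied into _kwargs, and enable_dynamic_field is the only key read back out.

```python
import copy
from typing import Any, Dict, List, Optional


class SchemaSketch:
    """Hypothetical stand-in mirroring the constructor excerpt; not the pymilvus class."""

    def __init__(self, fields: List[Any], description: str = "", **kwargs: Any) -> None:
        # Every extra keyword argument is accepted and stored here.
        self._kwargs = copy.deepcopy(kwargs)
        self._fields = list(fields)
        self._description = description
        # Only this key is read back from the stored keyword arguments.
        self._enable_dynamic_field: Optional[bool] = self._kwargs.get(
            "enable_dynamic_field", None
        )
        self._partition_key_field = None

    def used_settings(self) -> Dict[str, Any]:
        """Return the settings this sketch actually consults."""
        return {"enable_dynamic_field": self._enable_dynamic_field}


schema = SchemaSketch(fields=[], enable_dynamic_field=True, num_partitions=16)
print(schema._kwargs)          # num_partitions is stored...
print(schema.used_settings())  # ...but nothing consults it
```

Because the constructor takes **kwargs, an unrecognized keyword such as num_partitions raises no error in this sketch; it is simply never used, which would match the behavior described in the report.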

SimFG commented 3 months ago

@juanandreas I checked, and this looks like a usage error. You can refer to the following example:

import time
import numpy as np
from pymilvus import (
    MilvusClient,
    DataType
)

fmt = "\n=== {:30} ===\n"
dim = 8
collection_name = "hello_milvus10"
milvus_client = MilvusClient("http://localhost:19530")

has_collection = milvus_client.has_collection(collection_name, timeout=5)
if has_collection:
    milvus_client.drop_collection(collection_name)

schema = milvus_client.create_schema(enable_dynamic_field=True, partition_key_field="num")
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("embeddings", DataType.FLOAT_VECTOR, dim=dim)
schema.add_field("title", DataType.VARCHAR, max_length=64)
schema.add_field("num", DataType.INT64)

index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="embeddings", metric_type="L2")
milvus_client.create_collection(
    collection_name,
    schema=schema,
    index_params=index_params,
    consistency_level="Strong",
    num_partitions=16,  # passed to create_collection(), not create_schema()
)

print(fmt.format("    all collections    "))
print(milvus_client.list_collections())
print(milvus_client.list_partitions(collection_name=collection_name))

output: image

juanandreas commented 3 months ago

Thank you! I think the documentation should clarify that the num_partitions argument must be passed to client.create_collection(), not client.create_schema().

Can you share how to write into this collection using do_bulk_insert? This may be a related bug.

I am currently experiencing:

milvus.do_bulk_insert(collection_name=COLLECTION, files=list_of_files, partition_name=partition_name)

RPC error: [do_bulk_insert], <MilvusException: (code=2100, message=not allow to set partition name for collection with partition key: importing data failed)>

If I exclude partition_name:

milvus.do_bulk_insert(collection_name=COLLECTION, files=list_of_files)

TypeError: do_bulk_insert() missing 1 required positional argument: 'partition_name'

This seems contradictory.
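
The two errors appear to come from different layers: the first is an RPC error returned by the server, while the TypeError is raised by Python before any request is sent. The sketch below reproduces only the second behavior with hypothetical stand-in functions (do_bulk_insert_required, do_bulk_insert_optional, try_call). It assumes, based solely on the TypeError message, that the method being called declares partition_name without a default value. It does not call pymilvus.

```python
from typing import Callable, List, Optional


def do_bulk_insert_required(collection_name: str, partition_name: str, files: List[str]) -> str:
    """Hypothetical stand-in: partition_name has no default, so callers must supply it."""
    return f"collection={collection_name}, partition={partition_name!r}, files={len(files)}"


def do_bulk_insert_optional(
    collection_name: str, files: List[str], partition_name: Optional[str] = None
) -> str:
    """Hypothetical stand-in: partition_name may be omitted."""
    return f"collection={collection_name}, partition={partition_name!r}, files={len(files)}"


def try_call(func: Callable[..., str], **kwargs) -> str:
    """Call func and report a TypeError as text instead of raising it."""
    try:
        return func(**kwargs)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Omitting partition_name fails for the first signature, before any work is done.
print(try_call(do_bulk_insert_required, collection_name="demo", files=["a.json"]))
# The same call succeeds when the parameter has a default.
print(try_call(do_bulk_insert_optional, collection_name="demo", files=["a.json"]))
```

If this reading is right, the contradiction is between a client-side signature that requires partition_name and a server-side rule that rejects it for partition-key collections.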

SimFG commented 3 months ago

Which package does milvus come from in milvus.do_bulk_insert? Is it pymilvus.utility?

juanandreas commented 3 months ago


import pymilvus

milvus = pymilvus.Milvus(host=MILVUS_HOST, port=MILVUS_PORT)
milvus.do_bulk_insert(collection_name=COLLECTION, files=list_of_files, partition_name=partition_name)



I am running pymilvus 2.4.3 and Milvus 2.4.3.

XuanYang-cn commented 2 months ago

/assign