milvus-io / milvus-tools

A data migration tool for Milvus.
Apache License 2.0

Error when reading milvus data for empty collection #32

Closed FT-JMendyk closed 3 years ago

FT-JMendyk commented 3 years ago

Issue description

When reading an empty collection, milvusdm fails with an error saying that the total_vectors variable is referenced before assignment. The error originates in the get_files_data function in read_milvus_data.py: https://github.com/milvus-io/milvus-tools/blob/41143e5f4f5ef3693e4abd903e6b443b98804cb4/pymilvusdm/core/read_milvus_data.py#L55-L80

When either segment_list or row_list is empty, the for loop won't run and thus an attempt is made to return total_vectors and total_ids which haven't been initialized.
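To illustrate the failure mode, here is a minimal sketch. It is not the actual get_files_data code: the function names collect_buggy and collect_fixed and the (vectors, ids) segment shape are made up for this example. It only shows why accumulators that are first assigned inside a loop break on empty input, and how initializing them before the loop avoids that.

```python
def collect_buggy(segments):
    """Hypothetical stand-in: accumulators are first assigned inside the loop."""
    for i, (vectors, ids) in enumerate(segments):
        if i == 0:
            # First assignment happens only if the loop body runs at least once.
            total_vectors, total_ids = list(vectors), list(ids)
        else:
            total_vectors.extend(vectors)
            total_ids.extend(ids)
    # With an empty `segments`, neither name was ever bound, so this raises
    # UnboundLocalError.
    return total_vectors, total_ids


def collect_fixed(segments):
    """Same aggregation, with the accumulators initialized before the loop."""
    total_vectors, total_ids = [], []
    for vectors, ids in segments:
        total_vectors.extend(vectors)
        total_ids.extend(ids)
    return total_vectors, total_ids


if __name__ == "__main__":
    try:
        collect_buggy([])
    except UnboundLocalError as exc:
        print("buggy version failed:", exc)
    print("fixed version returned:", collect_fixed([]))
```

With the accumulators initialized up front, an empty collection or partition simply produces empty results, and the caller can decide whether to skip it.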

I've created a PR with a simple fix/workaround for this issue: https://github.com/milvus-io/milvus-tools/pull/33

Reproduction steps

Using Milvus 1.1.1, pymilvus==1.1.1 and pymilvusdm==2.0.

  1. Create collection using pymilvus:

    from milvus import Milvus

    _DIM = 8  # vector dimension of the test collection
    collection_name = 'example_collection'

    # Connect to a local Milvus 1.1.1 server.
    milvus = Milvus('127.0.0.1', '19530')

    # Create the collection but insert nothing, so it stays empty.
    param = { 'collection_name': collection_name, 'dimension': _DIM }
    milvus.create_collection(param)
    milvus.flush([collection_name])
  2. Prepare configuration YAML M2H.yml:

    M2H:
      milvus_version: 1.1.1
      source_milvus_path: '<SOURCE_MILVUS_PATH>'
      mysql_parameter:
        host: '127.0.0.1'
        user: 'root'
        port: 3306
        password: 'password'
        database: 'milvus'
      source_collection:
        example_collection:
      data_dir: 'backup'
  3. Call milvusdm --yaml M2H.yml:

    <TIMESTAMP> | INFO | milvus_to_hdf5.py | read_milvus_data | 49 | Ready to read all data of collection: example_collection/partitions: [None]
    0%|                                                                                                                      | 0/1 [00:00<?, ?it/s]
    <TIMESTAMP>| ERROR | milvus_to_hdf5.py | read_milvus_data | 56 | Error with: local variable 'total_vectors' referenced before assignment

The same error occurs when a non-default partition is used and contains vectors while the default partition stays empty.

shiyu22 commented 3 years ago

Closed by #33.

Welcome to be a contributor to Milvus-tools :)