milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [null & default] The "nullable" property is lost and data is not "None" when migrating the collection with the nullable fields from 2.x to 2.x #36346

Open binbinlv opened 2 months ago

binbinlv commented 2 months ago

Environment

- Milvus version: from master-latest to master-latest
- Deployment mode(standalone or cluster): both
- MQ type(rocksmq, pulsar or kafka): all
- SDK version(e.g. pymilvus v2.0.0rc2): 2.5.0rc78
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

The "nullable" property is lost and the data is no longer "None" after migrating a collection with nullable fields from 2.x to 2.x.

Before migration:

Does collection hello_milvus2 exist in Milvus: True
{'auto_id': True, 'description': 'hello_milvus2', 'fields': [{'name': 'pk', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': True}, {'name': 'random', 'description': '', 'type': <DataType.DOUBLE: 11>, 'nullable': True}, {'name': 'var', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 65535}, 'nullable': True}, {'name': 'embeddings', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 8}}], 'enable_dynamic_field': False}
Number of entities in Milvus: hello_milvus2 : 6000

=== Start Creating index IVF_FLAT  ===

=== Start loading                  ===

=== Start searching based on vector similarity ===

hit: id: 452625224862197802, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862197802, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862200803, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862200803, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862200391, distance: 0.07805602252483368, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862200391, distance: 0.07805602252483368, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862197803, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862197803, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862200804, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862200804, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862198387, distance: 0.11571306735277176, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862198387, distance: 0.11571306735277176, entity: {'random': None, 'var': None}, var field: None
search latency = 0.3032s

=== Start querying with `random > 0.5` ===

query result:
-{'random': None, 'var': None, 'embeddings': [0.07026907, 0.43795532, 0.72754675, 0.82399035, 0.65299004, 0.2594087, 0.53980845, 0.8232887], 'pk': 452625224862197429}
search latency = 0.7342s

=== Start hybrid searching with `random > 0.5` ===

hit: id: 452625224862197802, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862197802, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862200803, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862200803, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862200391, distance: 0.07805602252483368, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862200391, distance: 0.07805602252483368, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862197803, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862197803, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862200804, distance: 0.0, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862200804, distance: 0.0, entity: {'random': None, 'var': None}, var field: None
hit: id: 452625224862198387, distance: 0.11571306735277176, entity: {'random': None, 'var': None}, random field: None
hit: id: 452625224862198387, distance: 0.11571306735277176, entity: {'random': None, 'var': None}, var field: None
search latency = 0.2668s

=== Drop collection hello_milvus2  ===

After migration:

Does collection hello_milvus2 exist in Milvus: True
{'auto_id': True, 'description': 'hello_milvus2', 'fields': [{'name': 'pk', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': True}, {'name': 'random', 'description': '', 'type': <DataType.DOUBLE: 11>}, {'name': 'var', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 65535}}, {'name': 'embeddings', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 8}}], 'enable_dynamic_field': True}
Number of entities in Milvus: hello_milvus2 : 6000

=== Start Creating index IVF_FLAT  ===

=== Start loading                  ===

=== Start searching based on vector similarity ===

hit: id: 452541625829561138, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829561138, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829564144, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829564144, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829563732, distance: 0.07805602252483368, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829563732, distance: 0.07805602252483368, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829561139, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829561139, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829564145, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829564145, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829561724, distance: 0.11571306735277176, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829561724, distance: 0.11571306735277176, entity: {'random': 0.0, 'var': ''}, var field:
search latency = 0.7000s

=== Start querying with `random > 0.5` ===

query result:
-{'embeddings': [0.07026907, 0.43795532, 0.72754675, 0.82399035, 0.65299004, 0.2594087, 0.53980845, 0.8232887], 'pk': 452541625829560765, 'random': 0.0, 'var': ''}
search latency = 0.8892s

=== Start hybrid searching with `random > 0.5` ===

hit: id: 452541625829561138, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829561138, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829564144, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829564144, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829563732, distance: 0.07805602252483368, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829563732, distance: 0.07805602252483368, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829561139, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829561139, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829564145, distance: 0.0, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829564145, distance: 0.0, entity: {'random': 0.0, 'var': ''}, var field:
hit: id: 452541625829561724, distance: 0.11571306735277176, entity: {'random': 0.0, 'var': ''}, random field: 0.0
hit: id: 452541625829561724, distance: 0.11571306735277176, entity: {'random': 0.0, 'var': ''}, var field:
search latency = 0.2628s

=== Drop collection hello_milvus2  ===

Expected Behavior

The "nullable" property is preserved and data that was "None" is still "None" after migration.
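
The expectation can be phrased as a small post-migration check. The sketch below is only an illustration and not part of the migration tool. It assumes the schemas are available as plain dicts shaped like the `fields` lists printed above, with the `type` values simplified to strings. The helper names `lost_nullable_fields` and `changed_nulls` are hypothetical.

```python
def lost_nullable_fields(before, after):
    """Return names of fields that were nullable before migration but not after."""
    after_by_name = {f["name"]: f for f in after["fields"]}
    lost = []
    for field in before["fields"]:
        if not field.get("nullable", False):
            continue
        migrated = after_by_name.get(field["name"])
        if migrated is None or not migrated.get("nullable", False):
            lost.append(field["name"])
    return lost


def changed_nulls(before_entity, after_entity):
    """Return fields that were None before migration but hold a value after."""
    return {
        name: after_entity.get(name)
        for name, value in before_entity.items()
        if value is None and after_entity.get(name) is not None
    }


# Schemas abbreviated from the output in this report (types written as strings).
before_schema = {"fields": [
    {"name": "pk", "type": "INT64", "is_primary": True, "auto_id": True},
    {"name": "random", "type": "DOUBLE", "nullable": True},
    {"name": "var", "type": "VARCHAR", "params": {"max_length": 65535}, "nullable": True},
    {"name": "embeddings", "type": "FLOAT_VECTOR", "params": {"dim": 8}},
]}
after_schema = {"fields": [
    {"name": "pk", "type": "INT64", "is_primary": True, "auto_id": True},
    {"name": "random", "type": "DOUBLE"},
    {"name": "var", "type": "VARCHAR", "params": {"max_length": 65535}},
    {"name": "embeddings", "type": "FLOAT_VECTOR", "params": {"dim": 8}},
]}

print(lost_nullable_fields(before_schema, after_schema))  # ['random', 'var']
print(changed_nulls({"random": None, "var": None},
                    {"random": 0.0, "var": ""}))  # {'random': 0.0, 'var': ''}
```

With the expected behavior, both calls would return empty results for the migrated collection.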

Steps To Reproduce

1. Prepare a collection with nullable fields and insert rows whose nullable fields are "None".
2. Migrate the collection from 2.x to 2.x, following https://milvus.io/docs/from-m2x.md
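
The original script is not attached to this report, so step 1 can only be approximated. The sketch below assumes the `MilvusClient` API of pymilvus 2.5 and mirrors the schema printed above. The `uri` argument is a placeholder, and `build_rows` and `create_and_insert` are hypothetical helper names. The migration itself (step 2) is done with the external tool and is not shown.

```python
import random


def build_rows(n, dim=8, seed=0):
    """Build n rows whose nullable fields ('random', 'var') are all None."""
    rng = random.Random(seed)
    return [
        {
            "random": None,
            "var": None,
            "embeddings": [rng.random() for _ in range(dim)],
        }
        for _ in range(n)
    ]


def create_and_insert(uri, name="hello_milvus2", n=6000):
    """Create a collection with nullable fields and insert None data (assumed API usage)."""
    # Imported here so that build_rows can be used without pymilvus installed.
    from pymilvus import DataType, MilvusClient

    client = MilvusClient(uri=uri)  # uri is a placeholder for the source instance
    schema = MilvusClient.create_schema(auto_id=True, enable_dynamic_field=False)
    schema.add_field("pk", DataType.INT64, is_primary=True)
    schema.add_field("random", DataType.DOUBLE, nullable=True)
    schema.add_field("var", DataType.VARCHAR, max_length=65535, nullable=True)
    schema.add_field("embeddings", DataType.FLOAT_VECTOR, dim=8)
    client.create_collection(collection_name=name, schema=schema)
    client.insert(collection_name=name, data=build_rows(n))
    return client
```

Running `create_and_insert` against the source instance before migration should give a collection comparable to the "Before migration" output, though the index and search steps from the report are omitted here.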

Milvus Log

No response

Anything else?

No response

tedxu commented 1 month ago

The migration tool needs an upgrade, per offline discussion with @smellthemoon.

binbinlv commented 3 weeks ago

Any progress here?

binbinlv commented 3 weeks ago

/assign @wenhuiZilliz

sre-ci-robot commented 3 weeks ago

@binbinlv: GitHub didn't allow me to assign the following users: wenhuiZilliz.

Note that only milvus-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to [this](https://github.com/milvus-io/milvus/issues/36346#issuecomment-2463653019):

> /assign @wenhuiZilliz

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

sre-ci-robot commented 3 weeks ago

@xiaofan-luan: GitHub didn't allow me to assign the following users: yelusion2.

Note that only milvus-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to [this](https://github.com/milvus-io/milvus/issues/36346#issuecomment-2465569394):

> /assign @yelusion2
>
> please help on it

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.