Closed zhuwenxing closed 1 year ago
/assign @jiaoew1991 /unassign
/assign @chyezh
It seems there was no data loss after reinstallation. All data had been flushed, so the problem cannot be caused by growing segments.
The problem may lie in the computation logic for some special input. I will try to reproduce it.
version: 2.2.0-20230612-ae2fe478
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.198 | INFO | MainThread |utils:load_and_search:206 - collection name: task_1_IVF_FLAT
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.198 | INFO | MainThread |utils:load_and_search:207 - load collection
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.203 | INFO | MainThread |utils:load_and_search:211 - load time: 0.0050
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.216 | INFO | MainThread |utils:load_and_search:225 - {'metric_type': 'L2', 'params': {'nprobe': 10}}
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.216 | INFO | MainThread |utils:load_and_search:228 -
[2023-06-12T13:05:57.358Z] Search...
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.220 | INFO | MainThread |utils:load_and_search:239 - hit: id: 976, distance: 29.795345306396484, entity: {'count': 976, 'random_value': -15.0}
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.221 | INFO | MainThread |utils:load_and_search:239 - hit: id: 766, distance: 30.546741485595703, entity: {'count': 766, 'random_value': -11.0}
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.221 | INFO | MainThread |utils:load_and_search:239 - hit: id: 2403, distance: 31.58251953125, entity: {'count': 2403, 'random_value': -17.0}
[2023-06-12T13:05:57.358Z] 2023-06-12 13:05:57.221 | INFO | MainThread |utils:load_and_search:239 - hit: id: 2486, distance: 32.51908874511719, entity: {'count': 2486, 'random_value': -12.0}
[2023-06-12T13:05:57.358Z] Traceback (most recent call last):
[2023-06-12T13:05:57.358Z] File "scripts/action_after_reinstall.py", line 46, in <module>
[2023-06-12T13:05:57.358Z] task_1(data_size, host)
[2023-06-12T13:05:57.358Z] File "scripts/action_after_reinstall.py", line 14, in task_1
[2023-06-12T13:05:57.358Z] load_and_search(prefix)
[2023-06-12T13:05:57.358Z] File "/home/jenkins/agent/workspace/tests/python_client/deploy/scripts/utils.py", line 241, in load_and_search
[2023-06-12T13:05:57.358Z] assert len(ids) == topK, f"get {len(ids)} results, but topK is {topK}"
[2023-06-12T13:05:57.358Z] AssertionError: get 4 results, but topK is 5
log:
artifacts-kafka-standalone-reinstall-1044-pytest-logs.tar.gz
[Uploading artifacts-kafka-standalone-reinstall-1044-server-first-deployment-logs.tar.gz…]()
artifacts-kafka-standalone-reinstall-1044-server-second-deployment-logs.tar.gz
/assign @congqixia please take a look. The search or query result is partial.
It was reproduced again with image tag 2.2.0-20230707-511173a0
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_kafka_for_release_cron/detail/deploy_test_kafka_for_release_cron/1179/pipeline
log:
artifacts-kafka-cluster-reinstall-1179-pytest-logs.tar.gz artifacts-kafka-cluster-reinstall-1179-server-first-deployment-logs.tar.gz artifacts-kafka-cluster-reinstall-1179-server-second-deployment-logs.tar.gz
Is it possible that, with IVF_FLAT, 10 vectors were recalled from the 10 probed clusters of the IVF index, but 6 of them were then filtered out by the expression count > 500?
The search vector is [1,1,1,1,....], which lies at a corner of the vector space.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
@zhuwenxing @chyezh any updates?
image: 2.3.0-20230918-dde27711-amd64
[2023-09-18T13:38:15.243Z] 2023-09-18 13:38:15.140 | INFO | MainThread |utils:load_and_search:259 - ###########
[2023-09-18T13:38:15.243Z] 2023-09-18 13:38:15.143 | INFO | MainThread |utils:load_and_search:206 - collection name: task_2_IVF_FLAT
[2023-09-18T13:38:15.243Z] 2023-09-18 13:38:15.143 | INFO | MainThread |utils:load_and_search:207 - load collection
[2023-09-18T13:38:19.400Z] 2023-09-18 13:38:19.232 | INFO | MainThread |utils:load_and_search:211 - load time: 4.0887
[2023-09-18T13:38:19.400Z] 2023-09-18 13:38:19.243 | INFO | MainThread |utils:load_and_search:225 - {'metric_type': 'L2', 'params': {'nprobe': 10}}
[2023-09-18T13:38:19.400Z] 2023-09-18 13:38:19.243 | INFO | MainThread |utils:load_and_search:228 -
[2023-09-18T13:38:19.400Z] Search...
[2023-09-18T13:38:19.655Z] 2023-09-18 13:38:19.423 | INFO | MainThread |utils:load_and_search:239 - hit: id: 764, distance: 30.432262420654297, entity: {'count': 764, 'random_value': -18.0}
[2023-09-18T13:38:19.655Z] 2023-09-18 13:38:19.423 | INFO | MainThread |utils:load_and_search:239 - hit: id: 2455, distance: 31.647565841674805, entity: {'count': 2455, 'random_value': -17.0}
[2023-09-18T13:38:19.655Z] 2023-09-18 13:38:19.423 | INFO | MainThread |utils:load_and_search:239 - hit: id: 2424, distance: 32.878353118896484, entity: {'count': 2424, 'random_value': -17.0}
[2023-09-18T13:38:19.655Z] 2023-09-18 13:38:19.423 | INFO | MainThread |utils:load_and_search:239 - hit: id: 2737, distance: 33.31123733520508, entity: {'count': 2737, 'random_value': -14.0}
[2023-09-18T13:38:19.655Z] Traceback (most recent call last):
[2023-09-18T13:38:19.655Z] File "scripts/action_after_reinstall.py", line 47, in <module>
[2023-09-18T13:38:19.655Z] task_2(data_size, host)
[2023-09-18T13:38:19.655Z] File "scripts/action_after_reinstall.py", line 33, in task_2
[2023-09-18T13:38:19.655Z] load_and_search(prefix)
[2023-09-18T13:38:19.655Z] File "/home/jenkins/agent/workspace/tests/python_client/deploy/scripts/utils.py", line 241, in load_and_search
[2023-09-18T13:38:19.655Z] assert len(ids) == topK, f"get {len(ids)} results, but topK is {topK}"
[2023-09-18T13:38:19.655Z] AssertionError: get 4 results, but topK is 5
log: artifacts-kafka-standalone-reinstall-1450-pytest-logs.tar.gz artifacts-kafka-standalone-reinstall-1450-server-first-deployment-logs.tar.gz artifacts-kafka-standalone-reinstall-1450-server-second-deployment-logs.tar.gz
failed again
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_kafka_for_release_cron/detail/deploy_test_kafka_for_release_cron/1446/pipeline
log:
artifacts-kafka-standalone-reinstall-1446-pytest-logs (1).tar.gz
artifacts-kafka-standalone-reinstall-1446-server-first-deployment-logs (1).tar.gz
artifacts-kafka-standalone-reinstall-1446-server-second-deployment-logs (1).tar.gz
[2023-09-18T13:38:15.243Z] 2023-09-18 13:38:15.140 | INFO | MainThread |utils:load_and_search:257 - query latency: 0.0047s
I have reproduced the same problem with rocksmq in a no-chaos environment.
In these test cases, 3000 new vectors are always inserted after reinstallation with the same primary keys (field count) as the existing vectors.
At search time there is one segment. Several vectors with the same primary key in the IVF index were returned from this segment and were deduplicated at reduce time. This is expected behavior under the current Milvus implementation, not a bug. Please modify the test cases to avoid duplicate primary keys.
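To make the effect concrete, here is a rough pure-Python model of the deduplication described above. It is only a sketch and not Milvus's actual reduce code: `reduce_hits`, `fresh_primary_keys`, the candidate values, and the assumed existing row count of 3000 are all made up for illustration. It shows how a segment can return topK candidates and still end up with fewer than topK results once duplicate primary keys are collapsed, and one possible way for the test to generate non-overlapping keys.

```python
from typing import Dict, List, Tuple

Hit = Tuple[int, float]  # (primary key, distance); illustrative type only


def reduce_hits(hits: List[Hit], top_k: int) -> List[Hit]:
    """Toy reduce step: keep the closest hit per primary key, then cut to top_k."""
    best: Dict[int, float] = {}
    for pk, dist in hits:
        if pk not in best or dist < best[pk]:
            best[pk] = dist
    return sorted(best.items(), key=lambda item: item[1])[:top_k]


def fresh_primary_keys(existing_count: int, new_count: int) -> List[int]:
    """Hypothetical test-side fix: continue numbering after the existing keys."""
    return list(range(existing_count, existing_count + new_count))


# Made-up candidates: the segment returns top_k = 5 hits, but pk 1 occurs twice.
candidates = [(1, 29.0), (1, 29.0), (2, 30.0), (3, 31.0), (4, 32.0)]
result = reduce_hits(candidates, top_k=5)
print(len(result))  # 4: one duplicate was dropped, so fewer than top_k remain

# Assumed sizes: 3000 existing rows keyed 0..2999, 3000 new rows.
new_keys = fresh_primary_keys(existing_count=3000, new_count=3000)
print(new_keys[0], new_keys[-1])  # 3000 5999
```

With non-overlapping keys like these, every candidate returned by the segment has a distinct primary key, so in this toy model the reduce step no longer drops any hits and the `len(ids) == topK` assertion would hold.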
/assign @zhuwenxing /unassign
Is there an existing issue for this?
Environment
Current Behavior
Expected Behavior
Steps To Reproduce
No response
Milvus Log
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_kafka_for_release_cron/detail/deploy_test_kafka_for_release_cron/993/pipeline
log:
artifacts-kafka-cluster-reinstall-993-server-first-deployment-logs.tar.gz
artifacts-kafka-cluster-reinstall-993-server-second-deployment-logs.tar.gz
artifacts-kafka-cluster-reinstall-993-pytest-logs.tar.gz
Anything else?
No response