Closed elstic closed 1 year ago
- Milvus version: 2.2.0-20221121-efa1cf7f
- Deployment mode (standalone or cluster): cluster
- SDK version (e.g. pymilvus v2.0.0rc2): 2.2.0.dev72
server-instance: fouram-tag-no-clean-p4l24-1
server-configmap: server-cluster-8c64m-querynode2
client-configmap: client-random-locust-100m-hnsw-ddl-r8-w2-60h-con
server:
fouram-tag-no-clean-p4l24-1-etcd-0 1/1 Running 0 5m54s 10.104.9.169 4am-node14 <none> <none>
fouram-tag-no-clean-p4l24-1-etcd-1 1/1 Running 0 5m54s 10.104.6.251 4am-node13 <none> <none>
fouram-tag-no-clean-p4l24-1-etcd-2 1/1 Running 0 5m54s 10.104.5.139 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-datacoord-78c96cdbb5-vq454 1/1 Running 1 (113s ago) 5m54s 10.104.5.128 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-datanode-5d9bff85f8-dfbvl 1/1 Running 1 (113s ago) 5m54s 10.104.4.224 4am-node11 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-indexcoord-74bbc6b5f8-xfkm4 1/1 Running 1 (113s ago) 5m54s 10.104.4.225 4am-node11 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-indexnode-8b56f79ff-cnt6n 1/1 Running 0 5m54s 10.104.5.134 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-proxy-75df4f4864-mdkqr 1/1 Running 1 (112s ago) 5m54s 10.104.5.129 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-querycoord-78b6cc44c7-qhvct 1/1 Running 1 (112s ago) 5m54s 10.104.5.135 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-querynode-5f6786db4f-dmppg 1/1 Running 0 5m54s 10.104.6.247 4am-node13 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-querynode-5f6786db4f-jdxr2 1/1 Running 0 5m54s 10.104.5.127 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-milvus-rootcoord-5784b59f-cjdzs 1/1 Running 1 (2m22s ago) 5m54s 10.104.5.132 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-minio-0 1/1 Running 0 5m54s 10.104.6.249 4am-node13 <none> <none>
fouram-tag-no-clean-p4l24-1-minio-1 1/1 Running 0 5m54s 10.104.9.167 4am-node14 <none> <none>
fouram-tag-no-clean-p4l24-1-minio-2 1/1 Running 0 5m54s 10.104.5.138 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-minio-3 1/1 Running 0 5m54s 10.104.4.226 4am-node11 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-bookie-0 1/1 Running 0 5m54s 10.104.6.248 4am-node13 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-bookie-1 1/1 Running 0 5m54s 10.104.9.168 4am-node14 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-bookie-2 1/1 Running 0 5m54s 10.104.1.242 4am-node10 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-bookie-init-6mtvd 0/1 Completed 0 5m54s 10.104.5.137 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-broker-0 1/1 Running 0 5m54s 10.104.5.131 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-proxy-0 1/1 Running 0 5m54s 10.104.4.223 4am-node11 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-pulsar-init-n7nxx 0/1 Completed 0 5m54s 10.104.5.136 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-recovery-0 1/1 Running 0 5m54s 10.104.5.130 4am-node12 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-zookeeper-0 1/1 Running 0 5m54s 10.104.6.250 4am-node13 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-zookeeper-1 1/1 Running 0 5m18s 10.104.4.227 4am-node11 <none> <none>
fouram-tag-no-clean-p4l24-1-pulsar-zookeeper-2 1/1 Running 0 4m48s 10.104.9.170 4am-node14 <none> <none>
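Several coordinator pods in the listing above show one restart, which matters for the later analysis of a query coord restart. A minimal sketch for picking those pods out of such a listing, assuming whitespace-separated columns in the order name, ready, status, restarts; the `restarted_pods` helper is hypothetical and not part of any Milvus or Kubernetes tooling:

```python
def restarted_pods(lines):
    """Map pod name -> restart count for pods that restarted at least once.

    Assumes each line is whitespace-separated with the pod name in the first
    column and the restart count in the fourth, as in the listing above.
    """
    result = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or not fields[3].isdigit():
            continue  # skip blank or unexpected lines
        restarts = int(fields[3])
        if restarts > 0:
            result[fields[0]] = restarts
    return result


# Three lines copied from the pod listing in this report.
sample_pods = [
    "fouram-tag-no-clean-p4l24-1-etcd-0 1/1 Running 0 5m54s 10.104.9.169 4am-node14 <none> <none>",
    "fouram-tag-no-clean-p4l24-1-milvus-querycoord-78b6cc44c7-qhvct 1/1 Running 1 (112s ago) 5m54s 10.104.5.135 4am-node12 <none> <none>",
    "fouram-tag-no-clean-p4l24-1-milvus-querynode-5f6786db4f-dmppg 1/1 Running 0 5m54s 10.104.6.247 4am-node13 <none> <none>",
]

print(restarted_pods(sample_pods))
```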
client log:
[2022-11-21 16:09:11,276] [ DEBUG] - 0 users have been stopped, 2 still running (locust.runners:281)
[2022-11-21 16:09:11,282] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:11.276462', 'RPC error': '2022-11-21 16:09:11.282228'}> (pymilvus.decorators:108)
[2022-11-21 16:09:11,283] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:11.277090', 'RPC error': '2022-11-21 16:09:11.283167'}> (pymilvus.decorators:108)
[2022-11-21 16:09:11,285] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:11.283056', 'RPC error': '2022-11-21 16:09:11.285709'}> (pymilvus.decorators:108)
[2022-11-21 16:09:11,286] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:11.283569', 'RPC error': '2022-11-21 16:09:11.286583'}> (pymilvus.decorators:108)
[2022-11-21 16:09:11,289] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:11.286489', 'RPC error': '2022-11-21 16:09:11.289105'}> (pymilvus.decorators:108)
[2022-11-21 16:09:11,290] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:11.286901', 'RPC error': '2022-11-21 16:09:11.290052'}> (pymilvus.decorators:108)
[2022-11-21 16:09:11,322] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:11.289957', 'RPC error': '2022-11-21 16:09:11.322008'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,276] [ DEBUG] - Ramping to {"MyUser": 4} (4 total users) (locust.runners:341)
[2022-11-21 16:09:12,276] [ DEBUG] - Spawning additional {"MyUser": 2} ({"MyUser": 2} already running)... (locust.runners:206)
[2022-11-21 16:09:12,276] [ DEBUG] - 4 users spawned (locust.runners:220)
[2022-11-21 16:09:12,276] [ DEBUG] - All users of class MyUser spawned (locust.runners:221)
[2022-11-21 16:09:12,277] [ DEBUG] - 0 users have been stopped, 4 still running (locust.runners:281)
[2022-11-21 16:09:12,312] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.277762', 'RPC error': '2022-11-21 16:09:12.312278'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,315] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:12.312968', 'RPC error': '2022-11-21 16:09:12.315664'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,318] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.315962', 'RPC error': '2022-11-21 16:09:12.318737'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,321] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.318933', 'RPC error': '2022-11-21 16:09:12.321608'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,323] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.321790', 'RPC error': '2022-11-21 16:09:12.323695'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,325] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:12.323873', 'RPC error': '2022-11-21 16:09:12.325701'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,327] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:12.325879', 'RPC error': '2022-11-21 16:09:12.327646'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,330] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.327819', 'RPC error': '2022-11-21 16:09:12.330014'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,332] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.330492', 'RPC error': '2022-11-21 16:09:12.332711'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,334] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:12.332894', 'RPC error': '2022-11-21 16:09:12.334780'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,336] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:12.334951', 'RPC error': '2022-11-21 16:09:12.336823'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,339] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:12.336998', 'RPC error': '2022-11-21 16:09:12.338999'}> (pymilvus.decorators:108)
[2022-11-21 16:09:12,339] [ DEBUG] - [scene_test] Start scene test : scene_test_9613_331926 (milvus_benchmark.client:634)
[2022-11-21 16:09:12,362] [ INFO] - Create collection: <scene_test_9613_331926> successfully (milvus_benchmark.client:158)
[2022-11-21 16:09:12,362] [ DEBUG] - Milvus create_collection run in 0.0233s (milvus_benchmark.client:57)
[2022-11-21 16:09:13,277] [ DEBUG] - Ramping to {"MyUser": 6} (6 total users) (locust.runners:341)
[2022-11-21 16:09:13,278] [ DEBUG] - Spawning additional {"MyUser": 2} ({"MyUser": 4} already running)... (locust.runners:206)
[2022-11-21 16:09:13,278] [ DEBUG] - 6 users spawned (locust.runners:220)
[2022-11-21 16:09:13,278] [ DEBUG] - All users of class MyUser spawned (locust.runners:221)
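To quantify the failures in an excerpt like the one above, the RPC errors can be tallied per operation. This is a minimal sketch, assuming the log line layout shown in this report; the regular expression and the `count_not_loaded` helper are illustrative and not part of pymilvus or milvus_benchmark:

```python
import re
from collections import Counter

# Matches pymilvus RPC error lines of the form seen in this report:
# "RPC error: [<op>], <MilvusException: (code=<n>, message=<text>)>"
RPC_ERROR = re.compile(
    r"RPC error: \[(?P<op>\w+)\], <MilvusException: \(code=(?P<code>\d+), "
    r"message=(?P<msg>.*?)\)>"
)


def count_not_loaded(lines):
    """Count 'not loaded into memory' RPC errors per operation name."""
    counts = Counter()
    for line in lines:
        match = RPC_ERROR.search(line)
        if match and "not loaded into memory" in match.group("msg"):
            counts[match.group("op")] += 1
    return counts


# Three lines copied from the client log in this report.
sample_log = [
    "[2022-11-21 16:09:11,276] [ DEBUG] - 0 users have been stopped, 2 still running (locust.runners:281)",
    "[2022-11-21 16:09:11,282] [ ERROR] - RPC error: [search], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when search)>, <Time:{'RPC start': '2022-11-21 16:09:11.276462', 'RPC error': '2022-11-21 16:09:11.282228'}> (pymilvus.decorators:108)",
    "[2022-11-21 16:09:11,285] [ ERROR] - RPC error: [query], <MilvusException: (code=1, message=collection:sift_100m_128_l2 or partition:[] not loaded into memory when query)>, <Time:{'RPC start': '2022-11-21 16:09:11.283056', 'RPC error': '2022-11-21 16:09:11.285709'}> (pymilvus.decorators:108)",
]

print(dict(count_not_loaded(sample_log)))
```

Running the same helper over the complete client log would show whether search and query fail at similar rates, which is what one would expect if the whole collection is unloaded.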
Expected Behavior: No response
1. Create a collection.
2. Build an HNSW index.
3. Insert 100M vectors.
4. Build the index again.
5. Load the collection.
6. Run search, load, query, and scene_test concurrently ==> errors are raised.
Complete client log: main-logs2.txt.zip
client-random-locust-100m-hnsw-ddl-r8-w2-60h-con
locust_random_concurrent_performance:
  collections:
    -
      collection_name: sift_100m_128_l2
      ni_per: 50000
      build_index: true
      index_type: hnsw
      index_param:
        M: 8
        efConstruction: 200
      task:
        types:
          -
            type: query
            weight: 8
            params:
              top_k: 10
              nq: 10
              search_param:
                ef: 16
          -
            type: load
            weight: 1
          -
            type: get
            weight: 8
            params:
              ids_length: 10
          -
            type: scene_test
            weight: 2
        connection_num: 1
        clients_num: 20
        spawn_rate: 2
        # during_time: 1h
        during_time: 12h
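The weights in the configuration above determine how often each task type runs. A rough sketch of the expected request mix, assuming the benchmark hands these weights to Locust unchanged and that tasks are picked in proportion to their weight; the `task_shares` helper is hypothetical:

```python
# Task weights copied from the `types` list in the configuration above.
weights = {"query": 8, "load": 1, "get": 8, "scene_test": 2}


def task_shares(weights):
    """Return each task's expected share, assuming weight-proportional selection."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


shares = task_shares(weights)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, `load` accounts for roughly 1 in 19 tasks, so load calls are regularly interleaved with search and query traffic during the run.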
datanode:
/assign @jiaoew1991 /unassign
/assign @weiliu1031 /unassign
Background: there is a collection, sift_100m_128_l2, which is automatically loaded when query coord restarts.
Problem: after query coord restarted, it saw three query nodes, A, B, and C, with A assigned to replica_1 and B and C assigned to replica_2. While the segments of sift_100m_128_l2 were being loaded, query node A went down, leaving no query node available in replica_1, so loading can never succeed.
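The failure mode described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Milvus's query coord logic: the `replicas` mapping, the node names, and the `can_load` helper are hypothetical, and it assumes a replica can only load segments onto live query nodes from its own group.

```python
def can_load(replicas, live_nodes):
    """Return, per replica, whether at least one of its assigned nodes is alive."""
    return {
        name: any(node in live_nodes for node in nodes)
        for name, nodes in replicas.items()
    }


# Assignment seen by query coord after its restart, as described in the comment.
replicas = {"replica_1": ["A"], "replica_2": ["B", "C"]}

before = can_load(replicas, live_nodes={"A", "B", "C"})
after = can_load(replicas, live_nodes={"B", "C"})  # query node A goes down

print("before A goes down:", before)
print("after A goes down: ", after)
```

In this model replica_1 stays unloadable until some node is reassigned to it, which is consistent with the observation that loading never completes.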
need fix:
need improve:
more discuss:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
@elstic is this still an issue
This issue did not occur in version 2.2.3, so I will close it.
/close
@elstic: Closing this issue.
Anything else?
client-random-locust-concurrent-replica2-search-100m-ddl