Open wangting0128 opened 1 week ago
@wangting0128 same as https://github.com/milvus-io/milvus/issues/37553, please verify it.
argo task: memory-opt-scenes-7w2vb image: master-20241111-fca946de-amd64
argo task: memory-opt-scenes-2x7j4 test case name: test_inverted_locust_hnsw_diskann_dml_dql_cluster image: master-20241114-cd181e4c-amd64
server:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
memory-opt-scenes-2x7j4-3-etcd-0 1/1 Running 0 173m 10.104.34.97 4am-node37 <none> <none>
memory-opt-scenes-2x7j4-3-etcd-1 1/1 Running 0 173m 10.104.19.85 4am-node28 <none> <none>
memory-opt-scenes-2x7j4-3-etcd-2 1/1 Running 0 173m 10.104.23.153 4am-node27 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-datanode-5fbf6cf547-qkrst 1/1 Running 1 (173m ago) 173m 10.104.6.199 4am-node13 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-indexnode-79bb49f75d-tf6xd 1/1 Running 1 (173m ago) 173m 10.104.32.65 4am-node39 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-indexnode-79bb49f75d-z7hc4 1/1 Running 2 (173m ago) 173m 10.104.30.110 4am-node38 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-indexnode-79bb49f75d-z89d8 1/1 Running 2 (173m ago) 173m 10.104.15.141 4am-node20 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-indexnode-79bb49f75d-zq4tc 1/1 Running 2 (173m ago) 173m 10.104.20.123 4am-node22 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-mixcoord-57ccd4b99c-f77lt 1/1 Running 2 (173m ago) 173m 10.104.30.111 4am-node38 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-proxy-688997675f-vm9hl 1/1 Running 2 (173m ago) 173m 10.104.30.112 4am-node38 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-querynode-749f88656d-l2dnl 1/1 Running 2 (173m ago) 173m 10.104.14.95 4am-node18 <none> <none>
memory-opt-scenes-2x7j4-3-milvus-querynode-749f88656d-rmnd7 1/1 Running 2 (173m ago) 173m 10.104.9.247 4am-node14 <none> <none>
memory-opt-scenes-2x7j4-3-minio-0 1/1 Running 0 173m 10.104.24.14 4am-node29 <none> <none>
memory-opt-scenes-2x7j4-3-minio-1 1/1 Running 0 173m 10.104.34.95 4am-node37 <none> <none>
memory-opt-scenes-2x7j4-3-minio-2 1/1 Running 0 173m 10.104.19.86 4am-node28 <none> <none>
memory-opt-scenes-2x7j4-3-minio-3 1/1 Running 0 173m 10.104.18.179 4am-node25 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-bookie-0 1/1 Running 0 173m 10.104.24.16 4am-node29 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-bookie-1 1/1 Running 0 173m 10.104.21.132 4am-node24 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-bookie-2 1/1 Running 0 173m 10.104.34.100 4am-node37 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-bookie-init-49bw9 0/1 Completed 0 173m 10.104.18.177 4am-node25 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-broker-0 1/1 Running 0 173m 10.104.18.176 4am-node25 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-proxy-0 1/1 Running 0 173m 10.104.21.129 4am-node24 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-pulsar-init-pqsmj 0/1 Completed 0 173m 10.104.18.175 4am-node25 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-recovery-0 1/1 Running 0 173m 10.104.5.203 4am-node12 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-zookeeper-0 1/1 Running 0 173m 10.104.34.94 4am-node37 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-zookeeper-1 1/1 Running 0 172m 10.104.23.155 4am-node27 <none> <none>
memory-opt-scenes-2x7j4-3-pulsar-zookeeper-2 1/1 Running 0 172m 10.104.19.91 4am-node28 <none> <none>
client logs: search, hybrid_search, and query all raise errors
[2024-11-14 05:14:16,286 - ERROR - fouram]: RPC error: [search], <MilvusException: (code=503, message=failed to search: segment lacks[segment=453917219271353555]: channel not available[channel=by-dev-rootcoord-dml_1_453917219263349003v1])>, <Time:{'RPC start': '2024-11-14 05:13:55.264303', 'RPC error': '2024-11-14 05:14:16.286368'}> (decorators.py:140)
[2024-11-14 05:14:16,286 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=503, message=failed to search: segment lacks[segment=453917219271353555]: channel not available[channel=by-dev-rootcoord-dml_1_453917219263349003v1])>, [requestId: 3eca957c-a247-11ef-8ab0-d63d32d0e24a] (api_request.py:57)
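For triage, the `(api_response)` lines above can be reduced to the API, error code, segment ID, and channel name. The following is a minimal sketch using only the Python standard library; the regular expression and the `parse_channel_error` helper are assumptions written for this report's log format, not part of pymilvus or the test framework.

```python
import re

# Pattern for the "(api_response)" error text shown in this report.
# The group names are illustrative and not part of any Milvus API.
ERROR_RE = re.compile(
    r"\[(?P<api>Collection\.\w+)\]\s+<MilvusException: \(code=(?P<code>\d+), "
    r"message=failed to (?P<op>\w+): segment lacks\[segment=(?P<segment>\d+)\]: "
    r"channel not available\[channel=(?P<channel>[^\]]+)\]\)>"
)


def parse_channel_error(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared or sorted.
    fields["code"] = int(fields["code"])
    fields["segment"] = int(fields["segment"])
    return fields
```

Grouping the parsed results by `channel` should make it easy to see whether the failures are tied to a single DML channel or to several.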
test steps:
concurrent test and calculation of RT and QPS
:purpose: `vector: memory and disk index`
verify concurrent DML & DQL scenario which has 4 float_vector fields & 16 scalar fields
:test steps:
1. create collection with fields:
'float_vector': 128dim,
'float_vector_1': 128dim,
'float_vector_2': 200dim,
'float_vector_3': 200dim,
'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1', 'bool_1',
'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2'
2. build indexes:
HNSW: 'float_vector'
DISKANN_IP: 'float_vector_1'
HNSW: 'float_vector_2'
DISKANN_L2: 'float_vector_3'
scalar_default_index: 'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1'
scalar_INVERTED_index: 'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2'
3. insert 5 million rows
4. flush collection
5. build indexes again using the same params
6. load collection
7. concurrent request:
- insert
- delete
- flush
- load
- search
- hybrid_search
- query
(base.py:44)
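The field and index layout in the steps above can be written down as plain data, which helps when checking which fields end up with which index. This is only a sketch: it does not call Milvus, the `"default"` label is a placeholder for whatever the test's `scalar_default_index` resolves to, and the HNSW metric type is left out because the steps do not state it.

```python
# Plain-data sketch of the layout described in the test steps (no Milvus client needed).
VECTOR_FIELDS = {
    "float_vector": 128,
    "float_vector_1": 128,
    "float_vector_2": 200,
    "float_vector_3": 200,
}
SCALAR_TYPES = ["int8", "int16", "int32", "int64", "double", "float", "varchar", "bool"]
# Two groups of eight scalar fields: *_1 and *_2.
SCALAR_FIELDS = [f"{t}_{n}" for n in (1, 2) for t in SCALAR_TYPES]


def build_index_plan():
    """Map each indexed field to the index named in step 2."""
    plan = {
        "float_vector": {"index_type": "HNSW"},
        "float_vector_1": {"index_type": "DISKANN", "metric_type": "IP"},
        "float_vector_2": {"index_type": "HNSW"},
        "float_vector_3": {"index_type": "DISKANN", "metric_type": "L2"},
    }
    for name in SCALAR_FIELDS:
        if name.endswith("_2"):
            plan[name] = {"index_type": "INVERTED"}
        elif name != "bool_1":
            # bool_1 is not listed under scalar_default_index in the steps.
            plan[name] = {"index_type": "default"}  # placeholder label
    return plan
```

Note that, going by the lists in step 2, `bool_1` is the only field without an index.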
@sunby please help check this. /assign @sunby
argo task: inverted-corn-1731618000 test case name: test_inverted_locust_hnsw_ivf_sq8_dml_dql_cluster image: master-20241114-1d06d432-amd64
server:
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
inverted-corn-118000-7-59-1335-etcd-0 1/1 Running 0 3h48m 10.104.25.232 4am-node30 <none> <none>
inverted-corn-118000-7-59-1335-etcd-1 1/1 Running 0 3h48m 10.104.19.130 4am-node28 <none> <none>
inverted-corn-118000-7-59-1335-etcd-2 1/1 Running 0 3h48m 10.104.24.120 4am-node29 <none> <none>
inverted-corn-118000-7-59-1335-milvus-datanode-5d5f54b44f-t2pxx 1/1 Running 1 (3h44m ago) 3h48m 10.104.14.230 4am-node18 <none> <none>
inverted-corn-118000-7-59-1335-milvus-indexnode-5978dc656b6gclb 1/1 Running 1 (3h48m ago) 3h48m 10.104.13.26 4am-node16 <none> <none>
inverted-corn-118000-7-59-1335-milvus-indexnode-5978dc656bccfvp 1/1 Running 1 (3h48m ago) 3h48m 10.104.6.180 4am-node13 <none> <none>
inverted-corn-118000-7-59-1335-milvus-mixcoord-598668594f-ltf7w 1/1 Running 1 (3h48m ago) 3h48m 10.104.6.178 4am-node13 <none> <none>
inverted-corn-118000-7-59-1335-milvus-proxy-58684d8f59-ffc2h 1/1 Running 1 (3h48m ago) 3h48m 10.104.6.181 4am-node13 <none> <none>
inverted-corn-118000-7-59-1335-milvus-querynode-75bb9f596dwlh4l 1/1 Running 1 (3h48m ago) 3h48m 10.104.6.179 4am-node13 <none> <none>
inverted-corn-118000-7-59-1335-minio-0 1/1 Running 0 3h48m 10.104.25.231 4am-node30 <none> <none>
inverted-corn-118000-7-59-1335-minio-1 1/1 Running 0 3h48m 10.104.19.129 4am-node28 <none> <none>
inverted-corn-118000-7-59-1335-minio-2 1/1 Running 0 3h48m 10.104.24.125 4am-node29 <none> <none>
inverted-corn-118000-7-59-1335-minio-3 1/1 Running 0 3h48m 10.104.16.212 4am-node21 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-bookie-0 1/1 Running 0 3h48m 10.104.32.63 4am-node39 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-bookie-1 1/1 Running 0 3h48m 10.104.23.48 4am-node27 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-bookie-2 1/1 Running 0 3h48m 10.104.24.126 4am-node29 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-bookie-init-6x8hw 0/1 Completed 0 3h48m 10.104.25.225 4am-node30 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-broker-0 1/1 Running 0 3h48m 10.104.25.226 4am-node30 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-proxy-0 1/1 Running 0 3h48m 10.104.14.229 4am-node18 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-pulsar-init-mqb7p 0/1 Completed 0 3h48m 10.104.14.231 4am-node18 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-recovery-0 1/1 Running 0 3h48m 10.104.13.31 4am-node16 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-zookeeper-0 1/1 Running 0 3h48m 10.104.25.233 4am-node30 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-zookeeper-1 1/1 Running 0 3h48m 10.104.17.80 4am-node23 <none> <none>
inverted-corn-118000-7-59-1335-pulsar-zookeeper-2 1/1 Running 0 3h47m 10.104.21.232 4am-node24 <none> <none>
client log:
[2024-11-15 00:42:09,127 - ERROR - fouram]: (api_response) : [Collection.search] <MilvusException: (code=503, message=failed to query: segment lacks[segment=453933363675746891]: channel not available[channel=by-dev-rootcoord-dml_0_453933363650626287v0])>, [requestId: 6544f6ea-a2ea-11ef-96d3-06c4b494423f] (api_request.py:57)
[2024-11-15 00:42:18,805 - ERROR - fouram]: (api_response) : [Collection.query] <MilvusException: (code=503, message=failed to query: segment lacks[segment=453933363675746891]: channel not available[channel=by-dev-rootcoord-dml_0_453933363650626287v0])>, [requestId: 6b4b1808-a2ea-11ef-96d3-06c4b494423f] (api_request.py:57)
[2024-11-15 00:43:58,003 - ERROR - fouram]: (api_response) : [Collection.hybrid_search] <MilvusException: (code=503, message=failed to search: segment lacks[segment=453933363675998103]: channel not available[channel=by-dev-rootcoord-dml_1_453933363650626287v1])>, [requestId: a66b4cd2-a2ea-11ef-96d3-06c4b494423f] (api_request.py:57)
test steps:
concurrent test and calculation of RT and QPS
:purpose: `vector: memory index`
verify concurrent DML & DQL scenario which has 2 float_vector fields & 16 scalar fields
:test steps:
1. create collection with fields:
'float_vector': 128dim,
'float_vector_1': 200dim,
'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1', 'bool_1',
'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2'
2. build indexes:
HNSW: 'float_vector'
IVF_SQ8: 'float_vector_1'
scalar_default_index: 'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1', 'varchar_1'
scalar_INVERTED_index: 'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2'
3. insert 5 million rows
4. flush collection
5. build indexes again using the same params
6. load collection
7. concurrent request:
- insert
- delete
- flush
- load
- search
- hybrid_search
- query
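The concurrent phase in step 7, together with the RT and QPS calculation, can be approximated without a cluster. The sketch below is a standard-library stand-in for the locust-based test, not the actual harness: `run_concurrent` and the stub callables are hypothetical, and a real run would replace the stubs with the insert, delete, flush, load, search, hybrid_search, and query requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_concurrent(tasks, workers=4, rounds=5):
    """Run each named callable `rounds` times on a thread pool; report RT and QPS."""

    def timed(name, fn):
        start = time.perf_counter()
        try:
            fn()
            ok = True
        except Exception:
            # Count any exception (e.g. a 503 from the server) as a failure.
            ok = False
        return name, ok, time.perf_counter() - start

    jobs = [(name, fn) for _ in range(rounds) for name, fn in tasks.items()]
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda job: timed(*job), jobs))
    wall = time.perf_counter() - wall_start

    stats = {}
    for name, ok, elapsed in results:
        entry = stats.setdefault(name, {"count": 0, "failures": 0, "total_rt": 0.0})
        entry["count"] += 1
        entry["failures"] += 0 if ok else 1
        entry["total_rt"] += elapsed
    for entry in stats.values():
        entry["avg_rt"] = entry["total_rt"] / entry["count"]

    qps = len(results) / wall if wall > 0 else float("inf")
    return {"qps": qps, "per_task": stats}
```

A per-task failure count next to the average RT makes it visible when only the DQL requests (search, hybrid_search, query) fail while DML requests keep succeeding, which is the pattern reported in the client logs.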
@wangting0128 https://github.com/milvus-io/milvus/pull/37694 fixes it. /assign @wangting0128
working on it
Is there an existing issue for this?
Environment
Current Behavior
argo task: memory-opt-scenes-7vrcm test case name: test_hybrid_search_locust_multi_ddl_dql_hybrid_search_cluster
server:
client log:
{pod=~"memory-opt-scenes-7vrcm-4-milvus-proxy-cd7f9d658-h48qv"} |~ "5d22cede769b75e1b0ea480a317e30f9|scene_hybrid_search_test_kkxiK9Kp"
server.log
Expected Behavior
No response
Steps To Reproduce
Milvus Log
No response
Anything else?
client config:
server config: