milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [benchmark][streamingNode] queryNode OOM in concurrent DQL & DML scenario with shard_num=16 #36760

Closed wangting0128 closed 9 hours ago

wangting0128 commented 4 days ago

Is there an existing issue for this?

Environment

- Milvus version: master-20241009-c3d91075-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2): 2.4.5rc7
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: fouramf-997f4
test case name: test_bitmap_locust_shard16_dql_cluster

server:

NAME                                                              READY   STATUS             RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
wt-streaming-node-shard16-etcd-0                                  1/1     Running            0                4m41s   10.104.23.87    4am-node27   <none>           <none>
wt-streaming-node-shard16-etcd-1                                  1/1     Running            0                4m41s   10.104.17.55    4am-node23   <none>           <none>
wt-streaming-node-shard16-etcd-2                                  1/1     Running            0                4m41s   10.104.18.208   4am-node25   <none>           <none>
wt-streaming-node-shard16-milvus-datanode-8454b67476-brprz        1/1     Running            1 (4m15s ago)    4m41s   10.104.4.36     4am-node11   <none>           <none>
wt-streaming-node-shard16-milvus-indexnode-65b4459795-72lxv       1/1     Running            1 (4m14s ago)    4m41s   10.104.5.12     4am-node12   <none>           <none>
wt-streaming-node-shard16-milvus-indexnode-65b4459795-8g8xx       1/1     Running            1 (4m14s ago)    4m41s   10.104.9.153    4am-node14   <none>           <none>
wt-streaming-node-shard16-milvus-indexnode-65b4459795-dhhdv       1/1     Running            1 (4m13s ago)    4m41s   10.104.6.200    4am-node13   <none>           <none>
wt-streaming-node-shard16-milvus-indexnode-65b4459795-pmgxb       1/1     Running            1 (4m15s ago)    4m41s   10.104.20.135   4am-node22   <none>           <none>
wt-streaming-node-shard16-milvus-mixcoord-7f59df8b-svbsz          1/1     Running            1 (4m13s ago)    4m41s   10.104.14.107   4am-node18   <none>           <none>
wt-streaming-node-shard16-milvus-proxy-f5db84688-tzkv7            1/1     Running            1 (4m13s ago)    4m41s   10.104.6.199    4am-node13   <none>           <none>
wt-streaming-node-shard16-milvus-querynode-85cc984c7c-8mc28       1/1     Running            0                4m41s   10.104.33.234   4am-node36   <none>           <none>
wt-streaming-node-shard16-milvus-querynode-85cc984c7c-k2nc5       1/1     Running            1 (4m13s ago)    4m41s   10.104.14.109   4am-node18   <none>           <none>
wt-streaming-node-shard16-milvus-streamingnode-6c5b5fc984-m2d7g   1/1     Running            1 (4m14s ago)    4m41s   10.104.5.9      4am-node12   <none>           <none>
wt-streaming-node-shard16-minio-0                                 1/1     Running            0                4m41s   10.104.17.54    4am-node23   <none>           <none>
wt-streaming-node-shard16-minio-1                                 1/1     Running            0                4m41s   10.104.34.154   4am-node37   <none>           <none>
wt-streaming-node-shard16-minio-2                                 1/1     Running            0                4m40s   10.104.23.91    4am-node27   <none>           <none>
wt-streaming-node-shard16-minio-3                                 1/1     Running            0                4m40s   10.104.19.23    4am-node28   <none>           <none>
wt-streaming-node-shard16-pulsar-bookie-0                         1/1     Running            0                4m41s   10.104.18.207   4am-node25   <none>           <none>
wt-streaming-node-shard16-pulsar-bookie-1                         1/1     Running            0                4m41s   10.104.25.25    4am-node30   <none>           <none>
wt-streaming-node-shard16-pulsar-bookie-2                         1/1     Running            0                4m40s   10.104.23.92    4am-node27   <none>           <none>
wt-streaming-node-shard16-pulsar-bookie-init-4g78r                0/1     Completed          0                4m41s   10.104.1.74     4am-node10   <none>           <none>
wt-streaming-node-shard16-pulsar-broker-0                         1/1     Running            0                4m41s   10.104.13.166   4am-node16   <none>           <none>
wt-streaming-node-shard16-pulsar-proxy-0                          1/1     Running            0                4m41s   10.104.13.165   4am-node16   <none>           <none>
wt-streaming-node-shard16-pulsar-pulsar-init-ldf2s                0/1     Completed          0                4m41s   10.104.34.152   4am-node37   <none>           <none>
wt-streaming-node-shard16-pulsar-recovery-0                       1/1     Running            0                4m41s   10.104.1.73     4am-node10   <none>           <none>
wt-streaming-node-shard16-pulsar-zookeeper-0                      1/1     Running            0                4m41s   10.104.19.18    4am-node28   <none>           <none>
wt-streaming-node-shard16-pulsar-zookeeper-1                      1/1     Running            0                4m      10.104.30.109   4am-node38   <none>           <none>
wt-streaming-node-shard16-pulsar-zookeeper-2                      1/1     Running            0                3m25s   10.104.34.156   4am-node37   <none>           <none>

queryNode OOM

Screenshot 2024-10-11 11 46 53

Comparison run: with streamingNode disabled, queryNode does not OOM 👇

Screenshot 2024-10-11 11 48 47

Expected Behavior

No response

Steps To Reproduce

Concurrent test that measures response time (RT) and QPS

        :purpose:  `primary key: INT64`, shard_num=16, DQL without expr
            1. build `BITMAP` indexes on all 12 supported scalar fields, and a hybrid index on the INT64 primary key field
            2. build `INVERTED`, `Trie`, and `STL_SORT` indexes on the other 22 scalar fields
            3. 2 fields of different vector types
            4. search for different expressions on BITMAP index fields

        :test steps:
            1. create collection with fields:
                'float_vector': 128dim
                'sparse_float_vector': sparse_range=[1, 100] <- the range of non-zero values of a sparse vector
                'id': primary key type is INT64

                all scalar fields: varchar max_length=100, array max_capacity=11
            2. build indexes:
                IVF_SQ8: 'float_vector'
                SPARSE_WAND: 'sparse_float_vector'

                default scalar index: 'id'
                BITMAP: '*_1' all supported field names
                INVERTED: 'array_float_1', 'array_double_1', 'float_2', 'double_2', 'bool_2', 'array_int8_2',
                          'array_int16_2', 'array_int32_2', 'array_int64_2', 'array_varchar_2', 'array_bool_2',
                          'array_float_2', 'array_double_2'
                Trie: 'varchar_2'
                STL_SORT: 'float_1', 'double_1', 'int8_2', 'int16_2', 'int32_2', 'int64_2'
            3. insert 5 million rows
            4. flush collection
            5. build indexes again using the same params
            6. load collection
            7. concurrent request:
                - search
                - query
                - hybrid_search
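
The scalar index layout in step 2 can be written out as a plain mapping, which makes it easier to check how many fields receive each index type. The sketch below is only an illustration: the field names and index types are copied from this report, the BITMAP list is taken from the `scalars_index` section of the report data (the steps only say `'*_1'`), and the helper `build_scalar_index_plan` is a hypothetical name, not part of the test harness or of pymilvus.

```python
from collections import Counter

# Field names and index types as listed in this issue.
BITMAP_FIELDS = [
    "int8_1", "int16_1", "int32_1", "int64_1", "varchar_1", "bool_1",
    "array_int8_1", "array_int16_1", "array_int32_1", "array_int64_1",
    "array_varchar_1", "array_bool_1",
]
INVERTED_FIELDS = [
    "array_float_1", "array_double_1", "float_2", "double_2", "bool_2",
    "array_int8_2", "array_int16_2", "array_int32_2", "array_int64_2",
    "array_varchar_2", "array_bool_2", "array_float_2", "array_double_2",
]
TRIE_FIELDS = ["varchar_2"]
STL_SORT_FIELDS = ["float_1", "double_1", "int8_2", "int16_2", "int32_2", "int64_2"]


def build_scalar_index_plan() -> dict:
    """Return {field_name: {"index_type": ...}}, rejecting duplicate fields."""
    plan = {}
    groups = (
        ("BITMAP", BITMAP_FIELDS),
        ("INVERTED", INVERTED_FIELDS),
        ("Trie", TRIE_FIELDS),
        ("STL_SORT", STL_SORT_FIELDS),
    )
    for index_type, fields in groups:
        for name in fields:
            if name in plan:
                raise ValueError(f"field {name!r} is assigned two index types")
            plan[name] = {"index_type": index_type}
    return plan


plan = build_scalar_index_plan()
counts = Counter(entry["index_type"] for entry in plan.values())
for index_type, n in counts.items():
    print(f"{index_type}: {n} fields")
```

Counted this way, the explicitly listed fields give 12 BITMAP, 13 INVERTED, 1 Trie, and 6 STL_SORT entries; `json_1` and `json_2` appear in the collection but have no scalar index in the report.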

Milvus Log

No response

Anything else?

test result:

[2024-10-09 10:24:51,502 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]: grpc     hybrid_search                                                                    986     1(0.10%) |   5905       0   54056   3300 |    0.55        0.00 (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]: grpc     query                                                                            968     3(0.31%) |  37217       0  110610  30000 |    0.54        0.00 (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]: grpc     search                                                                           990    23(2.32%) |   9085       0   31028   6100 |    0.55        0.01 (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]:          Aggregated                                                                      2944    27(0.92%) |  17270       0  110610   9300 |    1.63        0.01 (stats.py:789)
[2024-10-09 10:24:51,502 -  INFO - fouram]:  (stats.py:790)
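
As a quick sanity check on the table above, the Aggregated row can be recomputed from the three per-request rows. This is a minimal sketch: the numbers are copied from the log, the averages are assumed to be in milliseconds, and the helper `aggregate` is a hypothetical name rather than anything from the test framework.

```python
# (name, requests, failures, average latency) copied from the table above.
ROWS = [
    ("hybrid_search", 986, 1, 5905),
    ("query", 968, 3, 37217),
    ("search", 990, 23, 9085),
]


def aggregate(rows):
    """Combine per-request rows into totals, a request-weighted average, and a failure rate."""
    total_reqs = sum(reqs for _, reqs, _, _ in rows)
    total_fails = sum(fails for _, _, fails, _ in rows)
    weighted_avg = sum(reqs * avg for _, reqs, _, avg in rows) / total_reqs
    fail_pct = 100 * total_fails / total_reqs
    return total_reqs, total_fails, round(weighted_avg), round(fail_pct, 2)


total_reqs, total_fails, avg, fail_pct = aggregate(ROWS)
print(f"reqs={total_reqs} fails={total_fails} avg={avg} fail%={fail_pct}")

# Per-request failure rates, for comparison with the "# fails" column.
for name, reqs, fails, _ in ROWS:
    print(f"{name}: {100 * fails / reqs:.2f}% failed")
```

The recomputed totals match the Aggregated row (2944 requests, 27 failures, average 17270, 0.92% failed). `search` has the highest failure rate of the three (2.32%), while `query` has by far the highest average latency.
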
[2024-10-09 10:24:51,507 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_8c16m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': '32.0', 'memory': '16Gi'}, 'requests': {'cpu': '17.0', 'memory': '9Gi'}}, 'replicas': 2},
                       'indexNode': {'resources': {'limits': {'cpu': '4.0', 'memory': '8Gi'}, 'requests': {'cpu': '3.0', 'memory': '5Gi'}}, 'replicas': 4},
                       'dataNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '16Gi'}, 'requests': {'cpu': '5.0', 'memory': '9Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsar': {'enabled': True},
                       'kafka': {},
                       'minio': {'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'metrics': {'enabled': True, 'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'standalone': {'messageQueue': 'pulsar'},
                       'streaming': {'enabled': True},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus', 'tag': 'master-20241009-c3d91075-amd64'}}},
            'host': 'wt-streaming-node-shard16-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_bitmap_locust_shard16_dql_cluster',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 128,
                                                    'max_length': 100,
                                                    'scalars_index': {'int8_1': {'index_type': 'BITMAP'},
                                                                      'int16_1': {'index_type': 'BITMAP'},
                                                                      'int32_1': {'index_type': 'BITMAP'},
                                                                      'int64_1': {'index_type': 'BITMAP'},
                                                                      'varchar_1': {'index_type': 'BITMAP'},
                                                                      'bool_1': {'index_type': 'BITMAP'},
                                                                      'array_int8_1': {'index_type': 'BITMAP'},
                                                                      'array_int16_1': {'index_type': 'BITMAP'},
                                                                      'array_int32_1': {'index_type': 'BITMAP'},
                                                                      'array_int64_1': {'index_type': 'BITMAP'},
                                                                      'array_varchar_1': {'index_type': 'BITMAP'},
                                                                      'array_bool_1': {'index_type': 'BITMAP'},
                                                                      'array_float_1': {'index_type': 'INVERTED'},
                                                                      'array_double_1': {'index_type': 'INVERTED'},
                                                                      'float_2': {'index_type': 'INVERTED'},
                                                                      'double_2': {'index_type': 'INVERTED'},
                                                                      'bool_2': {'index_type': 'INVERTED'},
                                                                      'array_int8_2': {'index_type': 'INVERTED'},
                                                                      'array_int16_2': {'index_type': 'INVERTED'},
                                                                      'array_int32_2': {'index_type': 'INVERTED'},
                                                                      'array_int64_2': {'index_type': 'INVERTED'},
                                                                      'array_varchar_2': {'index_type': 'INVERTED'},
                                                                      'array_bool_2': {'index_type': 'INVERTED'},
                                                                      'array_float_2': {'index_type': 'INVERTED'},
                                                                      'array_double_2': {'index_type': 'INVERTED'},
                                                                      'varchar_2': {'index_type': 'Trie'},
                                                                      'float_1': {'index_type': 'STL_SORT'},
                                                                      'double_1': {'index_type': 'STL_SORT'},
                                                                      'int8_2': {'index_type': 'STL_SORT'},
                                                                      'int16_2': {'index_type': 'STL_SORT'},
                                                                      'int32_2': {'index_type': 'STL_SORT'},
                                                                      'int64_2': {'index_type': 'STL_SORT'}},
                                                    'vectors_index': {'sparse_float_vector': {'index_type': 'SPARSE_INVERTED_INDEX',
                                                                                              'index_param': {'drop_ratio_build': 0.2},
                                                                                              'metric_type': 'IP'}},
                                                    'scalars_params': {'array_int8_1': {'params': {'max_capacity': 11},
                                                                                        'other_params': {'dataset': 'random_algorithm',
                                                                                                         'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                              'specify_range': [-2500, 2500],
                                                                                                                              'max_capacity': 9}}},
                                                                       'array_int16_1': {'params': {'max_capacity': 11},
                                                                                         'other_params': {'dataset': 'random_algorithm',
                                                                                                          'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                               'specify_range': [-2500, 2500],
                                                                                                                               'max_capacity': 9}}},
                                                                       'array_int32_1': {'params': {'max_capacity': 11},
                                                                                         'other_params': {'dataset': 'random_algorithm',
                                                                                                          'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                               'specify_range': [-2500, 2500],
                                                                                                                               'max_capacity': 9}}},
                                                                       'array_int64_1': {'params': {'max_capacity': 11},
                                                                                         'other_params': {'dataset': 'random_algorithm',
                                                                                                          'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                               'specify_range': [-2500, 2500],
                                                                                                                               'max_capacity': 9}}},
                                                                       'array_double_1': {'params': {'max_capacity': 11}},
                                                                       'array_float_1': {'params': {'max_capacity': 11}},
                                                                       'array_varchar_1': {'params': {'max_capacity': 11},
                                                                                           'other_params': {'dataset': 'random_algorithm',
                                                                                                            'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                                 'specify_range': [-2500, 2500],
                                                                                                                                 'max_capacity': 9}}},
                                                                       'array_bool_1': {'params': {'max_capacity': 11},
                                                                                        'other_params': {'dataset': 'random_algorithm',
                                                                                                         'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                              'specify_range': [-2500, 2500],
                                                                                                                              'max_capacity': 9}}},
                                                                       'array_int8_2': {'params': {'max_capacity': 11}},
                                                                       'array_int16_2': {'params': {'max_capacity': 11}},
                                                                       'array_int32_2': {'params': {'max_capacity': 11}},
                                                                       'array_int64_2': {'params': {'max_capacity': 11}},
                                                                       'array_double_2': {'params': {'max_capacity': 11}},
                                                                       'array_float_2': {'params': {'max_capacity': 11}},
                                                                       'array_varchar_2': {'params': {'max_capacity': 11}},
                                                                       'array_bool_2': {'params': {'max_capacity': 11}},
                                                                       'int8_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                   'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                        'specify_range': [-2500, 2500],
                                                                                                                        'max_capacity': 9}}},
                                                                       'int16_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                    'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                         'specify_range': [-2500, 2500],
                                                                                                                         'max_capacity': 9}}},
                                                                       'int32_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                    'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                         'specify_range': [-2500, 2500],
                                                                                                                         'max_capacity': 9}}},
                                                                       'int64_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                    'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                         'specify_range': [-2500, 2500],
                                                                                                                         'max_capacity': 9}}},
                                                                       'varchar_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                      'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                           'specify_range': [-2500, 2500],
                                                                                                                           'max_capacity': 9}}},
                                                                       'bool_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                   'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                        'specify_range': [-2500, 2500],
                                                                                                                        'max_capacity': 9}}}},
                                                    'dataset_name': 'sift',
                                                    'dataset_size': 5000000,
                                                    'ni_per': 5000},
                                 'collection_params': {'other_fields': ['sparse_float_vector', 'int8_1', 'int16_1', 'int32_1', 'int64_1', 'double_1', 'float_1',
                                                                        'varchar_1', 'bool_1', 'json_1', 'array_int8_1', 'array_int16_1', 'array_int32_1',
                                                                        'array_int64_1', 'array_double_1', 'array_float_1', 'array_varchar_1', 'array_bool_1',
                                                                        'int8_2', 'int16_2', 'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2', 'bool_2',
                                                                        'json_2', 'array_int8_2', 'array_int16_2', 'array_int32_2', 'array_int64_2',
                                                                        'array_double_2', 'array_float_2', 'array_varchar_2', 'array_bool_2'],
                                                       'shards_num': 16},
                                 'flush_params': {'prepare_flush': False},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False, 'reset_db': False},
                                 'index_params': {'index_type': 'IVF_SQ8', 'index_param': {'nlist': 1024}},
                                 'concurrent_params': {'concurrent_number': 30, 'during_time': '30m', 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 1,
                                                       'params': {'nq': 1000,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 16},
                                                                  'expr': 'id >= 100',
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'group_by_field': None,
                                                                  'timeout': 30,
                                                                  'random_data': True,
                                                                  'check_task': 'check_search_output',
                                                                  'check_items': {'output_fields': ['sparse_float_vector', 'int8_1', 'int16_1', 'int32_1',
                                                                                                    'int64_1', 'double_1', 'float_1', 'varchar_1', 'bool_1',
                                                                                                    'json_1', 'array_int8_1', 'array_int16_1', 'array_int32_1',
                                                                                                    'array_int64_1', 'array_double_1', 'array_float_1',
                                                                                                    'array_varchar_1', 'array_bool_1', 'int8_2', 'int16_2',
                                                                                                    'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2',
                                                                                                    'bool_2', 'json_2', 'array_int8_2', 'array_int16_2',
                                                                                                    'array_int32_2', 'array_int64_2', 'array_double_2',
                                                                                                    'array_float_2', 'array_varchar_2', 'array_bool_2', 'id',
                                                                                                    'float_vector'],
                                                                                  'nq': 1000}}},
                                                      {'type': 'query',
                                                       'weight': 1,
                                                       'params': {'ids': None,
                                                                  'expr': 'id > -1 && ',
                                                                  'output_fields': ['id', 'float_vector', 'int64_1'],
                                                                  'offset': None,
                                                                  'limit': None,
                                                                  'ignore_growing': False,
                                                                  'partition_names': None,
                                                                  'timeout': 30,
                                                                  'consistency_level': None,
                                                                  'random_data': True,
                                                                  'random_count': 10,
                                                                  'random_range': [0, 5000000],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64',
                                                                  'check_task': 'check_query_output',
                                                                  'check_items': None}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'reqs': [{'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'nprobe': 128}, 'anns_field': 'float_vector', 'top_k': 100},
                                                                           {'search_param': {'drop_ratio_search': 0.1}, 'anns_field': 'sparse_float_vector'},
                                                                           {'search_param': {'drop_ratio_search': 0.1}, 'anns_field': 'sparse_float_vector'}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 1800,
                                                                  'random_data': True,
                                                                  'check_task': 'check_search_output',
                                                                  'check_items': {'output_fields': ['sparse_float_vector', 'int8_1', 'int16_1', 'int32_1',
                                                                                                    'int64_1', 'double_1', 'float_1', 'varchar_1', 'bool_1',
                                                                                                    'json_1', 'array_int8_1', 'array_int16_1', 'array_int32_1',
                                                                                                    'array_int64_1', 'array_double_1', 'array_float_1',
                                                                                                    'array_varchar_1', 'array_bool_1', 'int8_2', 'int16_2',
                                                                                                    'int32_2', 'int64_2', 'double_2', 'float_2', 'varchar_2',
                                                                                                    'bool_2', 'json_2', 'array_int8_2', 'array_int16_2',
                                                                                                    'array_int32_2', 'array_int64_2', 'array_double_2',
                                                                                                    'array_float_2', 'array_varchar_2', 'array_bool_2', 'id',
                                                                                                    'float_vector'],
                                                                                  'nq': 10}}}]},
            'run_id': 2024100961955372,
            'datetime': '2024-10-09 09:29:55.997726',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 0.5157,
                                      'sparse_float_vector': {'RT': 0.5148},
                                      'int8_1': {'RT': 0.5152},
                                      'int16_1': {'RT': 0.5167},
                                      'int32_1': {'RT': 0.5166},
                                      'int64_1': {'RT': 0.5169},
                                      'varchar_1': {'RT': 0.5146},
                                      'bool_1': {'RT': 0.5152},
                                      'array_int8_1': {'RT': 0.5238},
                                      'array_int16_1': {'RT': 0.514},
                                      'array_int32_1': {'RT': 0.5134},
                                      'array_int64_1': {'RT': 0.5158},
                                      'array_varchar_1': {'RT': 0.5152},
                                      'array_bool_1': {'RT': 0.5139},
                                      'array_float_1': {'RT': 0.5147},
                                      'array_double_1': {'RT': 0.5146},
                                      'float_2': {'RT': 0.5157},
                                      'double_2': {'RT': 0.5149},
                                      'bool_2': {'RT': 0.6093},
                                      'array_int8_2': {'RT': 0.5146},
                                      'array_int16_2': {'RT': 0.5329},
                                      'array_int32_2': {'RT': 0.5178},
                                      'array_int64_2': {'RT': 0.5143},
                                      'array_varchar_2': {'RT': 0.5147},
                                      'array_bool_2': {'RT': 0.5172},
                                      'array_float_2': {'RT': 0.5156},
                                      'array_double_2': {'RT': 0.5146},
                                      'varchar_2': {'RT': 0.5159},
                                      'float_1': {'RT': 0.5173},
                                      'double_1': {'RT': 0.5152},
                                      'int8_2': {'RT': 0.5141},
                                      'int16_2': {'RT': 0.5143},
                                      'int32_2': {'RT': 0.5158},
                                      'int64_2': {'RT': 0.5156}},
                            'insert': {'total_time': 882.4758, 'VPS': 5665.8777, 'batch_time': 0.8825, 'batch': 5000},
                            'load': {'RT': 8.1154},
                            'Locust': {'Aggregated': {'Requests': 2944,
                                                      'Fails': 27,
                                                      'RPS': 1.63,
                                                      'fail_s': 0.01,
                                                      'RT_max': 110610.16,
                                                      'RT_avg': 17270.34,
                                                      'TP50': 9300.0,
                                                      'TP99': 86000.0},
                                       'hybrid_search': {'Requests': 986,
                                                         'Fails': 1,
                                                         'RPS': 0.55,
                                                         'fail_s': 0.0,
                                                         'RT_max': 54056.68,
                                                         'RT_avg': 5905.07,
                                                         'TP50': 3300.0,
                                                         'TP99': 34000.0},
                                       'query': {'Requests': 968,
                                                 'Fails': 3,
                                                 'RPS': 0.54,
                                                 'fail_s': 0.0,
                                                 'RT_max': 110610.16,
                                                 'RT_avg': 37217.49,
                                                 'TP50': 30000.0,
                                                 'TP99': 102000.0},
                                       'search': {'Requests': 990,
                                                  'Fails': 23,
                                                  'RPS': 0.55,
                                                  'fail_s': 0.02,
                                                  'RT_max': 31028.89,
                                                  'RT_avg': 9085.81,
                                                  'TP50': 6100.0,
                                                  'TP99': 29000.0}}}}}
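
For a quick read of the Locust numbers above, the per-task failure rate can be derived from the reported `Requests` and `Fails` counts. This is only a minimal sketch: the counts are copied from the result dump, and the helper name `fail_rate_percent` is made up for illustration.

```python
# Requests / Fails copied from the Locust section of the result above.
locust = {
    "Aggregated": {"Requests": 2944, "Fails": 27},
    "hybrid_search": {"Requests": 986, "Fails": 1},
    "query": {"Requests": 968, "Fails": 3},
    "search": {"Requests": 990, "Fails": 23},
}


def fail_rate_percent(stats: dict) -> float:
    """Return failed requests as a percentage of all requests (hypothetical helper)."""
    return 100.0 * stats["Fails"] / stats["Requests"]


rates = {name: round(fail_rate_percent(stats), 2) for name, stats in locust.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}% failed")
```

Most of the failures come from `search` (23 of the 27), while `query` has the highest latency.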
chyezh commented 3 days ago

The streaming service does not apply the proportion that makes flushed segments smaller. As a result, a flushed segment is about 10x larger than in Milvus without the streaming service, which makes an OOM more likely.
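
The effect described above can be illustrated with some rough arithmetic. This is a sketch under stated assumptions, not the actual Milvus logic: the maximum segment size (1024 MB), the proportion (0.1, chosen only to match the "10x" in the comment), and the idea of one growing segment per shard are all hypothetical. Only `shard_num=16` comes from the issue title.

```python
from typing import Optional

MAX_SEGMENT_SIZE_MB = 1024.0  # hypothetical maximum segment size
SEAL_PROPORTION = 0.1         # hypothetical proportion, picked to match the "10x" above
SHARD_NUM = 16                # from the issue title


def flush_segment_size_mb(max_size_mb: float, proportion: Optional[float]) -> float:
    """Approximate size at which a growing segment gets flushed.

    If no proportion is applied, the segment grows to the full maximum size.
    """
    if proportion is None:
        return max_size_mb
    return max_size_mb * proportion


with_proportion = flush_segment_size_mb(MAX_SEGMENT_SIZE_MB, SEAL_PROPORTION)
without_proportion = flush_segment_size_mb(MAX_SEGMENT_SIZE_MB, None)
ratio = without_proportion / with_proportion

# Assumption: one growing segment per shard is held in memory at the same time.
growing_with_mb = SHARD_NUM * with_proportion
growing_without_mb = SHARD_NUM * without_proportion

print(f"flush size with proportion:    {with_proportion:.1f} MB")
print(f"flush size without proportion: {without_proportion:.1f} MB ({ratio:.0f}x)")
print(f"growing data across {SHARD_NUM} shards: "
      f"{growing_with_mb:.1f} MB vs {growing_without_mb:.1f} MB")
```

Under these assumed numbers, 16 shards hold roughly 16 GB of growing data instead of about 1.6 GB, which is consistent with the OOM being easier to hit at `shard_num=16`.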

chyezh commented 1 day ago

@wangting0128 This should be fixed. Please verify with commit f0f5147aefe581b87e30b7b144dc801d7926322e.

wangting0128 commented 9 hours ago

Verification passed.

argo task: verify-36760-stream-oom
test image: master-20241014-d566b0ce-amd64

server:

NAME                                                              READY   STATUS             RESTARTS       AGE     IP              NODE         NOMINATED NODE   READINESS GATES
verify-36760-stream-oom-etcd-0                                    1/1     Running            0              14h     10.104.15.56    4am-node20   <none>           <none>
verify-36760-stream-oom-etcd-1                                    1/1     Running            0              14h     10.104.17.20    4am-node23   <none>           <none>
verify-36760-stream-oom-etcd-2                                    1/1     Running            0              14h     10.104.32.150   4am-node39   <none>           <none>
verify-36760-stream-oom-milvus-datanode-76f494dc4b-9twvq          1/1     Running            2 (14h ago)    14h     10.104.6.239    4am-node13   <none>           <none>
verify-36760-stream-oom-milvus-indexnode-64c946567-c74hr          1/1     Running            1 (14h ago)    14h     10.104.32.145   4am-node39   <none>           <none>
verify-36760-stream-oom-milvus-indexnode-64c946567-htct8          1/1     Running            1 (14h ago)    14h     10.104.34.226   4am-node37   <none>           <none>
verify-36760-stream-oom-milvus-indexnode-64c946567-l6dnv          1/1     Running            1 (14h ago)    14h     10.104.18.113   4am-node25   <none>           <none>
verify-36760-stream-oom-milvus-indexnode-64c946567-tmfnc          1/1     Running            2 (14h ago)    14h     10.104.9.171    4am-node14   <none>           <none>
verify-36760-stream-oom-milvus-mixcoord-758fd74c6-4xf5x           1/1     Running            2 (14h ago)    14h     10.104.15.47    4am-node20   <none>           <none>
verify-36760-stream-oom-milvus-proxy-868fccd7cb-x6knl             1/1     Running            2 (14h ago)    14h     10.104.9.169    4am-node14   <none>           <none>
verify-36760-stream-oom-milvus-querynode-65ff87fbd9-wjqpg         1/1     Running            2 (14h ago)    14h     10.104.1.24     4am-node10   <none>           <none>
verify-36760-stream-oom-milvus-querynode-65ff87fbd9-wnvq9         1/1     Running            1 (14h ago)    14h     10.104.33.113   4am-node36   <none>           <none>
verify-36760-stream-oom-milvus-streamingnode-fc9884bcb-zx45t      1/1     Running            2 (14h ago)    14h     10.104.9.170    4am-node14   <none>           <none>
verify-36760-stream-oom-minio-0                                   1/1     Running            0              14h     10.104.19.47    4am-node28   <none>           <none>
verify-36760-stream-oom-minio-1                                   1/1     Running            0              14h     10.104.17.19    4am-node23   <none>           <none>
verify-36760-stream-oom-minio-2                                   1/1     Running            0              14h     10.104.34.228   4am-node37   <none>           <none>
verify-36760-stream-oom-minio-3                                   1/1     Running            0              14h     10.104.15.57    4am-node20   <none>           <none>
verify-36760-stream-oom-pulsar-bookie-0                           1/1     Running            0              14h     10.104.19.49    4am-node28   <none>           <none>
verify-36760-stream-oom-pulsar-bookie-1                           1/1     Running            0              14h     10.104.32.152   4am-node39   <none>           <none>
verify-36760-stream-oom-pulsar-bookie-2                           1/1     Running            0              14h     10.104.15.60    4am-node20   <none>           <none>
verify-36760-stream-oom-pulsar-bookie-init-tzv98                  0/1     Completed          0              14h     10.104.15.48    4am-node20   <none>           <none>
verify-36760-stream-oom-pulsar-broker-0                           1/1     Running            0              14h     10.104.17.16    4am-node23   <none>           <none>
verify-36760-stream-oom-pulsar-proxy-0                            1/1     Running            0              14h     10.104.19.42    4am-node28   <none>           <none>
verify-36760-stream-oom-pulsar-pulsar-init-whwtk                  0/1     Completed          0              14h     10.104.19.43    4am-node28   <none>           <none>
verify-36760-stream-oom-pulsar-recovery-0                         1/1     Running            0              14h     10.104.15.46    4am-node20   <none>           <none>
verify-36760-stream-oom-pulsar-zookeeper-0                        1/1     Running            0              14h     10.104.15.55    4am-node20   <none>           <none>
verify-36760-stream-oom-pulsar-zookeeper-1                        1/1     Running            0              14h     10.104.19.51    4am-node28   <none>           <none>
verify-36760-stream-oom-pulsar-zookeeper-2                        1/1     Running            0              14h     10.104.17.22    4am-node23   <none>           <none>
[Screenshot 2024-10-15 10 50 17]