milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: Querynode oomkilled when concurrent upserting data into 1024 partitions #34058

Open ThreadDao opened 2 months ago

ThreadDao commented 2 months ago

Is there an existing issue for this?

- [X] I have searched the existing issues

Environment

- Milvus version: 2.4-20240621-7d1d5a83-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka):   pulsar  
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Deploy Milvus with the following config:

```yaml
  components:
    dataNode:
      replicas: 1
      resources:
        limits:
          cpu: "8"
          memory: 16Gi
        requests:
          cpu: "4"
          memory: 8Gi
    indexNode:
      replicas: 3
      resources:
        limits:
          cpu: "8"
          memory: 8Gi
        requests:
          cpu: "4"
          memory: 2Gi
    mixCoord:
      replicas: 1
      resources:
        limits:
          cpu: "4"
          memory: 16Gi
        requests:
          cpu: "2" 
          memory: 8Gi 
    proxy:
      resources:
        limits:
          cpu: "1" 
          memory: 8Gi 
    queryNode:
      replicas: 2
      resources:
        limits:
          cpu: "16"
          memory: 72Gi
        requests:
          cpu: "4" 
          memory: 64Gi
  config:
    dataCoord:
      segment:
        sealProportion: 1.52e-05
    log:
      level: debug
    trace:
      exporter: jaeger
      jaeger:
        url: http://tempo-distributor.tempo:14268/api/traces
      sampleFraction: 1
```

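For context: `sealProportion` is the fraction of `maxSize` at which a growing segment gets sealed. Assuming the default `dataCoord.segment.maxSize` of 1024 MB in 2.4, this setting would seal segments at roughly 1024 MB × 1.52e-05 ≈ 16 KB, so the workload deliberately produces a very large number of tiny sealed segments (compare the 50K-segment figure in the comments below).
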
Test steps

  1. create a collection with 1 shard, with the partition key enabled and 1024 partitions
  2. create an HNSW index: {'index_type': 'HNSW', 'metric_type': 'L2', 'params': {'M': 8, 'efConstruction': 200}}
  3. insert 10M 128-dim entities -> flush
  4. run concurrent requests: search + upsert + flush, with the client parameters below (a hedged pymilvus sketch of these steps follows the parameter dump)
```python
    'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_concurrent_locust_custom_parameters',
            'test_case_params': {'dataset_params': {'metric_type': 'L2', 
                                                    'dim': 128,
                                                    'scalars_params': {'int64_1': {'params': {'is_partition_key': True}}},
                                                    'dataset_name': 'sift',
                                                    'dataset_size': '10m',
                                                    'ni_per': 50000},
                                 'collection_params': {'other_fields': ['int64_1'],
                                                       'shards_num': 1,
                                                       'num_partitions': 1024},
                                 'load_params': {},
                                 'release_params': {'release_of_reload': False},
                                 'index_params': {'index_type': 'HNSW',
                                                  'index_param': {'M': 8,
                                                                  'efConstruction': 200}},
                                 'concurrent_params': {'concurrent_number': 30,
                                                       'during_time': '3h', 
                                                       'interval': 20,
                                                       'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 10,
                                                       'params': {'nq': 100,
                                                                  'top_k': 100,
                                                                  'output_fields': ['int64_1'],
                                                                  'search_param': {'ef': 128}, 
                                                                  'timeout': 120}},
                                                      {'type': 'flush',
                                                       'weight': 1,
                                                       'params': {'timeout': 120}},
                                                      {'type': 'upsert',
                                                       'weight': 19,
                                                       'params': {'nb': 200,
                                                                  'timeout': 120,
                                                                  'start_id': 0,
                                                                  'random_id': True, 
                                                                  'random_vector': True}}]},
            'run_id': 2024062191801273,
            'datetime': '2024-06-21 03:06:20.115933',
            'client_version': '2.2'},
```

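For readers trying to reproduce this, here is a minimal, hedged pymilvus sketch of the four steps above. It assumes a pymilvus 2.4.x client against a default local endpoint; the collection and field names (`oom_repro`, `vec`) are hypothetical stand-ins for the harness's own names, and the thread loops only approximate the weighted locust tasks:

```python
import random
import threading
import time

import numpy as np
from pymilvus import (
    Collection, CollectionSchema, DataType, FieldSchema, connections,
)

connections.connect(host="localhost", port="19530")

# step 1: one shard, partition-key field hashed across 1024 partitions
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("int64_1", DataType.INT64, is_partition_key=True),
    FieldSchema("vec", DataType.FLOAT_VECTOR, dim=128),
])
coll = Collection("oom_repro", schema, shards_num=1, num_partitions=1024)

# step 2: HNSW index with the parameters from the report
coll.create_index("vec", {
    "index_type": "HNSW",
    "metric_type": "L2",
    "params": {"M": 8, "efConstruction": 200},
})

# step 3: 10M 128-dim entities in 50k batches (ni_per), then flush
def batch(start: int, n: int) -> list:
    return [
        list(range(start, start + n)),               # id
        [random.randrange(1024) for _ in range(n)],  # int64_1 (partition key)
        np.random.random((n, 128)).tolist(),         # vec
    ]

for s in range(0, 10_000_000, 50_000):
    coll.insert(batch(s, 50_000))
coll.flush()
coll.load()

# step 4: concurrent search + upsert + flush (30 locust users in the real test)
def upsert_loop():
    while True:
        coll.upsert(batch(random.randrange(10_000_000 - 200), 200))

def search_loop():
    while True:
        coll.search(
            np.random.random((100, 128)).tolist(), "vec",
            {"metric_type": "L2", "params": {"ef": 128}},
            limit=100, output_fields=["int64_1"], timeout=120,
        )

def flush_loop():
    while True:
        coll.flush(timeout=120)

for fn in (search_loop, upsert_loop, flush_loop):
    threading.Thread(target=fn, daemon=True).start()
time.sleep(3 * 3600)  # during_time: 3h
```
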
queryNode oomkilled

The querynode was OOM-killed after about two minutes of concurrent requests, at around 2024-06-21 03:40:52 (memory-usage screenshot attached in the original issue).

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log



Anything else?

No response
xiaofan-luan commented 2 months ago

With so many partitions, we might need to increase the compaction concurrency and add more datanodes. Currently I think that if we add more datanodes and let compaction catch up, it will work for us.
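
A sketch of that suggestion, in the same deployment-config layout as above. Treat `maxParallelTaskNum` as an assumption about the 2.4 config surface (and the values as illustrative); verify the exact key against the milvus.yaml shipped with this build:

```yaml
  components:
    dataNode:
      replicas: 4              # scale out from 1 so compaction can keep up
  config:
    dataCoord:
      compaction:
        maxParallelTaskNum: 32 # allow more compaction tasks to run in parallel
```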

XuanYang-cn commented 2 months ago

Even though there are 50K segments, the question is why two 64 GB querynodes cannot hold 7 GB of data in memory.