milvus-io / milvus

A cloud-native vector database, storage for next generation AI applications
https://milvus.io
Apache License 2.0

[Bug]: [benchmark][cluster] queryNode memory usage doubled compared to a month ago #35896

Open wangting0128 opened 2 weeks ago

wangting0128 commented 2 weeks ago

Is there an existing issue for this?

Environment

- Milvus version: 2.4-20240831-8b706122-amd64
- Deployment mode (standalone or cluster): cluster
- MQ type (rocksmq, pulsar or kafka): pulsar
- SDK version (e.g. pymilvus v2.0.0rc2): 2.4.5rc7
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

The same test case was run on two different images.

test case name: test_inverted_locust_varchar_dml_dql_cluster

Problem image: 2.4-20240831-8b706122-amd64

server:

NAME                                                              READY   STATUS         RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
inverted-corn-124400-5-25-2279-etcd-0                             1/1     Running        0               7m30s   10.104.17.187   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-etcd-1                             1/1     Running        0               7m30s   10.104.19.173   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-etcd-2                             1/1     Running        0               7m30s   10.104.27.70    4am-node31   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-datanode-5c7c766875-kk6vd   1/1     Running        3 (6m37s ago)   7m31s   10.104.1.69     4am-node10   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-indexnode-5dc85675f5mzvb2   1/1     Running        3 (6m38s ago)   7m31s   10.104.6.112    4am-node13   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-mixcoord-79d6d445b4-mdwk7   1/1     Running        3 (2m30s ago)   7m31s   10.104.1.71     4am-node10   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-proxy-968d9594d-hpj7g       1/1     Running        4 (2m7s ago)    7m31s   10.104.6.111    4am-node13   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-querynode-76465fc7972svmb   1/1     Running        3 (6m38s ago)   7m31s   10.104.15.9     4am-node20   <none>           <none>
inverted-corn-124400-5-25-2279-minio-0                            1/1     Running        0               7m30s   10.104.17.188   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-minio-1                            1/1     Running        0               7m30s   10.104.19.174   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-minio-2                            1/1     Running        0               7m30s   10.104.27.71    4am-node31   <none>           <none>
inverted-corn-124400-5-25-2279-minio-3                            1/1     Running        0               7m30s   10.104.33.17    4am-node36   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-0                    1/1     Running        0               7m30s   10.104.17.189   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-1                    1/1     Running        0               7m30s   10.104.19.175   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-2                    1/1     Running        0               7m29s   10.104.27.72    4am-node31   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-init-6c9q7           0/1     Completed      0               7m30s   10.104.17.170   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-broker-0                    1/1     Running        0               7m30s   10.104.14.215   4am-node18   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-proxy-0                     1/1     Running        0               7m30s   10.104.13.214   4am-node16   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-pulsar-init-8sq8v           0/1     Completed      0               7m30s   10.104.19.164   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-recovery-0                  1/1     Running        0               7m30s   10.104.21.78    4am-node24   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-zookeeper-0                 1/1     Running        0               7m30s   10.104.17.184   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-zookeeper-1                 1/1     Running        0               6m32s   10.104.19.188   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-zookeeper-2                 1/1     Running        0               5m9s    10.104.18.234   4am-node25   <none>           <none> (base.py:261)
[2024-09-02 01:07:12,018 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'NAME|inverted-corn-124400-5-25-2279-milvus|inverted-corn-124400-5-25-2279-minio|inverted-corn-124400-5-25-2279-etcd|inverted-corn-124400-5-25-2279-pulsar|inverted-corn-124400-5-25-2279-zookeeper|inverted-corn-124400-5-25-2279-kafka|inverted-corn-124400-5-25-2279-log|inverted-corn-124400-5-25-2279-tikv'  (util_cmd.py:14)
[2024-09-02 01:07:32,532 -  INFO - fouram]: [CliClient] pod details of release(inverted-corn-124400-5-25-2279): 
 I0902 01:07:13.269642     535 request.go:665] Waited for 1.168807339s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/notification.kubesphere.io/v2beta2?timeout=32s
I0902 01:07:23.269876     535 request.go:665] Waited for 3.997582549s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/autoscaling/v1?timeout=32s
NAME                                                              READY   STATUS             RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
inverted-corn-124400-5-25-2279-etcd-0                             1/1     Running            0               4h3m    10.104.17.187   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-etcd-1                             1/1     Running            0               4h3m    10.104.19.173   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-etcd-2                             1/1     Running            0               4h3m    10.104.27.70    4am-node31   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-datanode-5c7c766875-kk6vd   1/1     Running            3 (4h2m ago)    4h3m    10.104.1.69     4am-node10   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-indexnode-5dc85675f5mzvb2   1/1     Running            3 (4h2m ago)    4h3m    10.104.6.112    4am-node13   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-mixcoord-79d6d445b4-mdwk7   1/1     Running            3 (3h58m ago)   4h3m    10.104.1.71     4am-node10   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-proxy-968d9594d-hpj7g       1/1     Running            4 (3h58m ago)   4h3m    10.104.6.111    4am-node13   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-querynode-76465fc7972svmb   0/1     CrashLoopBackOff   14 (79s ago)    4h3m    10.104.15.9     4am-node20   <none>           <none>
inverted-corn-124400-5-25-2279-minio-0                            1/1     Running            0               4h3m    10.104.17.188   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-minio-1                            1/1     Running            0               4h3m    10.104.19.174   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-minio-2                            1/1     Running            0               4h3m    10.104.27.71    4am-node31   <none>           <none>
inverted-corn-124400-5-25-2279-minio-3                            1/1     Running            0               4h3m    10.104.33.17    4am-node36   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-0                    1/1     Running            0               4h3m    10.104.17.189   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-1                    1/1     Running            0               4h3m    10.104.19.175   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-2                    1/1     Running            0               4h3m    10.104.27.72    4am-node31   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-bookie-init-6c9q7           0/1     Completed          0               4h3m    10.104.17.170   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-broker-0                    1/1     Running            0               4h3m    10.104.14.215   4am-node18   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-proxy-0                     1/1     Running            0               4h3m    10.104.13.214   4am-node16   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-pulsar-init-8sq8v           0/1     Completed          0               4h3m    10.104.19.164   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-recovery-0                  1/1     Running            0               4h3m    10.104.21.78    4am-node24   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-zookeeper-0                 1/1     Running            0               4h3m    10.104.17.184   4am-node23   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-zookeeper-1                 1/1     Running            0               4h2m    10.104.19.188   4am-node28   <none>           <none>
inverted-corn-124400-5-25-2279-pulsar-zookeeper-2                 1/1     Running            0               4h1m    10.104.18.234   4am-node25   <none>           <none>

queryNode memory usage ~ 30G

[Screenshot 2024-09-02 12 16 26] [Screenshot 2024-09-02 12 18 55]
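The second pod listing above is where the queryNode moves from `Running` to `CrashLoopBackOff`. A small stdlib-only sketch for spotting such pods in pasted `kubectl get pods -o wide` output is shown below. The function names `parse_pods` and `unhealthy` are hypothetical helpers, not part of the test framework, and the sample rows are copied from the listing above.

```python
SAMPLE = """\
NAME                                                              READY   STATUS             RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
inverted-corn-124400-5-25-2279-milvus-proxy-968d9594d-hpj7g       1/1     Running            4 (3h58m ago)   4h3m    10.104.6.111    4am-node13   <none>           <none>
inverted-corn-124400-5-25-2279-milvus-querynode-76465fc7972svmb   0/1     CrashLoopBackOff   14 (79s ago)    4h3m    10.104.15.9     4am-node20   <none>           <none>
"""


def parse_pods(text):
    """Parse pasted `kubectl get pods -o wide` text into a list of dicts."""
    pods = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue
        ready, _, total = parts[1].partition("/")
        # Skip the header row and interleaved log lines: a pod row has
        # READY as "n/m" and a numeric RESTARTS column.
        if not (ready.isdigit() and total.isdigit() and parts[3].isdigit()):
            continue
        pods.append({
            "name": parts[0],
            "ready": int(ready),
            "total": int(total),
            "status": parts[2],
            "restarts": int(parts[3]),
        })
    return pods


def unhealthy(pods):
    """Return pods that are neither Running nor Completed."""
    return [p for p in pods if p["status"] not in ("Running", "Completed")]


if __name__ == "__main__":
    for pod in unhealthy(parse_pods(SAMPLE)):
        print(pod["name"], pod["status"], pod["restarts"])
```

Filtering on status alone is deliberate: several healthy pods in both runs also show a few restarts from startup, so a restart count by itself is not a reliable signal here.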

Normal image: 2.4-20240807-b22f3a62-amd64

server:

NAME                                                              READY   STATUS      RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-w6z58-71-4517-etcd-0                                      1/1     Running     0               4h35m   10.104.27.184   4am-node31   <none>           <none>
fouramf-w6z58-71-4517-etcd-1                                      1/1     Running     0               4h35m   10.104.17.43    4am-node23   <none>           <none>
fouramf-w6z58-71-4517-etcd-2                                      1/1     Running     0               4h35m   10.104.33.215   4am-node36   <none>           <none>
fouramf-w6z58-71-4517-milvus-datanode-68c9dfbb5f-vnrxz            1/1     Running     2 (4h34m ago)   4h35m   10.104.4.230    4am-node11   <none>           <none>
fouramf-w6z58-71-4517-milvus-indexnode-84dbbc86c9-tnjxb           1/1     Running     0               4h35m   10.104.6.191    4am-node13   <none>           <none>
fouramf-w6z58-71-4517-milvus-mixcoord-6896c94d6-nlmrn             1/1     Running     2 (4h34m ago)   4h35m   10.104.14.217   4am-node18   <none>           <none>
fouramf-w6z58-71-4517-milvus-proxy-5c7bb896-9kzvn                 1/1     Running     2 (4h34m ago)   4h35m   10.104.14.218   4am-node18   <none>           <none>
fouramf-w6z58-71-4517-milvus-querynode-595768ddd9-bk7nt           1/1     Running     2 (4h34m ago)   4h35m   10.104.20.149   4am-node22   <none>           <none>
fouramf-w6z58-71-4517-minio-0                                     1/1     Running     0               4h35m   10.104.34.196   4am-node37   <none>           <none>
fouramf-w6z58-71-4517-minio-1                                     1/1     Running     0               4h35m   10.104.27.185   4am-node31   <none>           <none>
fouramf-w6z58-71-4517-minio-2                                     1/1     Running     0               4h35m   10.104.17.42    4am-node23   <none>           <none>
fouramf-w6z58-71-4517-minio-3                                     1/1     Running     0               4h35m   10.104.30.176   4am-node38   <none>           <none>
fouramf-w6z58-71-4517-pulsar-bookie-0                             1/1     Running     0               4h35m   10.104.17.41    4am-node23   <none>           <none>
fouramf-w6z58-71-4517-pulsar-bookie-1                             1/1     Running     0               4h35m   10.104.34.198   4am-node37   <none>           <none>
fouramf-w6z58-71-4517-pulsar-bookie-2                             1/1     Running     0               4h35m   10.104.24.252   4am-node29   <none>           <none>
fouramf-w6z58-71-4517-pulsar-bookie-init-rq4bj                    0/1     Completed   0               4h35m   10.104.14.219   4am-node18   <none>           <none>
fouramf-w6z58-71-4517-pulsar-broker-0                             1/1     Running     0               4h35m   10.104.14.222   4am-node18   <none>           <none>
fouramf-w6z58-71-4517-pulsar-proxy-0                              1/1     Running     0               4h35m   10.104.4.231    4am-node11   <none>           <none>
fouramf-w6z58-71-4517-pulsar-pulsar-init-tqz6b                    0/1     Completed   0               4h35m   10.104.14.221   4am-node18   <none>           <none>
fouramf-w6z58-71-4517-pulsar-recovery-0                           1/1     Running     0               4h35m   10.104.4.232    4am-node11   <none>           <none>
fouramf-w6z58-71-4517-pulsar-zookeeper-0                          1/1     Running     0               4h35m   10.104.34.195   4am-node37   <none>           <none>
fouramf-w6z58-71-4517-pulsar-zookeeper-1                          1/1     Running     0               4h34m   10.104.17.52    4am-node23   <none>           <none>
fouramf-w6z58-71-4517-pulsar-zookeeper-2                          1/1     Running     0               4h33m   10.104.23.129   4am-node27   <none>           <none>
[Screenshot 2024-09-02 12 18 17] [Screenshot 2024-09-02 12 19 29]

Expected Behavior

No response

Steps To Reproduce

concurrent test and calculation of RT and QPS

        :purpose:  `varchar: different max_length`
            verify a concurrent DML & DQL scenario with 3 VARCHAR scalar fields, each with an INVERTED index

        :test steps:
            1. create collection with fields:
                'float_vector': 3dim,
                'varchar_1': max_length=256, varchar_filled=True
                'varchar_2': max_length=32768, varchar_filled=True
                'varchar_3': max_length=65535, varchar_filled=True
            2. build indexes:
                IVF_FLAT: 'float_vector'
                INVERTED: 'varchar_1', 'varchar_2', 'varchar_3'
            3. insert 300k data
            4. flush collection
            5. build indexes again using the same params
            6. load collection
            7. concurrent request:
                - insert
                - delete
                - flush
                - load
                - search
                - hybrid_search
                - query

Milvus Log

No response

Anything else?

normal test result:

{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_2c4m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': '8', 'memory': '32Gi'}, 'requests': {'cpu': '8', 'memory': '32Gi'}}, 'replicas': 1},
                       'indexNode': {'resources': {'limits': {'cpu': '4.0', 'memory': '16Gi'}, 'requests': {'cpu': '3.0', 'memory': '9Gi'}}, 'replicas': 1},
                       'dataNode': {'resources': {'limits': {'cpu': '2.0', 'memory': '4Gi'}, 'requests': {'cpu': '2.0', 'memory': '3Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsar': {},
                       'kafka': {},
                       'minio': {'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'metrics': {'enabled': True, 'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus', 'tag': '2.4-20240807-b22f3a62-amd64'}}},
            'host': 'fouramf-w6z58-71-4517-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_inverted_locust_varchar_dml_dql_cluster',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 3,
                                                    'scalars_index': {'varchar_1': {'index_type': 'INVERTED'},
                                                                      'varchar_2': {'index_type': 'INVERTED'},
                                                                      'varchar_3': {'index_type': 'INVERTED'}},
                                                    'scalars_params': {'varchar_1': {'params': {'max_length': 256}, 'other_params': {'varchar_filled': True}},
                                                                       'varchar_2': {'params': {'max_length': 32768}, 'other_params': {'varchar_filled': True}},
                                                                       'varchar_3': {'params': {'max_length': 65535},
                                                                                     'other_params': {'varchar_filled': True}}},
                                                    'dataset_name': 'local',
                                                    'dataset_size': 300000,
                                                    'ni_per': 50},
                                 'collection_params': {'other_fields': ['varchar_1', 'varchar_2', 'varchar_3'], 'shards_num': 2},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False, 'reset_db': False},
                                 'index_params': {'index_type': 'IVF_FLAT', 'index_param': {'nlist': 1024}},
                                 'concurrent_params': {'concurrent_number': 50, 'during_time': '1h', 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'insert',
                                                       'weight': 1,
                                                       'params': {'nb': 10,
                                                                  'timeout': 30,
                                                                  'random_id': True,
                                                                  'random_vector': True,
                                                                  'varchar_filled': False,
                                                                  'start_id': 300000,
                                                                  'check_task': 'check_response',
                                                                  'check_items': None}},
                                                      {'type': 'delete',
                                                       'weight': 1,
                                                       'params': {'expr': '',
                                                                  'delete_length': 10,
                                                                  'timeout': 30,
                                                                  'check_task': 'check_response',
                                                                  'check_items': None}},
                                                      {'type': 'flush',
                                                       'weight': 1,
                                                       'params': {'timeout': 600, 'check_task': 'check_ignore_rate_limit', 'check_items': None}},
                                                      {'type': 'load',
                                                       'weight': 1,
                                                       'params': {'replica_number': 1, 'timeout': 30, 'check_task': 'check_response', 'check_items': None}},
                                                      {'type': 'search',
                                                       'weight': 1,
                                                       'params': {'nq': 1000,
                                                                  'top_k': 1,
                                                                  'search_param': {'nprobe': 32},
                                                                  'expr': 'varchar_1 like "a%" && varchar_2 like "A%" && varchar_3 like "0%" && id > 0',
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'output_fields': None,
                                                                  'ignore_growing': False,
                                                                  'group_by_field': None,
                                                                  'timeout': 60,
                                                                  'random_data': True,
                                                                  'check_task': 'check_response',
                                                                  'check_items': None}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 10,
                                                                  'reqs': [{'search_param': {'nprobe': 16},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': 'varchar_1 like "0%"',
                                                                            'top_k': 2000},
                                                                           {'search_param': {'nprobe': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': 'varchar_2 like "9%"'}],
                                                                  'rerank': {'WeightedRanker': [0.5, 0.5]},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'random_data': True,
                                                                  'check_task': 'check_response',
                                                                  'check_items': None}},
                                                      {'type': 'query',
                                                       'weight': 1,
                                                       'params': {'ids': None,
                                                                  'expr': 'varchar_3 like "a%" && ',
                                                                  'output_fields': ['*'],
                                                                  'offset': None,
                                                                  'limit': None,
                                                                  'ignore_growing': False,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'consistency_level': None,
                                                                  'random_data': True,
                                                                  'random_count': 20,
                                                                  'random_range': [0, 150000.0],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64',
                                                                  'check_task': 'check_response',
                                                                  'check_items': None}}]},
            'run_id': 2024083053204289,
            'datetime': '2024-08-30 10:55:20.292026',
            'client_version': '2.2'},
 'result': {'test_result': {'index': {'RT': 24.4815, 'varchar_1': {'RT': 5.0234}, 'varchar_2': {'RT': 13.5802}, 'varchar_3': {'RT': 1.0157}},
                            'insert': {'total_time': 575.9335, 'VPS': 520.8935, 'batch_time': 0.096, 'batch': 50},
                            'flush': {'RT': 3.0227},
                            'load': {'RT': 72.2012},
                            'Locust': {'Aggregated': {'Requests': 8989,
                                                      'Fails': 0,
                                                      'RPS': 2.5,
                                                      'fail_s': 0.0,
                                                      'RT_max': 603003.88,
                                                      'RT_avg': 18927.41,
                                                      'TP50': 410.0,
                                                      'TP99': 516000.0},
                                       'delete': {'Requests': 1274,
                                                  'Fails': 0,
                                                  'RPS': 0.35,
                                                  'fail_s': 0.0,
                                                  'RT_max': 16136.14,
                                                  'RT_avg': 293.6,
                                                  'TP50': 7,
                                                  'TP99': 8600.0},
                                       'flush': {'Requests': 1244,
                                                 'Fails': 0,
                                                 'RPS': 0.35,
                                                 'fail_s': 0.0,
                                                 'RT_max': 603003.88,
                                                 'RT_avg': 130109.68,
                                                 'TP50': 22000.0,
                                                 'TP99': 602000.0},
                                       'hybrid_search': {'Requests': 1270,
                                                         'Fails': 0,
                                                         'RPS': 0.35,
                                                         'fail_s': 0.0,
                                                         'RT_max': 17466.08,
                                                         'RT_avg': 2571.54,
                                                         'TP50': 1700.0,
                                                         'TP99': 12000.0},
                                       'insert': {'Requests': 1257,
                                                  'Fails': 0,
                                                  'RPS': 0.35,
                                                  'fail_s': 0.0,
                                                  'RT_max': 16139.92,
                                                  'RT_avg': 291.77,
                                                  'TP50': 23,
                                                  'TP99': 8100.0},
                                       'load': {'Requests': 1362,
                                                'Fails': 0,
                                                'RPS': 0.38,
                                                'fail_s': 0.0,
                                                'RT_max': 19769.24,
                                                'RT_avg': 452.92,
                                                'TP50': 12,
                                                'TP99': 16000.0},
                                       'query': {'Requests': 1291,
                                                 'Fails': 0,
                                                 'RPS': 0.36,
                                                 'fail_s': 0.0,
                                                 'RT_max': 20097.58,
                                                 'RT_avg': 806.39,
                                                 'TP50': 14,
                                                 'TP99': 10000.0},
                                       'search': {'Requests': 1291,
                                                  'Fails': 0,
                                                  'RPS': 0.36,
                                                  'fail_s': 0.0,
                                                  'RT_max': 13654.05,
                                                  'RT_avg': 2027.47,
                                                  'TP50': 1500.0,
                                                  'TP99': 7300.0}}}}}
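As a quick sanity check on the normal result above, the per-task request counts should add up to the aggregated count, and the reported RPS should be consistent with the 1h run. The sketch below copies the numbers from the result dict; treating the measurement window as exactly 3600 s is an assumption.

```python
# Per-task request counts and RPS copied from the 'Locust' section above.
requests = {
    "delete": 1274, "flush": 1244, "hybrid_search": 1270, "insert": 1257,
    "load": 1362, "query": 1291, "search": 1291,
}
reported_rps = {
    "delete": 0.35, "flush": 0.35, "hybrid_search": 0.35, "insert": 0.35,
    "load": 0.38, "query": 0.36, "search": 0.36,
}
aggregated_requests = 8989
aggregated_rps = 2.5
duration_s = 3600  # 'during_time': '1h' (assumed to be the exact window)

total = sum(requests.values())
print("sum of per-task requests:", total, "matches aggregate:", total == aggregated_requests)
print("aggregate RPS:", round(total / duration_s, 1), "reported:", aggregated_rps)

for task, count in requests.items():
    computed = round(count / duration_s, 2)
    print(f"{task:14s} computed RPS {computed:.2f}  reported {reported_rps[task]:.2f}")
```

With all seven tasks weighted equally, each ends up at roughly 0.35 requests per second.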
sunby commented 2 weeks ago

We previously used mmap to load raw data when the index did not already contain it. Now, this process is controlled by the queryNode.mmap.scalarField configuration. By default, queryNode.mmap.scalarField is set to false, so raw data will be loaded into memory instead, which typically consumes around 30GB.
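The ~30GB figure is consistent with a back-of-the-envelope estimate of the raw VARCHAR data in this test. The sketch below assumes that `varchar_filled=True` means every value is filled to its `max_length` with single-byte characters, and it ignores per-row overhead and the 300k-row growth from concurrent inserts.

```python
# max_length per VARCHAR field and row count, from the test parameters.
MAX_LENGTHS = {"varchar_1": 256, "varchar_2": 32768, "varchar_3": 65535}
ROWS = 300_000

# Assumption: each value occupies exactly max_length bytes.
bytes_per_row = sum(MAX_LENGTHS.values())
total_bytes = bytes_per_row * ROWS

print(f"bytes per row: {bytes_per_row}")
print(f"raw VARCHAR data: {total_bytes / 1e9:.1f} GB ({total_bytes / 2**30:.1f} GiB)")
```

Under these assumptions the raw data alone comes close to the queryNode's 32Gi memory limit, which would fit the crash loop seen on the problem image.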

sunby commented 2 weeks ago

Please set queryNode.mmap.scalarField to true and retest it.
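For reference, the dotted key maps to a nested YAML fragment. The sketch below renders that fragment using only the standard library. Where the fragment goes depends on the deployment; placing it in the Milvus user config (for example via the Helm chart's `extraConfigFiles` `user.yaml` entry) is an assumption here, and the helper names are hypothetical.

```python
def nested_from_dotted(key, value):
    """Turn 'a.b.c' and a value into {'a': {'b': {'c': value}}}."""
    result = value
    for part in reversed(key.split(".")):
        result = {part: result}
    return result


def to_yaml_lines(mapping, indent=0):
    """Render a nested dict of scalars as YAML lines (minimal, no quoting)."""
    lines = []
    for key, value in mapping.items():
        pad = " " * indent
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        else:
            # YAML booleans are lowercase.
            text = str(value).lower() if isinstance(value, bool) else str(value)
            lines.append(f"{pad}{key}: {text}")
    return lines


override = nested_from_dotted("queryNode.mmap.scalarField", True)
print("\n".join(to_yaml_lines(override)))
```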

wangting0128 commented 2 weeks ago

> Please set queryNode.mmap.scalarField to true and retest it.

Verification result with queryNode.mmap.scalarField set to true:

argo task: fouramf-mnc77
test case name: test_inverted_locust_varchar_dml_dql_cluster
image: 2.4-20240902-90147b13-amd64

server:

NAME                                                          READY   STATUS      RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-mnc77-73-4449-etcd-0                                  1/1     Running     0               4h16m   10.104.32.122   4am-node39   <none>           <none>
fouramf-mnc77-73-4449-etcd-1                                  1/1     Running     0               4h16m   10.104.17.115   4am-node23   <none>           <none>
fouramf-mnc77-73-4449-etcd-2                                  1/1     Running     0               4h16m   10.104.18.19    4am-node25   <none>           <none>
fouramf-mnc77-73-4449-milvus-datanode-55cd88cc6c-n6fbm        1/1     Running     2 (4h11m ago)   4h16m   10.104.23.248   4am-node27   <none>           <none>
fouramf-mnc77-73-4449-milvus-indexnode-7c99465cfc-wjgmw       1/1     Running     1 (4h15m ago)   4h16m   10.104.20.231   4am-node22   <none>           <none>
fouramf-mnc77-73-4449-milvus-mixcoord-7956f866bb-rbs6q        1/1     Running     2 (4h11m ago)   4h16m   10.104.20.230   4am-node22   <none>           <none>
fouramf-mnc77-73-4449-milvus-proxy-7f858656b6-glnkd           1/1     Running     2 (4h11m ago)   4h16m   10.104.20.229   4am-node22   <none>           <none>
fouramf-mnc77-73-4449-milvus-querynode-5b469746c5-hx96q       1/1     Running     1 (4h15m ago)   4h16m   10.104.21.65    4am-node24   <none>           <none>
fouramf-mnc77-73-4449-minio-0                                 1/1     Running     0               4h16m   10.104.32.121   4am-node39   <none>           <none>
fouramf-mnc77-73-4449-minio-1                                 1/1     Running     0               4h16m   10.104.17.114   4am-node23   <none>           <none>
fouramf-mnc77-73-4449-minio-2                                 1/1     Running     0               4h16m   10.104.18.17    4am-node25   <none>           <none>
fouramf-mnc77-73-4449-minio-3                                 1/1     Running     0               4h16m   10.104.33.15    4am-node36   <none>           <none>
fouramf-mnc77-73-4449-pulsar-bookie-0                         1/1     Running     0               4h16m   10.104.32.123   4am-node39   <none>           <none>
fouramf-mnc77-73-4449-pulsar-bookie-1                         1/1     Running     0               4h16m   10.104.33.13    4am-node36   <none>           <none>
fouramf-mnc77-73-4449-pulsar-bookie-2                         1/1     Running     0               4h16m   10.104.17.119   4am-node23   <none>           <none>
fouramf-mnc77-73-4449-pulsar-bookie-init-2m4qx                0/1     Completed   0               4h16m   10.104.14.155   4am-node18   <none>           <none>
fouramf-mnc77-73-4449-pulsar-broker-0                         1/1     Running     0               4h16m   10.104.14.156   4am-node18   <none>           <none>
fouramf-mnc77-73-4449-pulsar-proxy-0                          1/1     Running     0               4h16m   10.104.17.110   4am-node23   <none>           <none>
fouramf-mnc77-73-4449-pulsar-pulsar-init-7k5fh                0/1     Completed   0               4h16m   10.104.32.116   4am-node39   <none>           <none>
fouramf-mnc77-73-4449-pulsar-recovery-0                       1/1     Running     0               4h16m   10.104.6.70     4am-node13   <none>           <none>
fouramf-mnc77-73-4449-pulsar-zookeeper-0                      1/1     Running     0               4h16m   10.104.17.116   4am-node23   <none>           <none>
fouramf-mnc77-73-4449-pulsar-zookeeper-1                      1/1     Running     0               4h15m   10.104.33.17    4am-node36   <none>           <none>
fouramf-mnc77-73-4449-pulsar-zookeeper-2                      1/1     Running     0               4h14m   10.104.30.7     4am-node38   <none>           <none>
[Screenshot 2024-09-03 10 45 23] [Screenshot 2024-09-03 10 45 47]

After setting the configuration, the verification passed.

Please help confirm whether the increased memory usage caused by the newly added default configuration is by design. Thanks @yanliang567 @xiaofan-luan @SimFG

xiaofan-luan commented 2 weeks ago

Do we enable the old configs? Maybe the old config is invalid due to the config change. We should fix that if this is a config issue.

wangting0128 commented 2 weeks ago

> Do we enable the old configs? Maybe the old config is invalid due to the config change. We should fix that if this is a config issue.

@SimFG Please help look into this issue, thanks.

SimFG commented 2 weeks ago

This is a bug that existed before. Regardless of whether mmap is enabled or not, for an index without raw data, additional raw data is loaded using mmap. This PR has corrected this behavior. https://github.com/milvus-io/milvus/pull/35359/files#diff-ab3e84c35be9928ca123b7d966c7d346525ab2b4080dac5006fd28a853a4b6c9

sunby commented 2 weeks ago

> This is a bug that existed before. Regardless of whether mmap is enabled or not, for an index without raw data, additional raw data is loaded using mmap. This PR has corrected this behavior. https://github.com/milvus-io/milvus/pull/35359/files#diff-ab3e84c35be9928ca123b7d966c7d346525ab2b4080dac5006fd28a853a4b6c9

Actually, it's not a bug. We intentionally load raw data with mmap to save some memory when an index does not contain it.
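
To make the two behaviors being debated concrete, here is a rough model of the decision as described in this thread. It is only a sketch: `LoadMode`, `raw_data_load_mode`, and the `honor_config` flag are hypothetical names for illustration and do not mirror the actual queryNode code.

```python
from enum import Enum


class LoadMode(Enum):
    NO_EXTRA_LOAD = "no extra load"  # index already contains the raw data
    MMAP = "mmap"                    # raw data is memory-mapped from disk
    MEMORY = "memory"                # raw data is held fully in RAM


def raw_data_load_mode(index_has_raw_data: bool,
                       mmap_scalar_field: bool,
                       honor_config: bool) -> LoadMode:
    """Model how raw data for a scalar field is loaded.

    honor_config=False models the earlier behavior (always mmap when the
    index lacks raw data); honor_config=True models the behavior after the
    linked PR, where queryNode.mmap.scalarField decides.
    """
    if index_has_raw_data:
        return LoadMode.NO_EXTRA_LOAD
    if not honor_config:
        return LoadMode.MMAP
    return LoadMode.MMAP if mmap_scalar_field else LoadMode.MEMORY


# Index without raw data, as in this benchmark's scenario.
print(raw_data_load_mode(False, False, honor_config=False).value)
print(raw_data_load_mode(False, False, honor_config=True).value)
print(raw_data_load_mode(False, True, honor_config=True).value)
```

Under this model, the earlier images always took the mmap path, the new default (`scalarField` false) takes the in-memory path, and setting the option to true restores the mmap path, which matches the retest result reported above.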

wangting0128 commented 2 weeks ago

> This is a bug that existed before. Regardless of whether mmap is enabled or not, for an index without raw data, additional raw data is loaded using mmap. This PR has corrected this behavior. https://github.com/milvus-io/milvus/pull/35359/files#diff-ab3e84c35be9928ca123b7d966c7d346525ab2b4080dac5006fd28a853a4b6c9

Now that this error has been corrected, the previous scenario fails to run :<