m3db / m3

M3 monorepo - Distributed TSDB, Aggregator and Query Engine, Prometheus Sidecar, Graphite Compatible, Metrics Platform
https://m3db.io/
Apache License 2.0
4.76k stars 453 forks

v15.0 m3coordinator cpu high #2389

Closed: ning1875 closed this issue 4 years ago

ning1875 commented 4 years ago

After upgrading m3coordinator to v15.0, its CPU usage grew high. Here is my m3coordinator.yaml; this coordinator is used for Prometheus remote_read only.

listenAddress:
  type: "config"
  value: "0.0.0.0:7201"

metrics:
  scope:
    prefix: "coordinator"
  prometheus:
    handlerPath: /metrics
    listenAddress: 0.0.0.0:7203 # until https://github.com/m3db/m3/issues/682 is resolved
  sanitization: prometheus
  samplingRate: 1.0
  extended: none
limits:
  perQuery:
    maxFetchedSeries: 10000000
    maxComputedDatapoints: 10000000
    maxFetchedDatapoints: 10000000

tagOptions:
  idScheme: quoted

clusters:
# Fill-out the following and un-comment before using, and
# make sure indent by two spaces is applied.
  - namespaces:
      - namespace: default
        retention: 360h
        type: unaggregated
    client:
      config:
        service:
          env: default_env
          zone: embedded
          service: m3db
          cacheDir: /var/lib/m3kv
          etcdClusters:
            - zone: embedded
              endpoints:
                - xxx:2379
                - xxx:2379
                - xxx:2379
#                ... etc, list only M3DB seed nodes
      writeConsistencyLevel: majority
      #readConsistencyLevel: unstrict_majority
      readConsistencyLevel: one
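
For context, a remote_read-only setup like the one described above is typically paired with a Prometheus configuration along these lines. This is a minimal sketch, not the poster's actual file: the host name `m3coordinator-host` is a hypothetical placeholder, the port is taken from the `listenAddress` in the config above, and the path is the coordinator's Prometheus remote read endpoint.

```yaml
# prometheus.yml (sketch; the host name is a hypothetical placeholder)
remote_read:
  - url: "http://m3coordinator-host:7201/api/v1/prom/remote/read"
    # Query the coordinator even for time ranges that the local
    # Prometheus storage should already cover completely.
    read_recent: true
```

With `read_recent: true`, every Prometheus query is also sent to the coordinator, so heavy or highly concurrent queries translate directly into coordinator load.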

robskillington commented 4 years ago

@ning1875 I have a feeling you may be running across a lot of CPU cores (causing high lock contention). I would advise running one coordinator per 4 CPU cores.

See "multiProcess" in the coordinator and query service config for running more processes in a single instance (or, if you use a container orchestration platform such as Kubernetes, just run more coordinators and load balance between them on the same hardware): https://github.com/m3db/m3/blob/master/src/cmd/services/m3query/config/config.go
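
As an illustration of the advice above, a minimal sketch of what the "multiProcess" section might look like. The 16-core host is a hypothetical assumption, and the field names `enabled` and `count` are taken from the `MultiProcessConfiguration` struct in the linked config.go as best recalled; verify them against the version you run.

```yaml
# m3coordinator.yaml (sketch; values assume a hypothetical 16-core host)
multiProcess:
  # Run several coordinator sub-processes inside one instance.
  enabled: true
  # 16 cores / 4 cores per coordinator process = 4 processes.
  count: 4
```

On Kubernetes, the equivalent would be running four coordinator replicas behind a load balancer instead of setting this section.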

ning1875 commented 4 years ago

Thank you for your reply. In fact, it was caused by my own improper use. At that time, m3coordinator was handling a high concurrency of heavy Prometheus queries (each loading many samples).
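
One possible guard against this kind of load, which the thread itself does not confirm was used, is to tighten the per-query limits already present in the config above. The sketch below reuses the same `limits.perQuery` keys; the values are hypothetical and would need tuning for a real workload, and how each limit is enforced may vary by M3 version.

```yaml
# m3coordinator.yaml (sketch; limit values are hypothetical)
limits:
  perQuery:
    # Far lower than the 10000000 in the original config, so a single
    # heavy query is bounded in how many series it can fetch.
    maxFetchedSeries: 100000
    # Likewise bounds the datapoints a single query may fetch.
    maxFetchedDatapoints: 1000000
```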