go-graphite / go-carbon

Golang implementation of Graphite/Carbon server with classic architecture: Agent -> Cache -> Persister
MIT License

[BUG] v0.17.0 CPU usage problem #557

Open Knud3 opened 1 year ago

Knud3 commented 1 year ago

After updating from v0.16.2 to v0.17.0, CPU usage has skyrocketed. I actually tried a self-built version last January and it had the same problem, so the problematic commit is somewhere between the v0.16.2 release and February.

The go-carbon pod used to consume next to no CPU, but now it hogs a minimum of 3.1 cores. It runs on its own Kubernetes node with 8 CPUs and 64 GB RAM.

Does my config raise any red flags?

go-carbon.conf:

[common]
max-cpu = 7

[whisper]
enabled = true
workers = 8
max-updates-per-second = 500
sparse-create = true
flock = true
hash-filenames = true
compressed = true
remove-empty-file = true
online-migration = true
online-migration-rate = 5
online-migration-global-scope = "xff,aggregationMethod,schema"

[cache]
max-size = 10000000
write-strategy = "noop"

[udp]
enabled = true
listen = ":2003"
buffer-size = 0

[tcp]
enabled = true
listen = ":2003"
buffer-size = 0

[pickle]
enabled = true
listen = ":2004"
max-message-size = 67108864
buffer-size = 0

[carbonserver]
enabled = true
max-creates-per-second = 50
metrics-as-counters = false
read-timeout = "2m0s"
write-timeout = "2m0s"
request-timeout = "2m0s"
query-cache-enabled = true
query-cache-size-mb = 32768
find-cache-enabled = true
trigram-index = true
trie-index = true
scan-frequency = "10m0s"
file-list-cache = "/var/lib/graphite/carbonserver-file-list-cache.bin"
file-list-cache-version = 2
concurrent-index = true
realtime-index = 100000000
max-inflight-requests = 0
no-service-when-index-is-not-ready = true
cache-scan = false
max-globs = 10000
fail-on-max-globs = false
max-metrics-globbed  = 10000000
max-metrics-rendered = 1000000
graphite-web-10-strict-mode = true
empty-result-ok = false

storage-schemas.conf:

[default]
pattern = .*
retentions = 5m:1d,1h:1y

storage-aggregation.conf:

[default]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average

carbonapi.yaml:

notFoundStatusCode: 404
cache:
  type: "mem"
  size_mb: 4096
  defaultTimeoutSec: 10800
  shortTimeoutSec: 60
backendCache:
  type: "mem"
  size_mb: 24576
  defaultTimeoutSec: 10800
  shortTimeoutSec: 60
cpus: 6
tz: ""
functionsConfig:
  timeShift: /etc/carbonapi/timeShift.yaml
concurency: 1000
combineMultipleTargetsInOne: true
idleConnections: 200
pidFile: ""
upstreams:
  graphite09compat: false
  keepAliveInterval: "15s"
  timeouts:
    find: "300s"
    render: "300s"
    connect: "500ms"
  backendsv2:
    backends:
      -
        groupName: "go-carbon"
        protocol: "carbonapi_v3_pb"
        lbMethod: "all"
        doMultipleRequestsIfSplit: true
        maxTries: 3
        maxBatchSize: 500
        concurrencyLimit: 0
        servers:
          - "http://carbonserver:8080"
expireDelaySec: 60
unicodeRangeTables:
  - "Latin"
  - "Common"

timeShift.yaml:

resetEndDefaultValue: true

Transition from v0.16.2 to v0.17.0: [image: go-carbon_resources]

deniszh commented 1 year ago

Hi @Knud3

I'm afraid it would be quite hard to find the problematic commit just by guessing. I see no specific red flags either. So you would need to profile it with pprof or try to find the commit with git bisect.
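
For anyone following along, here is a minimal sketch of how a CPU profile could be captured for the pprof route. It assumes go-carbon's pprof HTTP listener is enabled and serves the standard net/http/pprof endpoints; the address you pass in (for example `localhost:7007`) is an assumption, so check the pprof settings in your own go-carbon.conf. The helper names `profileURL` and `fetchProfile` are made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// profileURL builds the standard net/http/pprof CPU profile URL.
// addr is an assumption: the host:port of go-carbon's pprof listener.
func profileURL(addr string, seconds int) string {
	q := url.Values{}
	q.Set("seconds", fmt.Sprint(seconds))
	u := url.URL{Scheme: "http", Host: addr, Path: "/debug/pprof/profile", RawQuery: q.Encode()}
	return u.String()
}

// fetchProfile downloads a CPU profile and writes it to outPath.
func fetchProfile(addr string, seconds int, outPath string) error {
	// The server samples for `seconds` before it responds, so allow extra time.
	client := &http.Client{Timeout: time.Duration(seconds+10) * time.Second}
	resp, err := client.Get(profileURL(addr, seconds))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	f, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// Usage: go run . <host:port of the pprof listener>
	if len(os.Args) < 2 {
		fmt.Println("usage: fetchprofile <host:port>")
		return
	}
	if err := fetchProfile(os.Args[1], 30, "cpu.pprof"); err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
		os.Exit(1)
	}
	fmt.Println("wrote cpu.pprof; inspect with: go tool pprof -top cpu.pprof")
}
```

Capturing one profile on v0.16.2 and one on v0.17.0 under similar load and comparing the top entries should narrow down which code path is burning the CPU, which in turn makes a git bisect much shorter.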

Knud3 commented 1 year ago

I see. Thanks. I'll try it with pprof and get back with the results.

wedobetter commented 7 months ago

Same issue here, running in K8s with requests.limit.cpu=2

go-carbon: v0.17.3/amd64

NAME                      CPU(cores)   MEMORY(bytes)   
carbon-7bccb5d9dc-5kxwt   2027m        43Mi            

When I use the sample go-carbon config from the docs, I get this (0.8% CPU):

NAME                      CPU(cores)   MEMORY(bytes)   
carbon-7cb9d8d577-hcx22   8m           5Mi

deniszh commented 7 months ago

Hi @wedobetter, could you please elaborate a bit on your issue? From your message it's a bit unclear what the difference between the two cases is.
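
To make the two `kubectl top` readings above easier to compare, here is a small sketch that converts Kubernetes CPU quantities into cores and relates them to the 2-core limit mentioned in the comment. The helper name `parseCPU` is made up for this example, and it only handles plain core counts and the `m` (millicore) suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPU converts a Kubernetes CPU quantity ("2027m" or "2") into cores.
// Only plain numbers and the millicore suffix "m" are handled in this sketch.
func parseCPU(q string) (float64, error) {
	if strings.HasSuffix(q, "m") {
		milli, err := strconv.ParseFloat(strings.TrimSuffix(q, "m"), 64)
		if err != nil {
			return 0, err
		}
		return milli / 1000, nil
	}
	return strconv.ParseFloat(q, 64)
}

func main() {
	limit := 2.0 // CPU limit quoted in the comment above
	for _, q := range []string{"2027m", "8m"} {
		cores, err := parseCPU(q)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s = %.3f cores (%.1f%% of one core, %.1f%% of the limit)\n",
			q, cores, cores*100, cores/limit*100)
	}
}
```

Read this way, 8m is 0.008 cores, which matches the quoted 0.8% of one core, while 2027m is about 2 cores, so the pod with the custom config is pinned at roughly its full CPU limit.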