go-graphite / go-carbon

Golang implementation of Graphite/Carbon server with classic architecture: Agent -> Cache -> Persister
MIT License

go-carbon high read requests. #297

Open vedaprasad-j opened 5 years ago

vedaprasad-j commented 5 years ago

Hello all,

I am doing a POC on go-carbon to replace our legacy carbon-cache setup. go-carbon is running as a container and the storage is an AWS EBS volume. The flow is carbon-relay-ng -> go-carbon -> carbonzipper -> carbonapi -> grafana. These are the relevant configs:

[common]
user = "root"
metric-interval = "1m0s"
max-cpu = 2

[whisper]
workers = 2
max-updates-per-second = 500
max-creates-per-second = 50
hard-max-creates-per-second = true
sparse-create = false
flock = false
enabled = true
hash-filenames = true

[cache]
max-size = 10000
write-strategy = "max"

[carbonserver]
listen = ":8080"
enabled = true
buckets = 100
metrics-as-counters = false
read-timeout = "300s"
write-timeout = "300s"
query-cache-enabled = true
query-cache-size-mb = 0
find-cache-enabled = true

Container host config: 16 cores and 32GB RAM.

I can see lots of reads happening on the drive, along with high memory consumption (22 GB), when there are hardly any requests coming in.

%CPU %MEM CMD
42.7 74.7 go-carbon -config /data/graphite/carbon.conf

Device:   rrqm/s  wrqm/s      r/s     w/s     rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
nvme1n1     0.00  230.80  1296.80    4.40  28373.60   940.00     45.06      0.32   0.67     0.67     2.18   0.20  25.44
nvme1n1     0.00  273.42   982.74  528.53  11224.08  3617.63     19.64      0.69   0.71     0.68     0.78   0.17  25.33
nvme1n1     0.00  300.20  1440.00    5.40  31812.80  1221.60     45.71      0.25   0.58     0.57     2.67   0.14  20.88
nvme1n1     0.00  307.60  1748.20  115.80  41516.00  1703.20     46.37      0.42   0.61     0.62     0.54   0.17  31.28

Is there something configured wrong on my end? Any help would be highly appreciated.

Thanks in advance, Ved.

Civil commented 5 years ago

The read load is likely caused by carbonserver. By default the trigram index is enabled, and to build it carbonserver scans the filesystem once in a while. The trigram index is held in memory, but it speeds up queries that contain globs (and some stats, such as access stats, are based on helper structures that the trigram index uses). So it's a tradeoff. If you explicitly disable it, it will likely be easier on your disk.
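To illustrate the tradeoff described above, here is a minimal Python sketch of how a trigram index can narrow down glob queries over metric names. It is not go-carbon's implementation (which is written in Go); the class `TrigramIndex`, its `find` method, and the metric names are hypothetical and only show the general idea: every name is indexed in memory by its three-character substrings, and a glob is first reduced to the trigrams of its literal parts before the full match runs.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Hypothetical in-memory index: trigram -> ids of names containing it."""

    def __init__(self, names):
        self.names = list(names)
        self.postings = defaultdict(set)
        for idx, name in enumerate(self.names):
            for tri in trigrams(name):
                self.postings[tri].add(idx)

    def find(self, pattern):
        """Return the sorted names matching a glob pattern."""
        candidates = None
        # Only '*' and '?' are treated as wildcards for prefiltering.
        # Patterns with character classes skip the prefilter entirely.
        if "[" not in pattern:
            for chunk in re.split(r"[*?]", pattern):
                for tri in trigrams(chunk):
                    ids = self.postings.get(tri, set())
                    candidates = ids if candidates is None else candidates & ids
        if candidates is None:
            # No usable trigram: fall back to checking every name.
            candidates = range(len(self.names))
        # The prefilter can produce false positives, so confirm each one.
        return sorted(
            self.names[i] for i in candidates
            if fnmatchcase(self.names[i], pattern)
        )


index = TrigramIndex([
    "servers.web01.cpu.user",
    "servers.web02.cpu.user",
    "servers.db01.mem.free",
])
print(index.find("servers.web*.cpu.user"))
print(index.find("*.mem.*"))
```

The cost side of the tradeoff is visible here too: the postings map grows with the number of unique metric names, and the list of names has to come from somewhere, which in carbonserver's case is the periodic filesystem scan.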

Also, carbonserver is only required if you use the "CLUSTER_SERVERS" option on the graphite-web side or if you want to use carbonapi. If you run graphite-web on the same VM, you can just configure it to read the whisper files itself from the same storage dir and use the carbonlink protocol to get the most recent data out of go-carbon.

Please also note that carbonserver does not support graphite tags as of now.
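As a rough sketch of the graphite-web side of that setup, the fragment below shows the two `local_settings.py` settings involved. `WHISPER_DIR` and `CARBONLINK_HOSTS` are graphite-web settings, but the values are assumptions: the path is a placeholder for wherever go-carbon writes its whisper files, and the address assumes go-carbon's carbonlink listener is enabled on port 7002 on the same host.

```python
# local_settings.py for graphite-web (a sketch; values are assumptions)

# Directory where go-carbon writes its whisper files.
# Placeholder path: use the data directory from your go-carbon config.
WHISPER_DIR = "/data/graphite/whisper"

# Address of go-carbon's carbonlink listener, queried for datapoints
# that are still in the cache and not yet written to disk.
# Assumes carbonlink is enabled in go-carbon and listening on this port.
CARBONLINK_HOSTS = ["127.0.0.1:7002"]
```

With this layout graphite-web reads history from disk and only asks go-carbon for the most recent points, so carbonserver can stay disabled.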

azhiltsov commented 5 years ago

@vedaprasad-j
How many unique metric names do you have on the server?
How many whisper files on the disk are getting updates within 5 minutes (roughly, as a % of the above)?
How many data points does this server receive per second on average?
Did you follow the tuning recommendation from here?
What filesystem do you use for whispers?
Do you see a sign of throttling in the log? 'metric creation throttled' with "dropped":true
What is the file_scan_runtime from the logs?

Can you also provide the output of
cat /proc/slabinfo
cat /proc/meminfo
ps -C go-carbon -o vsz,rss,%mem,cmd
grep retentions /etc/carbon/storage-schemas.conf

vedaprasad-j commented 5 years ago

@azhiltsov: I disabled carbonserver, enabled carbonlink, and have graphite-web read the whisper files directly and query the carbonlink port. Memory usage and reads have dropped now. Unfortunately I no longer have the EC2 instance that was running go-carbon with carbonserver enabled.

Below are the details with carbonserver disabled.

The server receives 25K requests per second.
Yes, I have followed the tuning recommendation mentioned.
We use ext4 for the whisper files.
I was seeing 'metric creation throttled' with "dropped":true in the logs when carbonserver was enabled.

grep -i retentions /etc/go-carbon/storage-schemas.conf
RETENTIONS = 60s:1d,5m:7d
RETENTIONS = 60s:1d,5m:7d,1h:60d

ps -C go-carbon -o vsz,rss,%mem,cmd
VSZ     RSS     %MEM CMD
1797968 1641376 21.0 go-carbon -config /data/graphite/carbon.conf
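For sizing purposes, the two retention schemas in the grep output above determine how large each whisper file is on disk. The sketch below works that out under stated assumptions: the standard whisper layout (16-byte file header, 12 bytes of archive info per archive, 12 bytes per data point), fully allocated files (the posted config has sparse-create = false), and retentions written with unit suffixes as shown. The function names are hypothetical.

```python
# Assumed whisper on-disk layout, in bytes.
METADATA_BYTES = 16       # file header
ARCHIVE_INFO_BYTES = 12   # per-archive header entry
POINT_BYTES = 12          # timestamp + value

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(token):
    """Convert a token such as '60s' or '7d' to seconds (unit suffix required)."""
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def whisper_file_size(retentions):
    """Return the size in bytes of a fully allocated whisper file."""
    archives = [part.strip() for part in retentions.split(",")]
    points = 0
    for archive in archives:
        step, duration = archive.split(":")
        points += to_seconds(duration) // to_seconds(step)
    return METADATA_BYTES + ARCHIVE_INFO_BYTES * len(archives) + POINT_BYTES * points


for schema in ("60s:1d,5m:7d", "60s:1d,5m:7d,1h:60d"):
    print(schema, whisper_file_size(schema), "bytes")
```

Multiplying the per-file size by the number of unique metric names gives a rough estimate of the on-disk footprint for each schema.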

I will most likely go with carbonserver disabled and use graphite-web in production, as this setup has been stable for the past few days.

Thanks for the help and feedback!