Best practices for running carbon-c-relay in K8s

ervikrant06 commented 3 years ago

This is not an issue but a general query. I have gone through some of the past git issues and seen discussions related to K8s, but none of them answers my question:

I want to know the recommended settings for running carbon-c-relay in a K8s environment to get maximum performance out of it. We are replacing our existing Python graphite stack with the Go stack and carbon-c-relay. I am planning to reserve a dedicated node for running carbon-c-relay. I know it's difficult to answer performance-related questions, but I just want to make the right start with the initial configuration.

Does the -w option automatically detect the number of cores reserved (requests/limits) for the pod? I am using the default Dockerfile provided in the project, which only uses the -f option. I can override -w according to the number of cores if carbon-c-relay doesn't scale workers in a K8s setup. Currently I am planning to reserve 4 cores and 8 GB of memory.
Which is preferable: running multiple instances of carbon-c-relay, or a single big carbon-c-relay pod?
Is this configuration sufficient to handle 1 million metrics per minute? We don't want any replication, but we do want to distribute the traffic across three nodes, each running multiple instances of go-graphite, again as pods.

cluster graphite
    any_of
        go-graphite-svc-node1:2003
        go-graphite-svc-node2:2003
        go-graphite-svc-node3:2003
    ;
listen
    type linemode
        2003 proto tcp
    ;
match
    *
    send to graphite
    ;

Are there any other tuning parameters that need to be adjusted?
Also, I don't see internal CPU and memory utilization stats in carbon-c-relay; these stats were available with the graphite-statsd relay implementation. Is there any way to enable CPU/memory utilization stats for carbon-c-relay and send them to the go-graphite whisper DB?
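
For a sense of scale, the 1 million metrics per minute mentioned above can be turned into per-second rates. This is only a back-of-the-envelope sketch: it assumes the any_of cluster spreads traffic roughly evenly over the three destinations and that the queue is kept per destination. The value 25000 is the "server queue size" reported by the relay's test-mode output further down in this thread; the function names are made up for illustration.

```python
def per_second(metrics_per_minute: int) -> float:
    """Convert a per-minute metric rate into a per-second rate."""
    return metrics_per_minute / 60.0


def per_destination(rate_per_second: float, destinations: int) -> float:
    """Rate each destination sees, assuming a perfectly even split."""
    return rate_per_second / destinations


def seconds_of_buffer(queue_size: int, rate_per_second: float) -> float:
    """How long a queue of queue_size metrics lasts at the given rate."""
    return queue_size / rate_per_second


total = per_second(1_000_000)      # whole relay
each = per_destination(total, 3)   # one go-graphite destination (assumed even split)

print(f"total: {total:.0f} metrics/s")
print(f"per destination: {each:.0f} metrics/s")
print(f"a queue of 25000 covers about {seconds_of_buffer(25000, each):.1f} s of stall")
```

Under these assumptions the relay sees about 16667 metrics per second, each destination about 5556, and a queue of 25000 covers roughly 4.5 seconds of a stalled destination. Real distribution and burstiness will differ, so treat the numbers as a starting point for testing.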

ervikrant06 commented 3 years ago

There seems to be an issue in detecting the right number of workers:

I started the pod with the default Dockerfile. My machine has 28 physical cores, so it shows 28 workers, which looks expected.

/ # ps aux
PID   USER     TIME  COMMAND
    1 root     14:27 /usr/bin/carbon-c-relay -f /etc/carbon-c-relay/carbon-c-relay.conf
  524 root      0:00 sh
  530 root      0:00 ps aux

When running in test mode:

configuration:
    relay hostname = graphite-c-relay-pod-69cdf6c44-67p8s
    workers = 28
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    idle connections disconnect timeout = 10m
    configuration = /etc/carbon-c-relay/carbon-c-relay.conf

I then limited the worker count to 4, but the test configuration still shows 28 workers, which looks wrong:

/ # ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 /usr/bin/carbon-c-relay -f /etc/carbon-c-relay/carbon-c-relay.conf -w 4 -l /var/log/graphite-c-relay.log
   34 root      0:00 sh
   40 root      0:00 ps aux

configuration:
    relay hostname = graphite-c-relay4w-pod-7b4fc6f87f-hflgz
    workers = 28
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    idle connections disconnect timeout = 10m
    configuration = /etc/carbon-c-relay/carbon-c-relay.conf

grobian commented 3 years ago

I don't know how you checked that, but I assume by checking /var/log/graphite-c-relay.log?

Many of your questions depend on your scenario, e.g. whether you need redundancy. On a local host, it is usually much simpler to just run a single instance if you can.

ervikrant06 commented 3 years ago

Let me rephrase my question:

When running in a K8s pod, does carbon-c-relay start a number of workers equal to the number of cores present on the physical machine, or one based on the CPU cores allocated to the pod?

Also, does the number of dispatchers indicate the number of workers?

grobian commented 3 years ago

My experience with Kubernetes is very limited, but carbon-c-relay looks at the number of CPUs via sysctl, so it likely gets the host's CPU count, and not the number assigned to the pod.
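
To illustrate the difference between the host CPU count and what a pod is allowed to use, here is a minimal sketch that reads the CPU quota from the container's cgroup files. It is not part of carbon-c-relay. It assumes the standard Linux cgroup layout (cpu.max for cgroup v2, cpu.cfs_quota_us and cpu.cfs_period_us for cgroup v1) mounted under /sys/fs/cgroup, and it only sees a value when a CPU limit is set on the container. The function names are hypothetical.

```python
import math
import os
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """cgroup v2 cpu.max holds '<quota> <period>' or 'max <period>'."""
    quota, period = text.split()
    if quota == "max":
        return None  # no limit configured
    return int(quota) / int(period)


def parse_cfs(quota_text: str, period_text: str) -> Optional[float]:
    """cgroup v1: a quota of -1 means no limit."""
    quota = int(quota_text)
    if quota <= 0:
        return None
    return quota / int(period_text)


def pod_cpu_limit(root: Path = Path("/sys/fs/cgroup")) -> Optional[float]:
    """Return the CPU limit in cores, or None if none can be found."""
    v2 = root / "cpu.max"
    if v2.is_file():
        return parse_cpu_max(v2.read_text())
    quota = root / "cpu" / "cpu.cfs_quota_us"
    period = root / "cpu" / "cpu.cfs_period_us"
    if quota.is_file() and period.is_file():
        return parse_cfs(quota.read_text(), period.read_text())
    return None


def worker_count(limit: Optional[float], host_cpus: int) -> int:
    """Round the limit up to whole workers, capped at the host CPU count."""
    if limit is None:
        return host_cpus
    return max(1, min(host_cpus, math.ceil(limit)))


if __name__ == "__main__":
    print(worker_count(pod_cpu_limit(), os.cpu_count() or 1))
```

On a 28-core host with a 4-core limit, worker_count(4.0, 28) gives 4, while a process that only asks the kernel for the CPU count would see 28.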

A dispatcher is more or less a worker.

grobian commented 3 years ago

In your case, I'd force the number of workers to the number you want (e.g. what's assigned to the pod) using the -w option, set via your configuration management.
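
One way to do this could look like the following sketch of a small entrypoint helper that builds the relay command line with an explicit -w. The binary and config paths are the ones shown in the ps output earlier in this thread. The environment variable RELAY_CPU_LIMIT is hypothetical; it is assumed to be filled in by whatever manages the pod spec (for example from the container's CPU limit).

```python
import math
from typing import List, Mapping

RELAY_BIN = "/usr/bin/carbon-c-relay"
RELAY_CONF = "/etc/carbon-c-relay/carbon-c-relay.conf"


def relay_argv(env: Mapping[str, str], default_workers: int = 1) -> List[str]:
    """Build the relay command line with an explicit worker count.

    RELAY_CPU_LIMIT is a hypothetical variable holding the pod's CPU
    limit in cores; fractional values are rounded up to whole workers.
    """
    raw = env.get("RELAY_CPU_LIMIT", "")
    try:
        workers = max(1, math.ceil(float(raw)))
    except (ValueError, OverflowError):
        # Unset or unparsable value: fall back to a fixed default.
        workers = default_workers
    return [RELAY_BIN, "-f", RELAY_CONF, "-w", str(workers)]


# A real entrypoint would pass os.environ and then os.execv(argv[0], argv);
# here the command is only printed.
print(" ".join(relay_argv({"RELAY_CPU_LIMIT": "4"})))
```

With RELAY_CPU_LIMIT set to 4 this prints the relay command ending in "-w 4", matching the 4 cores planned for the pod in the original question.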

grobian / carbon-c-relay

Best practices for running carbon-c-relay in K8s #426