grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

Best practices for running carbon-c-relay in K8s #426

Closed by ervikrant06 4 months ago

ervikrant06 commented 3 years ago

This is not an issue but a general query. I have gone through some of the past issues and seen discussions related to K8s, but none of them answers my question:

I would like to know the recommended settings for running carbon-c-relay in a K8s environment to get maximum performance out of it. We are replacing our existing Python Graphite stack with the Go stack and carbon-c-relay, and I am planning to reserve a dedicated node for carbon-c-relay. I know performance questions are difficult to answer, but I want to get off to a good start with this initial configuration:

cluster graphite
    any_of
        go-graphite-svc-node1:2003
        go-graphite-svc-node2:2003
        go-graphite-svc-node3:2003
    ;
listen
    type linemode
        2003 proto tcp
    ;
match
    *
    send to graphite
    ;
ervikrant06 commented 3 years ago

There seems to be an issue with detecting the right number of workers:

I started the pod with the default Dockerfile. My machine has 28 physical cores, so it shows 28 workers, which looks as expected.

/ # ps aux
PID   USER     TIME  COMMAND
    1 root     14:27 /usr/bin/carbon-c-relay -f /etc/carbon-c-relay/carbon-c-relay.conf
  524 root      0:00 sh
  530 root      0:00 ps aux

Output when running in test mode:

configuration:
    relay hostname = graphite-c-relay-pod-69cdf6c44-67p8s
    workers = 28
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    idle connections disconnect timeout = 10m
    configuration = /etc/carbon-c-relay/carbon-c-relay.conf

I then limited the worker count to 4, but the test-mode output still shows 28 workers, which looks wrong:

/ # ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 /usr/bin/carbon-c-relay -f /etc/carbon-c-relay/carbon-c-relay.conf -w 4 -l /var/log/graphite-c-relay.log
   34 root      0:00 sh
   40 root      0:00 ps aux

configuration:
    relay hostname = graphite-c-relay4w-pod-7b4fc6f87f-hflgz
    workers = 28
    send batch size = 2500
    server queue size = 25000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 600ms
    idle connections disconnect timeout = 10m
    configuration = /etc/carbon-c-relay/carbon-c-relay.conf
grobian commented 3 years ago

I don't know how you checked that, but I assume by checking /var/log/graphite-c-relay.log?

Many of your questions depend on your scenario, e.g. whether you need redundancy. On a local host, it is usually much simpler to just run a single instance if you can.

ervikrant06 commented 3 years ago

Let me rephrase my question:

When running in a K8s pod, does carbon-c-relay start as many workers as there are cores present on the physical machine, or as many as the CPU cores allocated to the pod?

Also, does the number of dispatchers indicate the number of workers?

grobian commented 3 years ago

My experience with Kubernetes is very limited, but carbon-c-relay looks at the number of CPUs via sysctl, so it likely gets the host's CPU count, not the number assigned to the pod.

A dispatcher is more or less a worker.
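
To see the host-versus-pod difference from inside a container, a rough Python sketch can compare the CPU count the process sees with the cgroup CPU quota. This is only an illustration, not how carbon-c-relay itself works. It assumes a cgroup v2 node where the limit is exposed at /sys/fs/cgroup/cpu.max; the function names are hypothetical.

```python
import os


def cgroup_cpu_limit(cpu_max_text):
    """Parse the contents of a cgroup v2 cpu.max file ("<quota> <period>").

    Returns the limit in CPUs as a float, or None when the quota is "max"
    (meaning no limit is set).
    """
    quota, period = cpu_max_text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def read_pod_cpu_limit(path="/sys/fs/cgroup/cpu.max"):
    """Read the limit from the assumed cgroup v2 path; None if unavailable."""
    try:
        with open(path) as fh:
            return cgroup_cpu_limit(fh.read())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # os.cpu_count() reports the logical CPUs of the machine, not the quota.
    print("CPUs visible to the process:", os.cpu_count())
    print("cgroup CPU limit:", read_pod_cpu_limit())
```

On a 28-core node with a 4-CPU limit on the pod, the first number would be expected to stay at the host value while the second reflects the quota, which matches the behaviour reported above.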

grobian commented 3 years ago

In your case, I'd force the number of workers to what you want (e.g. what's assigned to the pod) by passing the -w option through your configuration management.
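
One way configuration management could derive the -w value is sketched below in Python. It turns a Kubernetes CPU quantity such as "4" or "2500m" into a worker count and builds the command line from the -f and -w flags shown earlier in this thread. The helper names are hypothetical, and rounding down with a minimum of one worker is an assumption, not a project recommendation.

```python
import math


def workers_from_cpu_limit(cpu_limit):
    """Convert a Kubernetes CPU quantity ("4", "0.5", "2500m") to a worker count.

    A trailing "m" means millicores (1000m = 1 CPU). The result is rounded
    down, with a floor of one worker (an assumption made for this sketch).
    """
    if cpu_limit.endswith("m"):
        cpus = int(cpu_limit[:-1]) / 1000
    else:
        cpus = float(cpu_limit)
    return max(1, math.floor(cpus))


def relay_argv(config_path, cpu_limit):
    """Build the carbon-c-relay command line with an explicit worker count."""
    workers = workers_from_cpu_limit(cpu_limit)
    return ["/usr/bin/carbon-c-relay", "-f", config_path, "-w", str(workers)]


if __name__ == "__main__":
    print(relay_argv("/etc/carbon-c-relay/carbon-c-relay.conf", "4"))
```

The resulting list could be used as the container command in the pod spec, so the worker count follows the pod's CPU limit instead of the host's core count.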