grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

carbon-c-relay not distributing the metrics equally among all go-carbon PODs #428

Open ervikrant06 opened 3 years ago

ervikrant06 commented 3 years ago

The following picture shows that each physical node is receiving approx 1.4M metrics, but three PODs run on each physical node and they are not sharing the load equally. For example, the go-graphite-node-2 POD ending with 6zgs2 is the only POD receiving metrics; the other two PODs on go-graphite-node-2 don't receive any. On the other two physical nodes, two PODs share the metrics unevenly and the third POD on each node is not doing anything.

[screenshot]

I shared my conf and setup details in https://github.com/grobian/carbon-c-relay/issues/427

grobian commented 3 years ago

That's very well possible. Any reason why you need to use a consistent hash? Try using any_of; IIRC that may have a better distribution, because it doesn't tie itself to a consistent hashing ring. If you need the consistency, then consider assigning more distinct names using =.
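For reference, a minimal sketch of what the = suggestion could look like in a carbon_ch cluster; the hostnames and instance names below are hypothetical, not from this setup. As I understand it, with carbon_ch the instance name after = is what is placed on the hash ring, so assigning distinct names changes how the ring spreads metrics over the members:

    cluster graphite
        carbon_ch
            go-carbon-a:2003=a
            go-carbon-b:2003=b
            go-carbon-c:2003=c
        ;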

ervikrant06 commented 3 years ago

Earlier I tried any_of, then I switched to consistent hash, but neither helped to distribute the traffic evenly across the PODs. Since I am running go-carbon as PODs (cattle), it is not possible for me to specify direct names in the conf. I am using the K8s service names, and each service acts like a facade for the 3 go-carbon PODs running on the same node.

It shows that metrics are distributed equally among the three K8s services.

[screenshot]

But when the traffic is forwarded from the K8s services to the go-carbon PODs, I see a huge imbalance: out of 9 go-carbon PODs, only 7 are receiving traffic.

[screenshot]

Just for my understanding: say someone starts with 6 PODs distributed equally across 3 nodes (2 PODs per node). If we scale to 9 PODs (3 PODs per node), will the newly added POD on each node automatically start sharing the load, or do we need to do something manually?

grobian commented 3 years ago

I don't quite understand your setup (probably me).

I'm assuming you have a main influx of metrics that goes to carbon-c-relay. c-relay will then distribute the metrics over the available PODs, and each POD runs a go-carbon storage server. Your problem is that the amount of metrics you see incoming on each POD is very much out of balance.

If this is your setup, the any_of routing hash will look at the input metric name to determine where it needs to go. Can it be that your input metrics are skewed somehow? E.g. a lot of values for the same metric, or something like that?
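To illustrate that point (the metric names below are made up): a hashing router keys on the metric path only, so every data point of the same series lands on the same destination, no matter how many backends exist. If a few series dominate the input volume, a few backends end up dominating the load:

    # all three data points share the path dir1.dir2.requests,
    # so they all hash to the same backend
    dir1.dir2.requests 1021 1609459200
    dir1.dir2.requests 1088 1609459260
    dir1.dir2.requests 1143 1609459320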

ervikrant06 commented 3 years ago

Sorry, maybe I haven't done a good job of explaining my setup. Let me make another attempt:

1) Three instances of carbon-c-relay are running; these instances sit behind a K8s service endpoint. Each instance is configured to distribute traffic across three K8s services (go-graphite-svc-node{1,2,3}). Each K8s service (e.g. go-graphite-svc-node1) has 3 backend go-graphite PODs running on the same node (the go-graphite-node-1- PODs in the output below); similarly, the go-graphite-svc-node2 backends are the go-graphite-node-2- PODs. All the PODs running on the same node use a common underlying NVMe disk.

The last column is the physical node on which each POD (first column) is running.

NAME                                   READY   STATUS    RESTARTS   AGE   IP             NODE
go-carbonapi-5b55d9d8d7-9nhjk          1/1     Running   0          5d    172.16.43.30   kube1srv029
go-carbonapi-5b55d9d8d7-nzxhd          1/1     Running   0          5d    172.16.43.29   kube1srv029
go-carbonapi-5b55d9d8d7-zv6x7          1/1     Running   0          5d    172.16.43.31   kube1srv029
go-graphite-node-1-74d7775546-xh5mp    2/2     Running   0          5d    172.16.32.14   kube1srv024
go-graphite-node-1-74d7775546-z98mp    2/2     Running   0          5d    172.16.32.16   kube1srv024
go-graphite-node-1-74d7775546-zzqg7    2/2     Running   0          5d    172.16.32.15   kube1srv024
go-graphite-node-2-664864d54d-6w28k    2/2     Running   0          2d    172.16.41.15   kube1srv026
go-graphite-node-2-664864d54d-rnjzl    2/2     Running   0          2d    172.16.41.16   kube1srv026
go-graphite-node-2-664864d54d-v54k2    2/2     Running   0          2d    172.16.41.14   kube1srv026
go-graphite-node-3-5cf86f698-6twhc     2/2     Running   0          10d   172.16.33.12   kube1srv027
go-graphite-node-3-5cf86f698-ldft8     2/2     Running   0          5d    172.16.33.13   kube1srv027
go-graphite-node-3-5cf86f698-tlzck     2/2     Running   0          10d   172.16.33.11   kube1srv027
graphite-c-relay-pod-c894f454d-4scst   1/1     Running   0          2d    172.16.42.24   kube1srv028
graphite-c-relay-pod-c894f454d-7bbw7   1/1     Running   0          2d    172.16.42.26   kube1srv028
graphite-c-relay-pod-c894f454d-j9z7k   1/1     Running   0          2d    172.16.42.25   kube1srv028

2) If we look at the distribution of metrics from carbon-c-relay to the K8s services, it's balanced: approx 2M metrics are sent to each K8s service.

[screenshot]

3) But when I look at the metric distribution at the go-carbon POD level, I see a huge imbalance. I was expecting each POD to handle approx 700K metrics (assuming a total of 6M metrics distributed equally among 9 PODs).

[screenshot]

Example carbon-c-relay conf.

cluster graphite
    any_of
        go-graphite-svc-node1:2003
        go-graphite-svc-node2:2003
        go-graphite-svc-node3:2003
    ;

listen
    type linemode
        2003 proto tcp
    ;

match *
    send to graphite
    ;

The majority of our metrics are of the form:

dir1.dir2.dir3.dir4..dir5 date

grobian commented 3 years ago

so, you basically have 3x the following:

a) metrics -> carbon-c-relay -> go-graphitesvc{1,2,3}
b) ... -> go-graphitesvc1 -> backend{1,2,3}

You mention a) seems to produce a fair distribution of metrics, yet b) seems imbalanced.

What I don't understand yet is how b) is distributed. Is carbon-c-relay used there, or is something else performing the metric distribution?