Hmmm, by design I never included the option to use parallel delivery. I need to understand what is causing your load: whether it is computing the hash, or locking on the main input queue (in which case threading won't help).
There is no way to do it right now, but since the code already has provisions to share an output queue, a global option like you mention may not be too difficult.
If I create an experimental patch, would you be able to test whether it has the desired effect?
Hi @grobian, thank you for your reply.
> If I create an experimental patch, would you be able to test whether it has the desired effect?
Absolutely, I would happily do that.
I realize I have also omitted details about the release version and platform; I will confirm these and provide some more metrics tomorrow.
Current env:

- carbon-c-relay version: 3.4 (we should test with a newer release; I didn't realize we were on such an old release)
- OS: Debian Jessie (kernel version 4.19)
- CPU: Intel(R) Xeon(R) CPU E5-2699A v4 @ 2.40GHz
As a baseline, looking at our current data, it appears we are spending around 2.6–3.3 µs of wall time per metric for these `carbon_ch` outputs:

`destinations.*.wallTime_us / destinations.*.sent`
Based on the maximum value, we should be able to send around 300k metrics per CPU-second, which is consistent with what we observed last week.
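(That ceiling follows directly from the per-metric cost: at 3.3 µs per metric, a single core can deliver at most 1 s ÷ 3.3 µs ≈ 303,000, i.e. roughly 300k metrics per second.)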
I will try to test with a newer release tomorrow and report back if there is any change to the performance.
ok, I'll wait for that. If I have some cycles before then, I'll see if I can prepare something anyway.
any news here? :)
@grobian sorry, we didn't check with a newer version, but I can't remember why. We have added a number of additional storage servers, which decreases the number of metrics per back end to a level that isn't a concern any more. We are trying to deprecate graphite, so I don't know if there will be an upgrade in the future to compare with. Thank you for your responses and sorry we couldn't give you any additional data. Happy if you want to close this issue.
thanks for coming back on this, I guess graphite is on the way out in more places
From what I can see, each destination in a `carbon_ch` output block gets a single thread. We have a configuration like the one sketched below, and the thread for the first output is maxed out.
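The original config snippet didn't survive the copy into this thread; here is a minimal sketch of the shape described, assuming placeholder addresses (`1.2.3.4:2203` is the destination mentioned later in this issue; the other hosts are made up):

```
# a single carbon_ch cluster; each destination below is served by one writer thread
cluster storage
    carbon_ch
        1.2.3.4:2203
        1.2.3.5:2203
        1.2.3.6:2203
    ;

# route everything to it
match *
    send to storage
    stop
    ;
```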
This results in queues rising until the limit is reached, then metrics start to get dropped.
Is there a way that we could get multiple threads for a single destination without changing the consistent hash?
I could do something like this:
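The snippet is missing here as well; presumably something along these lines, listing the same storage host under two ports with distinct instance labels (carbon-c-relay's `host[:port][=instance]` syntax), so the relay treats it as two destinations and gives each its own writer thread. A hypothetical reconstruction:

```
# same physical host listed twice; two destinations means two writer threads
cluster storage
    carbon_ch
        1.2.3.4:2203=a
        1.2.3.4:2204=b
    ;
```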
And have the reverse proxy on each storage host "merge" the results. But given that there aren't really 2 instances on the storage machine, the consistent-hash view of tools like `carbonate` or `buckytools` will not be consistent with the view of the relay.

Another option would be to spin up additional `carbon-c-relay` services and load balance output between them:
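Again the snippet was lost; a hypothetical sketch of the front relay, fanning out over four local writer relays using an `any_of` cluster (the loopback ports are made up):

```
# front relay: load-balance across four local writer relays,
# each of which runs the original carbon_ch configuration
cluster storage_writers
    any_of
        127.0.0.1:2103
        127.0.0.1:2104
        127.0.0.1:2105
        127.0.0.1:2106
    ;

match *
    send to storage_writers
    stop
    ;
```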
Here each entry in `storage_writers` is another instance of `carbon-c-relay` with the original configuration shown at the top. This seems like a really long walk just to get 4 threads writing to `1.2.3.4:2203`. I guess another way would be to have multiple instances of `carbon-c-relay` and reverse proxy all incoming metrics to them. That eliminates one instance of `carbon-c-relay`, but we're still multiplying instances of `carbon-c-relay` just to get multiple threads per destination. It would be nice to be able to specify the number of workers per destination the way we can specify the number of dispatchers with `-w`.

Thank you for any suggestions.