grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

Multiple connections per destination #315

Open azhiltsov opened 7 years ago

azhiltsov commented 7 years ago

I am wondering whether it would be hard to implement multiple connections per destination, so that the relay does not try to push everything through one TCP connection. When using hashing, can I list the same host:port:hash trio more than once to achieve this?

grobian commented 7 years ago

You can't at the moment. I wonder why you feel this is necessary though?

azhiltsov commented 7 years ago

Sending more than 1M points/sec hits a limit on the receiver. Both carbon-c-relay and go-carbon barely cope with it over one connection. Multiple connections are the easiest way to fix it. As I understand it, each connection is processed in its own thread, and that is what imposes the limit.

grobian commented 7 years ago

Hmmm, and how would you like to control the number of connections to use per destination?

azhiltsov commented 7 years ago

config parameter, defaulted to 1?
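
To make the idea concrete, here is a minimal sketch in C of spreading writes over several connections to one destination. It is not carbon-c-relay code: `struct destination`, `destination_init`, `destination_pick` and `MAX_CONNS` are hypothetical names invented for illustration, and the connection count stands in for the proposed setting, falling back to 1 when unset.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONNS 16  /* hypothetical upper bound, not a carbon-c-relay constant */

/* Hypothetical destination holding several sockets to the same host:port. */
struct destination {
    int    fds[MAX_CONNS]; /* one file descriptor per connection, -1 if unset */
    size_t nconns;         /* number of connections in use */
    size_t next;           /* index of the connection to use next */
};

/* Initialise with the requested connection count.
 * A count of 0 means "not configured" and falls back to 1,
 * which matches the single-connection behaviour discussed above. */
static void destination_init(struct destination *d, size_t nconns)
{
    assert(d != NULL);
    if (nconns == 0)
        nconns = 1;
    if (nconns > MAX_CONNS)
        nconns = MAX_CONNS;
    d->nconns = nconns;
    d->next = 0;
    for (size_t i = 0; i < MAX_CONNS; i++)
        d->fds[i] = -1;
}

/* Return the descriptor to write the next metric to, round-robin. */
static int destination_pick(struct destination *d)
{
    assert(d != NULL && d->nconns > 0);
    int fd = d->fds[d->next];
    d->next = (d->next + 1) % d->nconns;
    return fd;
}
```

With a count of 1 every metric goes to the same descriptor, so nothing changes for existing setups. With a higher count, each connection could be served by its own writer thread, which is the part that would actually lift the per-thread limit.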

ginal commented 6 years ago

I am facing the same issue. Pushing millions of metrics per minute through a single TCP connection puts a lot of stress both on the receiving end (carbon-c-relays) and on the load balancer in front of them (in my use case). In previous versions (3.0 and 1.x) multiple connections could be achieved with a hack: by defining multiple hostnames for a single IP in the hosts file and adding each of those aliases to an "any_of" destination in carbon-c-relay.conf, multiple connections would be opened to the same IP. In 3.2 this no longer works: carbon-c-relay refuses to start, throwing a config error ("cannot share server carbon1:2003 with any_of/failover cluster").

jaroslawr commented 3 years ago

I face the same issue. Having one connection to a destination also means having only one thread doing the write()s to that connection; the thread saturates its CPU core and becomes a bottleneck. Since in my case carbon-c-relay and the destination server run on the same machine, I worked around it with:

cluster xyz
    any_of
        127.0.0.1:1234
        127.0.0.2:1234
        127.0.0.3:1234
        127.0.0.4:1234
    ;

It would be nice to simply have an option to spawn multiple connections/threads writing to the same destination, as @azhiltsov suggested. I think the write() thread might also have lower than optimal throughput because it does a separate write() for each single metric.
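
As a rough illustration of the batching point, the sketch below collects metric lines in a buffer and sends them with one write() instead of one write() per metric. It is not taken from carbon-c-relay: `struct batch`, `batch_add`, `batch_flush` and `BATCH_CAP` are hypothetical names, and the buffer size is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define BATCH_CAP 8192  /* hypothetical buffer size, chosen arbitrarily */

/* Hypothetical batch buffer: collects metric lines until flushed. */
struct batch {
    char   buf[BATCH_CAP];
    size_t len;     /* bytes currently buffered */
    size_t writes;  /* successful write() calls, kept only for illustration */
};

/* Send everything buffered to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error (unsent bytes stay in the buffer). */
static int batch_flush(struct batch *b, int fd)
{
    assert(b != NULL);
    size_t off = 0;
    while (off < b->len) {
        ssize_t n = write(fd, b->buf + off, b->len - off);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            /* keep the unsent remainder at the front of the buffer */
            memmove(b->buf, b->buf + off, b->len - off);
            b->len -= off;
            return -1;
        }
        b->writes++;
        off += (size_t)n;
    }
    b->len = 0;
    return 0;
}

/* Append one metric line, flushing first if it would not fit.
 * Returns 0 on success, -1 if the line is too long or the flush fails. */
static int batch_add(struct batch *b, int fd, const char *line)
{
    assert(b != NULL && line != NULL);
    size_t n = strlen(line);
    if (n > BATCH_CAP)
        return -1;
    if (b->len + n > BATCH_CAP && batch_flush(b, fd) != 0)
        return -1;
    memcpy(b->buf + b->len, line, n);
    b->len += n;
    return 0;
}
```

A caller would use batch_add() for every metric and call batch_flush() when the queue runs empty or a timer fires, so that latency stays bounded. The trade-off is a small delay per metric in exchange for far fewer system calls.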