Behavior of relay while downstream servers are down

grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter

Apache License 2.0

Behavior of relay while downstream servers are down #421

Open mmueller101 opened 4 years ago

mmueller101 commented 4 years ago

While downstream servers are being rebooted, is there a way for the relay to queue metrics for a server until it comes back up? What is the intended behavior of the relay in these types of situations so that metrics are not lost?

grobian commented 4 years ago

That depends on the cluster type you're using. any_of and failover clusters typically try not to queue metrics up, but any of the xxxx_ch clusters will queue them up until the queue overflows.
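To illustrate what "queue up until overflow" means in practice, here is a minimal Python sketch of a bounded per-destination queue. It is a toy model, not carbon-c-relay's actual code: the class name, the `simulate` helper, the sample metric line, and the choice to discard the oldest entry on overflow are all assumptions made for illustration.

```python
from collections import deque


class BoundedQueue:
    """Toy per-destination queue that holds metrics while the destination is down."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def enqueue(self, metric):
        # Assumption: when the queue is full, the oldest entry is discarded
        # to make room for the new one, and the drop is counted.
        if len(self.items) >= self.capacity:
            self.items.popleft()
            self.dropped += 1
        self.items.append(metric)

    def drain(self):
        """Return and clear everything queued, as if the destination came back."""
        sent = list(self.items)
        self.items.clear()
        return sent


def simulate(capacity, metrics_during_outage):
    """Queue metrics during an outage; return (delivered_count, dropped_count)."""
    q = BoundedQueue(capacity)
    for i in range(metrics_during_outage):
        # Hypothetical metric line in Graphite plaintext form: path value timestamp
        q.enqueue(f"app.requests.count {i} {1600000000 + i}")
    delivered = q.drain()
    return len(delivered), q.dropped


print(simulate(5, 3))  # (3, 0): outage fits in the queue, nothing is lost
print(simulate(5, 8))  # (5, 3): queue overflowed, three metrics were dropped
```

As long as the outage produces fewer metrics than the queue can hold, everything is delivered once the destination returns; past that point every new metric costs one queued metric.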

mmueller101 commented 4 years ago

I am using the fnv1a cluster type. During some tests in which a downstream go-carbon went down for a minute, I immediately saw metrics being dropped by the relay. I was expecting behavior similar to what you stated above, with metrics queuing up until go-carbon came back up. Is there something else that I should have configured to allow this type of behavior? Thanks for the quick response.

deniszh commented 4 years ago

@mmueller101: you need to configure a gigantic buffer in the relay for that. But I would recommend using a specialized proxy for that instead: https://github.com/leoleovich/grafsy
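How large "gigantic" has to be can be estimated with simple arithmetic: the queue must hold every metric destined for the downed server for the length of the outage. The sketch below assumes a hypothetical rate of 10,000 metrics per second to one destination and the one-minute outage from the test above; the function name and the headroom factor are illustrative. The result would be the per-server queue size given to the relay (the `-q` option, as far as I know; check the man page for your version).

```python
import math


def required_queue_size(metrics_per_second, outage_seconds, headroom=1.5):
    """Estimate how many metrics a per-server queue must hold to survive an outage.

    metrics_per_second: rate sent to ONE destination (hypothetical input).
    outage_seconds: how long the destination is expected to be down.
    headroom: safety factor for bursts (an arbitrary assumption, not a relay default).
    """
    if metrics_per_second < 0 or outage_seconds < 0 or headroom < 1:
        raise ValueError("rate and outage must be >= 0, headroom must be >= 1")
    return math.ceil(metrics_per_second * outage_seconds * headroom)


# One destination receiving 10,000 metrics/s, down for 60 seconds:
print(required_queue_size(10_000, 60))                # 900000
print(required_queue_size(10_000, 60, headroom=1.0))  # 600000
```

The queue lives in the relay's memory, so the estimate also gives a rough idea of the memory cost, which is part of why a disk-backed proxy can be the more practical choice for long outages.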

mmueller101 commented 4 years ago

Nice. This looks like the answer here. I'll start testing this out. Thank you, @deniszh.