Closed by adf55 4 years ago
Hi @adf55
If the relays/aggregators sit on different hosts, how is aggregation-rules.conf shared with and read by the relays?
The user (i.e. you) has to take care of that.
Even when the relays have access to the independent aggregation-rules.conf on each aggregator, how does each relay know which aggregator host to send a particular metric to?
The relay applies the rules from aggregation-rules.conf when it computes the route with the consistent hash. If you want the gory details, I can only point you to the code, or to @slackhappy, who is the author of the original PR: https://github.com/graphite-project/carbon/pull/32
Thanks so much @deniszh.
To clarify then - I should be manually copying aggregation-rules.conf from aggregators to relays supporting aggregated consistent hashing.
Even after looking at the code, I'm still slightly confused about the second point. Given that a relay connects to several downstream aggregators with independent rules, it will have access to multiple aggregation-rules.conf files. How does it know which rules map to which aggregator machine? Nowhere in aggregation-rules.conf is it specified which host handles a rule.
After looking at the code some more - I've realized this is how things work (for future users of this feature):
The relays do not match aggregation rules to a host intelligently. In fact, how they select the host is essentially pseudorandom (by hash). All this feature guarantees is that the same aggregated metric is sent to the same host deterministically.
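To make the point above concrete, here is a minimal sketch of the idea, not Carbon's actual implementation. The `RULES` list, `aggregated_name`, `HashRing`, and `route` are hypothetical names written for illustration, the rule is hand-written rather than parsed from aggregation-rules.conf, and only the first matching rule is considered. The relay rewrites the incoming metric name into the aggregate name a rule would produce, then hashes that name to pick a destination:

```python
import bisect
import hashlib
import re

# Hypothetical rule as (input regex, output template). Real Carbon parses
# rules from aggregation-rules.conf; this pair is hand-written for the sketch.
RULES = [
    (
        re.compile(
            r"^(?P<env>[^.]+)\.applications\.(?P<app>[^.]+)\.[^.]+\.requests$"
        ),
        "{env}.applications.{app}.all.requests",
    ),
]


def aggregated_name(metric):
    """Return the aggregate name of the first matching rule, else the metric."""
    for pattern, template in RULES:
        match = pattern.match(metric)
        if match:
            return template.format(**match.groupdict())
    return metric


class HashRing:
    """A small consistent hash ring (illustrative, not Carbon's class)."""

    def __init__(self, nodes, replicas=100):
        # Place several virtual points per node on the ring.
        self._ring = sorted(
            (self._position(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._positions = [pos for pos, _ in self._ring]

    @staticmethod
    def _position(key):
        return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

    def get_node(self, key):
        # First ring point at or after the key's position, wrapping around.
        index = bisect.bisect_left(self._positions, self._position(key))
        return self._ring[index % len(self._ring)][1]


def route(metric, ring):
    """Pick a destination by hashing the aggregate name, not the raw name."""
    return ring.get_node(aggregated_name(metric))
```

With `ring = HashRing(["X", "Y", "Z"])`, the metrics `prod.applications.web.host1.requests` and `prod.applications.web.host2.requests` both reduce to `prod.applications.web.all.requests`, so `route` sends them to the same aggregator. Which of X, Y, or Z that is depends only on the hash.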
What this means is that if relay A is writing to downstream aggregators X, Y, Z, you must ensure that X, Y, Z share the same aggregation rules. This is because you cannot guarantee which aggregator a given metric will be sent to. That shared aggregation-rules.conf must then also be copied to relay A.
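Since the copies have to be kept in sync by hand, a small check can catch drift. The following is only a sketch under assumptions: `rules_digest` and `mismatched_copies` are hypothetical helpers, the files are assumed to have already been fetched to local paths, and blank lines and lines starting with `#` are assumed not to affect the rules.

```python
import hashlib
from pathlib import Path


def rules_digest(path):
    """Hash the rule lines of a file, skipping blank lines and # comments."""
    lines = []
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()


def mismatched_copies(reference, copies):
    """Return the copies whose rules differ from the reference file."""
    expected = rules_digest(reference)
    return [str(path) for path in copies if rules_digest(path) != expected]
```

An empty list from `mismatched_copies` means the relay's copy and every aggregator's copy contain the same rule lines in the same order.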
What this also means is that A must only write to a group of aggregators (i.e. no regular caches interspersed), because regular caches cannot aggregate metrics and, again, you cannot guarantee which host an aggregated metric will hash to.
According to the example carbon.conf, aggregated-consistent-hashing accomplishes the following:
My questions:
I'm assuming that each aggregator is responsible for different subsets of metrics.
(For example, say aggregator host A is responsible for metric X, then every relay host must understand that metrics matching some pattern must be sent only to A). I don't see how this is specified anywhere.