betfair / opentsp

Time Series Pipeline (TSP) is an open metric gathering and routing system.
Apache License 2.0

internal/relay: support filtering at relay level #42

Open sebastianco opened 8 years ago

sebastianco commented 8 years ago

The current implementation assumes that all relays want to receive the same stream of metric points. This holds when running in forwarder mode, but not always when running in aggregator mode. Supporting filtering configuration at the relay level would let one specify which points should be forwarded or blocked for each relay.

E.g. assume one runs an aggregator which receives 100K points/s and is configured to forward to 4 relays. If we know that one of the relays is not interested in receiving metric X, why not support filtering it out before sending the metric over the network? This saves bandwidth at the expense of additional CPU usage. The bandwidth saving is most obvious when one of the relays is interested only in metric X (which might represent a tiny fraction of the entire stream).
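To make the idea concrete, here is a rough sketch in Go of what per-relay filtering could look like. None of this is existing opentsp code: `Point`, `Filter`, `Relay`, `Route`, and the `Allow`/`Block` glob lists are hypothetical names used only for illustration, and the `Out` slice stands in for the relay's network connection.

```go
package main

import (
	"fmt"
	"path"
)

// Point is a hypothetical, simplified metric point.
type Point struct {
	Metric string
	Value  float64
}

// Filter holds hypothetical per-relay rules expressed as glob patterns
// on the metric name. Block takes precedence over Allow.
type Filter struct {
	Allow []string // if non-empty, only matching metrics pass
	Block []string // matching metrics are always dropped
}

// matchAny reports whether name matches at least one pattern.
// Malformed patterns are treated as non-matching.
func matchAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// Pass reports whether the point should be forwarded to the relay.
func (f Filter) Pass(p Point) bool {
	if matchAny(f.Block, p.Metric) {
		return false
	}
	return len(f.Allow) == 0 || matchAny(f.Allow, p.Metric)
}

// Relay is a hypothetical downstream target with its own filter.
type Relay struct {
	Name   string
	Filter Filter
	Out    []Point // stand-in for the network connection
}

// Route applies each relay's filter before "sending" the point,
// so that blocked points never reach the network.
func Route(relays []*Relay, p Point) {
	for _, r := range relays {
		if r.Filter.Pass(p) {
			r.Out = append(r.Out, p)
		}
	}
}

func main() {
	relays := []*Relay{
		{Name: "everything"},
		{Name: "no-x", Filter: Filter{Block: []string{"metric.x"}}},
		{Name: "only-x", Filter: Filter{Allow: []string{"metric.x"}}},
	}
	for _, m := range []string{"metric.x", "metric.y", "metric.z"} {
		Route(relays, Point{Metric: m, Value: 1})
	}
	for _, r := range relays {
		fmt.Printf("%s received %d points\n", r.Name, len(r.Out))
	}
}
```

The per-point cost is one pattern scan per relay, which is the CPU-for-bandwidth trade-off mentioned above. A real implementation would presumably compile the rules once at configuration load rather than matching globs on every point.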

masiulaniec commented 8 years ago

See also "subscriber-supplied aggregator-side filtering rules" in #23. I still think that the protocol extension described there is the right way to go.

sebastianco commented 8 years ago

I would not tie the two together. #23 proposes an alternative way of configuring this. We still need the functionality regardless of where an aggregator gets its relay configuration from. Depending entirely on the subscriber to push its filters and dedup preferences to an upstream publisher/aggregator limits our ability to use this functionality with relays which don't speak the new protocol version.