grobian / carbon-c-relay

Enhanced C implementation of Carbon relay, aggregator and rewriter
Apache License 2.0

Match rule regexp doesn't process last tag when tag support is activated. #398

Open lexx-bright opened 4 years ago

lexx-bright commented 4 years ago

When tag support is activated, match rule regexp processing only covers the input up to the last ";". So if a metric looks like metric.example;tag1=value1;tag2=value2;tag3=value3, you can't route based on tag3=value3, because only the substring "metric.example;tag1=value1;tag2=value2" is taken into account.

grobian commented 4 years ago

hmm, ok, I didn't know this was possible

grobian commented 4 years ago

I think you want something to match the tags, which currently doesn't exist. What you experienced is actually a bug, the routing for the metrics was messed up. Tags by the current design are forwarded, but not handled in any way. If you give a description of what you need, I might be able to implement something.

lexx-bright commented 4 years ago

I want to be able to route metrics based on tags. For example:

match "env=prod" send to ch_prod stop ;
match "env=test" send to ch_test stop ;

This works as long as the env tag is not the last tag.

lexx-bright commented 4 years ago

Could you share your concerns about tags being part of a metric name? As a workaround I've added ";=" to the list of allowed symbols and got the desired behavior. Could that potentially lead to problems?

grobian commented 4 years ago

The main concern is routing, which matters for consistent-hash based clusters. Metric a.b;foo=1 should be routed to the same node as a.b;foo=2, because routing should only take a.b into account. This is what the code does: it basically ignores the part after ;, but still forwards it to the destination. In the current code this goes wrong if there are multiple ';'s, because it picks the last one instead of the first. As a result, you can match all tags but the last. What I think you need is something like:

match * tag env=(dev|pre) send to dev_cluster stop;
match * tag env=prod customer=A send to prodA_cluster stop;
match * tag env=prod send to prod_cluster stop;

Note that I just made up that syntax. The idea is that you can match the tag names, and their values via regex, to control the routing. An alternative is something like:

match-full env=prod send to prod_cluster stop;

where it would consider the full input string up to the first space, as you've been doing now. The latter approach is likely easier to implement on this end.
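
To make the first-versus-last ';' distinction concrete, here is a minimal sketch in C. It is not carbon-c-relay's actual code: the helper names (name_len_first, name_len_last, full_len, match_prefix) are hypothetical, and the fixed-size buffer and POSIX extended regex flags are assumptions made only for illustration.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the metric name proper, i.e. everything
 * before the FIRST ';'. Without tags, the whole string is the name. */
static size_t name_len_first(const char *metric)
{
    const char *sep = strchr(metric, ';');
    return sep ? (size_t)(sep - metric) : strlen(metric);
}

/* Hypothetical helper: splits at the LAST ';' instead, which reproduces
 * the reported behaviour (every tag but the last stays matchable). */
static size_t name_len_last(const char *metric)
{
    const char *sep = strrchr(metric, ';');
    return sep ? (size_t)(sep - metric) : strlen(metric);
}

/* Hypothetical "match-full" span: everything up to the first space. */
static size_t full_len(const char *line)
{
    return strcspn(line, " ");
}

/* Match a POSIX extended regex against the first len bytes of metric.
 * Returns 1 on a match, 0 otherwise (including on any error or when the
 * input does not fit the illustrative fixed-size buffer). */
static int match_prefix(const char *pattern, const char *metric, size_t len)
{
    regex_t re;
    char buf[1024];
    int hit;

    if (len >= sizeof(buf))
        return 0;
    memcpy(buf, metric, len);
    buf[len] = '\0';

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    hit = (regexec(&re, buf, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}
```

With the metric from the original report, match_prefix("tag3=value3", m, name_len_last(m)) fails while the same pattern over full_len(m) succeeds. A consistent-hash key would presumably still be computed over the name_len_first(m) bytes only, so that a.b;foo=1 and a.b;foo=2 land on the same node.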