Closed hessu closed 4 years ago
I'm wondering if there are any use cases where this needs to be greedy and if this will break those. Can't think of any atm.
Me too. I agreed with @hessu - at least fixed behavior is more logical.
If the right-side length is fixed (\d{3} for example, or a constant string), the patch will not change behaviour. It'll only have an effect when the right-side is variable-length, and there is some ambiguity involved, at which point it'll let the right side match more characters.
The user has control in config over the right side/postfix, but no control over the regexp matching the aggregate identifier.
Thanks for the quick merge! Probably my smallest open source contribution ever, just 2 bytes added. Also one of the quicker merges.
Describe the bug When using aggregator to aggregate metrics from nodes with a name such as "mycluster-location1" to "mycluster-location350" using a regexp with "mycluster-\<location>\d+" the rule regexp breaks it out to summaries for "location", "location1", "location2" ... up to "location35" instead of "location".
To Reproduce Configure aggregation-rule of:
aggregate.\<location>.app.requests (60) = sum collectd.mycluster-\<location>\d+.app.requests
Feed data with collectd.mycluster-helsinki150.app.requests, observe aggregated data point come out for aggregate.helsinki15.app.requests.
Expected behavior Should not include the numbers matched by \d+ in the aggregate identiifer.
The issue appears to be in lib/carbon/aggregator/rules.py, regex_part setup, it should use the '?' non-greedy modifier suffix to make the match not greedy and let the user-supplied part on the right side be greedy. I have a tested patch, I will make a pull request shortly.