graphite-project / carbon

Carbon is one of the components of Graphite, and is responsible for receiving metrics over the network and writing them down to disk using a storage backend.
http://graphite.readthedocs.org/
Apache License 2.0
1.5k stars 490 forks source link

sanitize names when using them as tag value #858

Closed replay closed 4 years ago

replay commented 5 years ago

related: https://github.com/graphite-project/graphite-web/pull/2458

codecov-io commented 5 years ago

Codecov Report

Merging #858 into master will decrease coverage by <.01%. The diff coverage is 48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #858      +/-   ##
==========================================
- Coverage   50.15%   50.15%   -0.01%     
==========================================
  Files          37       37              
  Lines        3483     3505      +22     
  Branches      495      501       +6     
==========================================
+ Hits         1747     1758      +11     
- Misses       1625     1632       +7     
- Partials      111      115       +4
Impacted Files Coverage Δ
lib/carbon/util.py 49.82% <48%> (+0.01%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8af0b4f...3195cdd. Read the comment docs.

DanCech commented 5 years ago

The biggest concern here is the impact that this will have on anyone who is currently using series with a ~ character in the name, since those series will be changed. We could reduce that impact if we reduce the scope here to only stripping leading ~ signs, since technically that is the only place that it actually impacts the tagging system.

replay commented 5 years ago

Yeah, we could do that. Then i think I should update the docs accordingly, and update the validation function in raintank/schema as well to keep everything consistent

replay commented 5 years ago

Another question: What do we do if a metric name is exactly ~ and nothing else? We have a rule that says tag values must be >=1 char long in https://github.com/graphite-project/graphite-web/blob/master/docs/tags.rst#carbon We could just add it to the index with tag name="", but that would only be queriable via regex because the query name= has another meaning. In the case of Metrictank we could also consider that metric to have "invalid tags", and there's already a setting to define whether metrics with invalid tags should still get indexed via their other tags and the name or not, but in the case of carbon I'm not sure if there's a nice way to reject a metric name.

DanCech commented 5 years ago

The only real option available here would be to drop it as invalid afaik

replay commented 5 years ago

I updated the docs accordingly https://github.com/graphite-project/graphite-web/pull/2458/files

replay commented 5 years ago

Currently, if the resulting string is "", it would raise an Exception which gets caught here: https://github.com/graphite-project/carbon/blob/8e94e48f1d7605857b4e8ff3f5ac3646d55fd3f8/lib/carbon/cache.py#L42 So it would ignore the metric and log an error. I think that's probably what we want

replay commented 5 years ago

I've pushed a few more tests, so the test coverage report is happy

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.