PromPHP / prometheus_client_php

Prometheus instrumentation library for PHP applications
https://prometheus.io/docs/concepts/metric_types/
Apache License 2.0
423 stars 93 forks source link

Validare names for apcu #59

Open satmaelstorm opened 3 years ago

satmaelstorm commented 3 years ago

With APC adapter meta-information is broken, if use colons in names of metrics. It gives hard to debug error. I am add validation of names when use APC adapter

LKaemmerling commented 3 years ago

Hey @satmaelstorm,

thank you for your MR. Could you please add tests to check the code?

thedeacon commented 2 years ago

@LKaemmerling I'm unsure if this is the best place to open a conversation around this, but I'll start here and we can move it if needed. I agree that putting colons in some names can break things in the APC-based engines. I modified the blackbox tests (a quick hack/proof - the code is ugly) and was able to trigger runtime json_decode failures when attempting to read back a metric which was stored with a colon in its name.

I'm personally of the opinion that the storage layer should prevent users from corrupting the keyspace; either by rejecting invalid names, or by encoding them to a valid string before storing, and decoding on retrieval. Either approach would work, although the latter would result in a change to the storage format (possibly -- there might be a way to implement it that's backwards-compatible).

It also feels weird to have the Collector (an Abstract class) referring to a specific implementation (e.g. use Prometheus\Storage\APC), but I'm not sure how else the regex-selection logic could work, if it must be done in the Collector and not the storage layer. It seems an Abstract class should not refer to non-Abstract classes, but that's a soft opinion. We could have the engine itself either encode the string to a "safe" value and store that (and decode on retrieval), or apply a regex filter and throw a StorageException if the passed-in string is invalid -- but that requires changing all the update*() calls in the storage engine to throw exceptions, and duplicating the regex logic from the Collector class into the storage classes, which also feels ugly.

Do you have a preference on whether regex filtering logic should live in the engine versus in the Collector? Secondly, do you consider it a problem (or not) to use an Implementing class from inside an Abstract class (as an example, line 9 of this PR)?

Finally, I'm guessing that a non-breaking change ("filter and reject bad names") is preferred over a solution which would be non-backwards-compatible, such as encoding strings before storing them. Correct?