prometheus / client_ruby

Prometheus instrumentation library for Ruby applications
Apache License 2.0

UTF-8: Implement support in Ruby client library #306

Open ywwg opened 8 months ago

ywwg commented 8 months ago

As in https://github.com/prometheus/client_golang/issues/1369 and https://github.com/prometheus/client_java/issues/916, the Ruby client library needs to be updated to support UTF-8.

Tasks:

For background and references see https://github.com/prometheus/prometheus/issues/13095

Sinjo commented 8 months ago

Hey, thanks for the heads up. I've got a rough idea in my head of how those changes will fit into the codebase.

I've given the proposal a read and one thing that stuck out to me is that grouping key labels in the pushgateway client's code aren't mentioned. Are they sticking to the old rules for now?

beorn7 commented 8 months ago

Very good point. We haven't thought about PGW yet. That needs more code changes, and some more ideas. sigh

beorn7 commented 8 months ago

For now, I would assume the PGW doesn't support the new UTF-8 names yet.

ywwg commented 8 months ago

We are making a note to create a design for this situation

Sinjo commented 8 months ago

"That needs more code changes, and some more ideas. sigh"

Happy to be of service 🙈

ywwg commented 2 months ago

@Sinjo are you still interested in working on this? We have done work on the pushgateway so that design is resolved.

Sinjo commented 2 months ago

Yeah, I'd like to make some time for it. It doesn't seem like it'll be too hard to bake in at least some experimental support for it.

The biggest potential snag when it comes to our in-process storage of metrics is going to be the way DirectFileStore serialises the metrics to disk so that multi-process web servers can have all their processes' metrics scraped. I'll need to make sure that code handles UTF-8 label names.
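To make the serialisation concern concrete, here is a minimal sketch of the round-trip property worth testing. It does not reproduce DirectFileStore's actual key format; `serialise_labels` and `deserialise_labels` are hypothetical helpers, and the use of form encoding from Ruby's standard `uri` library is an assumption chosen only because it yields an ASCII-only key.

```ruby
require 'uri'

# Hypothetical sketch, NOT DirectFileStore's real code: flatten a label set
# into an ASCII-only key and recover it again without losing UTF-8 names.
def serialise_labels(labels)
  # Sort so the same label set always produces the same key.
  pairs = labels.map { |name, value| [name.to_s, value.to_s] }.sort
  URI.encode_www_form(pairs)
end

def deserialise_labels(key)
  # decode_www_form returns [name, value] pairs tagged as UTF-8 strings.
  URI.decode_www_form(key).to_h
end

labels = { "r\u00E9gion" => "\u00EEle-de-france", 'code' => '200' }
key = serialise_labels(labels)
restored = deserialise_labels(key)
```

Whatever the real on-disk format turns out to be, the checks would be the same: the restored hash equals the original, and any length bookkeeping counts bytes (`String#bytesize`) rather than characters.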

One thing I haven't managed to piece together is how the content negotiation works. Right now we exclusively serve text/plain; version=0.0.4 format with no conditional behaviours.

Having had a quick glance at https://github.com/prometheus/common/pull/570, it seems that looking for an extra escaping=allow-utf-8 value in the Accept header is enough to enable the new path with unescaped UTF-8. It's also possible to pass underscores, dots, or values to that parameter to use one of those escaping schemes. What isn't clear to me is what to do if the registry contains labels with UTF-8 characters, but no escaping parameter is passed. Is there a default escaping scheme that's been chosen for compatibility with existing Prometheus servers?
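As a rough sketch of that negotiation step, the snippet below pulls the escaping parameter out of an Accept header. `requested_escaping` and `KNOWN_ESCAPING_SCHEMES` are hypothetical names, not part of client_ruby, and the parsing is deliberately naive (it splits on commas and semicolons and ignores q-values).

```ruby
# Hypothetical helper, not part of client_ruby: find the escaping scheme
# requested through the Accept header, if there is one.
KNOWN_ESCAPING_SCHEMES = %w[allow-utf-8 underscores dots values].freeze

def requested_escaping(accept_header)
  return nil if accept_header.nil?

  # Media ranges are comma-separated; parameters inside one are ';'-separated.
  accept_header.split(',').each do |media_range|
    media_range.split(';').drop(1).each do |param|
      key, value = param.strip.split('=', 2)
      next unless key && key.casecmp?('escaping') && value

      scheme = value.strip.delete('"').downcase
      return scheme if KNOWN_ESCAPING_SCHEMES.include?(scheme)
    end
  end

  nil # no recognised escaping parameter: the caller picks the fallback
end

requested_escaping('text/plain; version=0.0.4; escaping=allow-utf-8')
# => "allow-utf-8"
```

Returning nil when the parameter is absent keeps the fallback decision out of the parser, so it can live behind whatever configuration the library ends up exposing.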

My gut says to stick all of this behind a config flag that makes it clear that all of it is experimental and could change in any minor/patch version. Does that sound right to you?

ywwg commented 2 months ago

If no escaping parameter is passed, the metrics producer is expected to apply a default escaping scheme. In Go, that is determined by the NameEscapingScheme setting in common/model. That is Underscores by default but a user could change it with a configuration value or flag. (I think we haven't implemented this in Go, actually, so I'll file an issue for that :)). If the metrics producer doesn't escape names, then the metrics will simply be rejected by the metrics consumer as having invalid metric or label names.
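A minimal sketch of what an underscores fallback could look like in Ruby follows. `escape_with_underscores` and its `allow_colon` option are hypothetical, not an existing client_ruby API, and the rule used (ASCII letters and underscore anywhere, digits after the first position, colon in metric names) is an assumption based on the legacy Prometheus name rules rather than a port of the Go code.

```ruby
# Hypothetical sketch of "underscores" escaping; not client_ruby's API.
# Assumed legacy rule: ASCII letters and '_' are always legal, digits are
# legal except in first position, and ':' is legal in metric names only.
def escape_with_underscores(name, allow_colon: true)
  name.each_char.with_index.map do |char, index|
    legal = char.match?(/\A[a-zA-Z_]\z/) ||
            (allow_colon && char == ':') ||
            (index.positive? && char.match?(/\A[0-9]\z/))
    legal ? char : '_'
  end.join
end

escape_with_underscores('http.requests.total')
# => "http_requests_total"
```

This mapping is lossy: two distinct UTF-8 names can collapse into the same escaped name, so a producer would still need to decide how to handle such collisions.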

UTF-8 will be default-on in Prometheus 3 (and is on by default in the beta) and we should aim for that in all the client libraries. But yes it's acceptable to have this behind a flag for now.

Sinjo commented 2 months ago

Cool, that makes sense to me. Thanks!

ywwg commented 2 months ago

thank you for contributing! I know only enough ruby to be dangerous so I appreciate the help

ywwg commented 1 month ago

@Sinjo any progress?

Sinjo commented 1 month ago

Afraid not. My free time has been pretty heavily spoken for this last month, with a chunk of evenings/weekends going towards SREcon talk prep. Hoping to find some time to make progress soon.

ywwg commented 1 month ago

no worries, unassigning