prometheus / client_python

Prometheus instrumentation library for Python applications
Apache License 2.0

Default namespace/prefix for all exported metrics? #712

Open orfins opened 2 years ago

orfins commented 2 years ago

Is there a way to define a default, global prefix for all metrics?

For example instead of: python_gc_objects_collected_total

I'd like to add the prefix myproject_: myproject_python_gc_objects_collected_total

I'm currently doing it like this:

import prometheus_client
import prometheus_client.openmetrics.exposition

# Keep a reference to the original OpenMetrics encoder before patching.
_original_generate_latest = prometheus_client.openmetrics.exposition.generate_latest

def init_prometheus(namespace):
    namespace_prefix = namespace + '_'

    def new_generate_latest(*args, **kwargs):
        original_output = _original_generate_latest(*args, **kwargs).decode('utf-8')
        new_lines = []

        for line in original_output.splitlines():
            # Prefix sample lines only; lines starting with '#' are left untouched.
            if not line.startswith('#') and not line.startswith(namespace_prefix):
                line = namespace_prefix + line

            new_lines.append(line + '\n')

        return ''.join(new_lines).encode('utf-8')

    # Rebinds only the package-level name prometheus_client.generate_latest.
    prometheus_client.generate_latest = new_generate_latest

But it doesn't always work. For example, prometheus_client.start_http_server ignores this function for whatever reason.
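
One plausible reason, sketched here with stand-in modules rather than the real library: rebinding a name on a package does not change what functions in the defining submodule see when they look that name up in their own globals. The module names `exposition` and `package` and the function `serve_once` are hypothetical. Whether prometheus_client's HTTP handler resolves its encoder this way is an assumption.

```python
import types

# Stand-in for a submodule that defines both the encoder and the server code.
exposition = types.ModuleType("exposition")
exec(
    "def generate_latest():\n"
    "    return b'metric 1\\n'\n"
    "\n"
    "def serve_once():\n"
    "    # Looks up generate_latest in this module's globals at call time.\n"
    "    return generate_latest()\n",
    exposition.__dict__,
)

# Stand-in for the package, which re-exports the function under its own name.
package = types.ModuleType("package")
package.generate_latest = exposition.generate_latest

def patched():
    return b"myproject_metric 1\n"

# Patching the package attribute does not touch the submodule's global.
package.generate_latest = patched
via_package = package.generate_latest()
via_server = exposition.serve_once()

# Patching the name in the defining module is what the server function sees.
exposition.generate_latest = patched
via_server_after = exposition.serve_once()
```

In this sketch only the second patch changes what `serve_once` returns, which matches the symptom described above.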

csmarchbanks commented 2 years ago

I apologize, this issue got lost during some time off.

I do not believe this is supported in this library right now, though you could use metric relabeling within Prometheus to accomplish this. I would be open to a PR to add this behavior. The best way would probably be to create a new type of registry that wraps a registry and adds the prefix.
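
A minimal sketch of what such a wrapping registry could look like. It is not library API: `Sample`, `Family`, `StaticRegistry`, and `PrefixingRegistry` are hypothetical stand-ins so the example runs without the library. It assumes only that a registry exposes `collect()`, and that each yielded family has a `name` and a list of namedtuple `samples` with a `name` field.

```python
import copy
from collections import namedtuple

# Stand-ins for the library's sample and metric-family types (hypothetical,
# reduced to the fields this sketch touches).
Sample = namedtuple("Sample", ["name", "labels", "value"])

class Family:
    def __init__(self, name, samples):
        self.name = name
        self.samples = samples

class StaticRegistry:
    """Minimal registry stand-in: collect() yields metric families."""
    def __init__(self, families):
        self._families = families

    def collect(self):
        return iter(self._families)

class PrefixingRegistry:
    """Wraps any object with collect() and prefixes family and sample names."""
    def __init__(self, wrapped, prefix):
        self._wrapped = wrapped
        self._prefix = prefix + "_"

    def collect(self):
        for family in self._wrapped.collect():
            renamed = copy.copy(family)  # shallow copy; the original stays untouched
            renamed.name = self._prefix + family.name
            renamed.samples = [
                s._replace(name=self._prefix + s.name) for s in family.samples
            ]
            yield renamed

inner = StaticRegistry([
    Family("http_requests", [Sample("http_requests_total", {"code": "200"}, 3.0)]),
])
wrapped = PrefixingRegistry(inner, "myproject")
families = list(wrapped.collect())
```

Renaming at collection time leaves the wrapped registry untouched, so the same metrics could still be exposed without the prefix elsewhere. A real implementation would also need to forward registration calls and decide whether process-level metrics should be exempt.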

SuperQ commented 2 years ago

This is generally not a good idea and an anti-pattern in Prometheus. It pollutes the metric namespace. Metric names are designed to be the same in order to represent the same values across different codebases.

orfins commented 2 years ago

@SuperQ

This is generally not a good idea and an anti-pattern in Prometheus. It pollutes the metric namespace. Metric names are designed to be the same in order to represent the same values across different codebases.

Are you saying that I should favor labels over namespaces? If that's what you mean then I think my understanding of namespaces is lacking.

According to this page: https://prometheus.io/docs/practices/naming/

A metric name ... should have a (single-word) application prefix relevant to the domain the metric belongs to.

And what I'm trying to do is prefix each metric name according to the service it belongs to. For example: {service}_http_requests

Is that the wrong approach?

csmarchbanks commented 2 years ago

For process-level metrics such as python_gc_objects_collected_total, labels are preferred, as @SuperQ mentions. To my knowledge, {service}_http_requests-style metrics are very common, and that is where I think this feature could be helpful. For example, client_golang has WrapRegistererWithPrefix. Do you know if there is any agreement on whether we want that function in all client libraries?

SuperQ commented 2 years ago

@csmarchbanks It's common, but it's still an anti-pattern. The cost isn't noticeable for a lot of users, but at larger scales it matters a lot.

We should have this overall discussion on the mailing list or the next developers summit.

To give a couple of concrete examples: an organization I'm familiar with comes from a statsd background, where they used the first keyword to namespace their microservices. They carried this over into their global TSDB deployment and Grafana. They're now suffering badly because they've generated somewhere on the order of 75k to 100k metric names, which makes metric name autocompletion slow to the point of being unusable.

My organization is doing a similar transition. We have over 200 services. If every service namespaced all metrics, we'd probably have similar problems.

csmarchbanks commented 2 years ago

Thank you for the examples. It is an interesting discussion, and I have added it to the agenda for a future dev summit. I agree that we should advise against prefixing process_/python_ metrics (similar to the Go documentation), even if we do add a prefixing registry for subsystems.

SuperQ commented 2 years ago

@orfins Yes, the approach is to use labels to identify the service. This allows you to identify the same data across multiple services.

A metric name ... should have a (single-word) application prefix relevant to the domain the metric belongs to.

Let's use your example, {service}_http_requests. The domain here is actually http, since that's the "domain" of the metric. All services that serve HTTP requests would have http_requests_total: within the domain of serving HTTP, all requests are functionally identical, so the metric name should be identical too. You would attach your service as a label, like http_requests_total{service="my nice service"}. This has a number of useful properties. For example, multiple services of similar design can share the same dashboards and alerts. In tools like Grafana, you can define $service as a variable and let the user select which service(s) they want to display.
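
To make the difference concrete, here is a small sketch that counts distinct metric names under both schemes. The figure of 200 services comes from the thread; the service names and the three metric names are made up for illustration.

```python
def prefixed_series(services, metrics):
    """One metric name per (service, metric) pair, e.g. svc0_http_requests_total."""
    return [f"{service}_{metric}" for service in services for metric in metrics]

def labeled_series(services, metrics):
    """One shared metric name; the service becomes a label value."""
    return [f'{metric}{{service="{service}"}}' for service in services for metric in metrics]

def distinct_names(series):
    """The metric name is everything before the label braces."""
    return {s.split("{", 1)[0] for s in series}

services = [f"svc{i}" for i in range(200)]  # hypothetical service names
metrics = ["http_requests_total", "http_request_duration_seconds", "queue_depth"]

prefixed = prefixed_series(services, metrics)
labeled = labeled_series(services, metrics)

prefixed_name_count = len(distinct_names(prefixed))  # grows with the number of services
labeled_name_count = len(distinct_names(labeled))    # stays at the number of metrics
```

Both schemes produce the same number of time series. With labels, the set of metric names stays small, and a single dashboard or alert expression on http_requests_total covers every service.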

Tander commented 10 months ago

Okay, but still, is there a way to add labels for all exported metrics? Including default ones, like python_gc_objects_collected_total etc?

SuperQ commented 10 months ago

@Tander Yes, that is done in the Prometheus config / service discovery.