prometheus / client_python

Prometheus instrumentation library for Python applications
Apache License 2.0

Support multiprocess mode without setting an environment variable #886

Open vladimir-avinkin opened 1 year ago

vladimir-avinkin commented 1 year ago

This is essentially a duplicate of issue #701

The main problem for my use case is that it's impossible to separate multiple instances of collectors, even though the collector has a path field that misleads you into thinking you can: https://github.com/prometheus/client_python/blob/master/prometheus_client/multiprocess.py#L22
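To make the mismatch concrete, here is a toy model of how I read the current behavior: the collector (read side) accepts an explicit path, while the metric values (write side) only look at PROMETHEUS_MULTIPROC_DIR. The names write_dir and ToyCollector are made up for illustration and are not part of prometheus_client.

```python
import os

ENV_VAR = "PROMETHEUS_MULTIPROC_DIR"


def write_dir(environ):
    # Write side (toy model): values take their directory from the environment only.
    return environ.get(ENV_VAR)


class ToyCollector:
    # Read side (toy model): an explicit path wins; the environment is a fallback.
    def __init__(self, path=None, environ=os.environ):
        self.path = path if path is not None else environ.get(ENV_VAR)


# One process-wide environment, but a collector pointed at a different directory.
env = {ENV_VAR: "/tmp/prom_a"}
collector = ToyCollector(path="/tmp/prom_b", environ=env)

written_to = write_dir(env)
reads_from = collector.path
print(written_to, reads_from)
```

In this model the collector ends up reading a directory that nothing writes to, which is why the path argument alone does not let you separate instances.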

I am personally not sure how to fix that, apart from unifying the implementations and supporting only multiprocess mode. Running Python as multiple processes is arguably the more popular way of doing things: web servers and async processing pretty much all run on multiple processes, and that has to be the largest chunk of software that wants to be instrumented.

And while mmapped files have overhead, it's not very large, and Python is generally not used for blazingly fast things, so IMO it's a reasonable tradeoff for the simplicity of maintaining a single mode of operation.
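For a rough feel of what that overhead looks like, here is a minimal sketch of a counter backed by a per-process mmapped file, with collection done by summing the files. It is a simplification under my own assumptions: the file naming, the single 8-byte value layout, and the names MmapCounter and collect are hypothetical and do not match the library's actual on-disk format.

```python
import mmap
import os
import struct
import tempfile


class MmapCounter:
    """Toy counter stored as one 8-byte float in a per-process mmapped file."""

    def __init__(self, directory, name, pid=None):
        pid = os.getpid() if pid is None else pid
        self._path = os.path.join(directory, f"{name}_{pid}.bin")
        # Create the file with a zeroed value on first use.
        if not os.path.exists(self._path):
            with open(self._path, "wb") as f:
                f.write(struct.pack("<d", 0.0))
        self._file = open(self._path, "r+b")
        self._map = mmap.mmap(self._file.fileno(), 8)

    def inc(self, amount=1.0):
        # Each increment is a read and a write through the mapping.
        (current,) = struct.unpack_from("<d", self._map, 0)
        struct.pack_into("<d", self._map, 0, current + amount)

    def close(self):
        self._map.close()
        self._file.close()


def collect(directory, name):
    """Sum one metric's value across every per-process file."""
    total = 0.0
    for entry in os.listdir(directory):
        if entry.startswith(name + "_") and entry.endswith(".bin"):
            with open(os.path.join(directory, entry), "rb") as f:
                total += struct.unpack("<d", f.read(8))[0]
    return total


with tempfile.TemporaryDirectory() as d:
    a = MmapCounter(d, "requests", pid=1)  # pretend worker 1
    b = MmapCounter(d, "requests", pid=2)  # pretend worker 2
    a.inc()
    a.inc()
    b.inc(3)
    a.close()
    b.close()
    total = collect(d, "requests")
```

The extra cost compared to an in-memory value is the pack/unpack on every increment plus a directory scan at collection time.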

csmarchbanks commented 1 year ago

Similar to the other issue, I am open to ideas for supporting multiprocess mode without setting an environment variable. There are other drawbacks of multiprocess mode as well, such as not supporting custom collectors, which is another common usage of this library.

ATofighi commented 1 year ago

If I understand the code correctly, there are two environment variables that prometheus_client depends on: PROMETHEUS_DISABLE_CREATED_SERIES and PROMETHEUS_MULTIPROC_DIR.

Would it be a bad idea to simply allow passing use_created and value_class to the metric's constructor?

A snippet of what I have in mind:

from prometheus_client import multiprocess, values
from prometheus_client import generate_latest, CollectorRegistry, CONTENT_TYPE_LATEST, Counter

PROM_MULTIPROC_DIR1 = '/tmp/prom1/'

single_process_registry = CollectorRegistry()

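# Proposed API: use_created and value_class are passed explicitly instead of
# being derived from environment variables.
# Multiprocess counter that writes to an explicit directory.
MY_COUNTER1 = Counter(
    'my_counter',
    'Description of my counter',
    use_created=False,
    value_class=values.MultiProcessValue(path=PROM_MULTIPROC_DIR1),
)

# Single-process counter registered in its own registry.
MY_COUNTER2 = Counter(
    'my_counter',
    'Description of my counter',
    use_created=True,
    registry=single_process_registry,
)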

# Expose the metrics collected from the multiprocess directory.
def app(environ, start_response):
    registry = CollectorRegistry()
    multiprocess.MultiProcessCollector(registry, path=PROM_MULTIPROC_DIR1)
    data = generate_latest(registry)
    status = '200 OK'
    response_headers = [
        ('Content-type', CONTENT_TYPE_LATEST),
        ('Content-Length', str(len(data)))
    ]
    start_response(status, response_headers)
    return iter([data])

# Expose the metrics from the single-process registry.
def app2(environ, start_response):
    data = generate_latest(single_process_registry)
    status = '200 OK'
    response_headers = [
        ('Content-type', CONTENT_TYPE_LATEST),
        ('Content-Length', str(len(data)))
    ]
    start_response(status, response_headers)
    return iter([data])

# In the Gunicorn config file (child_exit is a Gunicorn server hook):
from prometheus_client import multiprocess

def child_exit(server, worker):
    multiprocess.mark_process_dead(worker.pid, path=PROM_MULTIPROC_DIR1)
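To check that the idea holds together independently of the library, here is a self-contained toy version of the same pattern: the counter receives its value class through the constructor, and a factory binds that class to one particular store. Everything here (ToyCounter, InMemoryValue, shared_value_factory, and plain dicts standing in for directories) is hypothetical and only illustrates the injection, not the real implementation.

```python
import threading


class InMemoryValue:
    """Stand-in for a single-process value guarded by a lock."""

    def __init__(self, name):
        self._value = 0.0
        self._lock = threading.Lock()

    def inc(self, amount):
        with self._lock:
            self._value += amount

    def get(self):
        with self._lock:
            return self._value


def shared_value_factory(store):
    """Return a value class bound to one store (a dict standing in for a directory)."""

    class SharedValue:
        def __init__(self, name):
            self._name = name
            store.setdefault(name, 0.0)

        def inc(self, amount):
            store[self._name] += amount

        def get(self):
            return store[self._name]

    return SharedValue


class ToyCounter:
    """Counter whose storage is chosen per instance, not via the environment."""

    def __init__(self, name, value_class=InMemoryValue):
        self._value = value_class(name)

    def inc(self, amount=1.0):
        self._value.inc(amount)

    def value(self):
        return self._value.get()


dir1, dir2 = {}, {}
c1 = ToyCounter("my_counter", value_class=shared_value_factory(dir1))
c2 = ToyCounter("my_counter", value_class=shared_value_factory(dir2))
c3 = ToyCounter("my_counter")  # default in-memory storage

c1.inc()
c2.inc(2)
c3.inc(5)
```

Three counters with the same name stay fully separate because each one carries its own storage, which is the separation the path argument suggests but an environment variable cannot provide.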