scornflake opened this issue 5 years ago
I started using django-prometheus with an application running under uwsgi with multiple processes, and I'm seeing odd behavior. At least some of the counters (for example django_http_responses_before_middlewares_total) jump up and down on successive reloads, which creates strange-looking series in Prometheus.
In my experience, this is a strong indicator that each request for metrics gets handled by a different worker, and that each worker sees (and reports) slightly different data :(
Is anybody else seeing this behavior? Is there a fix/workaround for this?
I found the documentation for dealing with uwsgi just a few minutes ago, so I guess it would solve my problem. Sorry for the previous post.
can you share the document link? i have the same problem.
https://pypi.org/project/prometheus-python/ look for "Monitor in multiprocess mode (uWSGI, Gunicorn)".
thank you.
@beda42, that is a link to a separate package(prometheus-python). Were you able to solve the multiprocess issue using this package(django-prometheus)?
I think the document I linked to may have changed slightly; there used to be a generic solution for anything based on prometheus_client.
Anyway, the solution is based on setting the environment variable prometheus_multiproc_dir. I use uwsgi, so this is what I placed into the uwsgi.ini file:
env = prometheus_multiproc_dir=/tmp/django_prometheus/
exec-pre-app = rm -rf /tmp/django_prometheus/ && mkdir /tmp/django_prometheus/
More info is available in the prometheus client documentation: https://github.com/prometheus/client_python/blob/master/README.md#multiprocess-mode-gunicorn
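For completeness, here is a minimal sketch of what the multiprocess setup looks like in code, based on my reading of the prometheus_client README. The metric name is made up for illustration, and I'm assuming the client still picks up the prometheus_multiproc_dir variable at import time:

```python
import os
import tempfile

# The multiprocess dir must be set before prometheus_client is imported,
# because the client chooses its value backend at import time.
os.environ["prometheus_multiproc_dir"] = tempfile.mkdtemp()

from prometheus_client import CollectorRegistry, Counter, generate_latest
from prometheus_client import multiprocess

# Each worker process increments its own file-backed counter...
requests = Counter("demo_requests_total", "Illustrative request counter")
requests.inc(3)

# ...and at scrape time a fresh registry aggregates all workers' files,
# so every worker reports the same totals regardless of which one
# happens to serve the /metrics request.
registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)
print(generate_latest(registry).decode())
```

In a real deployment the scrape-time part would live in whatever view serves /metrics, with the env var set from uwsgi.ini as shown above.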
I also just ran into this issue ^^. Thanks for the link @beda42.
Later, I also found it mentioned in the documentation of the package.
Thus, I suggest that this issue can be closed.
I'm running a Django app using multi-process mod_wsgi (say, 4 processes, 4 threads). I have it set up using filepaths, and it's outputting .db files just fine.
For counters, however, I feel I'm misunderstanding something.
From reading the code, it seems like all .db files are read. For counters, the samples from every file are added into a single metric. Given that counter .db files are not deleted by mark_process_dead(pid, path=None), doesn't that mean I'll end up with stale data in the metric?
From what I can tell, when the files are read, each counter sample is merged via metric.add_sample(name, labels_key, value). No timestamp is passed in, so how do I know that when the results are accumulated I'll get back the latest value for the metric?
Also, is there a reason the counter .db files are not removed when the worker terminates? The only reason I can think of is that you probably want the data to hang around (the process could be very short-lived, perhaps only seconds or less), which is fine. But then how does it ever expire?
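To make the aggregation question concrete: as I read it, the collector simply sums counter samples across all per-pid files, including files left behind by dead workers. A rough pure-Python sketch of that merge logic (the function and data shapes here are mine, not the client's actual internals):

```python
from collections import defaultdict

def merge_counter_files(worker_samples):
    """Sum counter samples across per-worker .db files.

    worker_samples: list of dicts, one per worker file,
    each mapping (metric_name, labels) -> counter value.
    """
    totals = defaultdict(float)
    for samples in worker_samples:
        for key, value in samples.items():
            totals[key] += value  # dead workers' leftover files still add in
    return dict(totals)

# Three workers, one of them already dead but with its file left behind:
live_1 = {("requests_total", ()): 10.0}
live_2 = {("requests_total", ()): 7.0}
dead = {("requests_total", ()): 3.0}
print(merge_counter_files([live_1, live_2, dead]))
# -> {('requests_total', ()): 20.0}
```

If that reading is right, keeping dead workers' counter files is arguably the correct behavior rather than staleness: a counter is monotonic, so a dead worker's contribution should stay in the total, and deleting its file would make the aggregate counter go backwards. The rm -rf in exec-pre-app above would then be what resets the directory on each full restart.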