Closed Velan987 closed 2 years ago
We have some multiprocessing examples in https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples but none with processes=N, I think. Do you know how it is implemented? Maybe the multiprocessing extension already works for it?
Yes, I tried that in my Werkzeug application. The metrics endpoint is not exposed; I am getting a 404.
OK, it was working if you manually called register_endpoint on MultiprocessPrometheusMetrics, but I added a new MultiprocessInternalPrometheusMetrics as a shorthand to do that internally, and an example for it: https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples/flask-multi-processes
It should be available in 0.19.0 as soon as https://github.com/rycus86/prometheus_flask_exporter/actions/runs/1950370778 finishes.
I tried the above example locally. When I hit any API, separate Counter.db, gauge.db and histogram.db files are created inside the /tmp folder, but when I hit the metrics API I only get a 404. Can you please help me with this?
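For anyone debugging the same symptom, a quick way to confirm which worker processes are writing metric files is to group the files in the multiprocess directory by type. This is only a sketch: group_metric_files is a hypothetical helper, and it assumes the files are named <type>_<pid>.db, which matches the histogram_248.db file shown later in this thread.

```python
import os
from collections import defaultdict


def group_metric_files(directory):
    """Group '<type>_<pid>.db' files in a directory by metric type.

    The naming pattern is an assumption based on file names seen in this
    thread (for example histogram_248.db); other files are ignored.
    """
    groups = defaultdict(list)
    for name in os.listdir(directory):
        if not name.endswith(".db"):
            continue
        # Split 'histogram_248' into ('histogram', '_', '248').
        kind, _, pid = name[:-3].rpartition("_")
        if kind and pid.isdigit():
            groups[kind].append(int(pid))
    # Sort the process IDs so the result is stable across runs.
    return {kind: sorted(pids) for kind, pids in groups.items()}


if __name__ == "__main__":
    # Falls back to /tmp, the directory used in the example above.
    print(group_metric_files(os.environ.get("PROMETHEUS_MULTIPROC_DIR", "/tmp")))
```

If files show up here but the metrics endpoint still returns 404, the metrics are being recorded and the problem is with how the endpoint is registered or routed.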
Do you have a minimal example to reproduce your issue? Also, have you checked the new example I added to this repo?
Yes, I referred to this example: https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples/flask-multi-processes. Below is the sample code.
import os

from prometheus_flask_exporter.multiprocess import MultiprocessInternalPrometheusMetrics
from flask import Flask, render_template, redirect, url_for, request, session, flash, jsonify, Response, make_response

os.environ["PROMETHEUS_MULTIPROC_DIR"] = '/tmp'

app = Flask(__name__)
metrics = MultiprocessInternalPrometheusMetrics(app, path="/api/metrics")


@app.route("/api/v1/ping", methods=["GET"])
def ping():
    """
    Simple ping api for health check
    :return:
    """
    ret = dict()
    ret['serviceName'] = 'test'
    ret['message'] = 'Healthy'
    return jsonify(ret)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, processes=10, threaded=False)
This example seems to work for me as expected.
% curl -sv localhost:8080/api/v1/ping
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /api/v1/ping HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: application/json
< Content-Length: 43
< Server: Werkzeug/2.0.3 Python/3.9.10
< Date: Wed, 09 Mar 2022 05:04:14 GMT
<
{"message":"Healthy","serviceName":"test"}
And the metrics endpoint:
% curl -sv localhost:8080/api/metrics
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /api/metrics HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/plain; version=0.0.4; charset=utf-8
< Content-Length: 2145
< Server: Werkzeug/2.0.3 Python/3.9.10
< Date: Wed, 09 Mar 2022 05:04:29 GMT
<
# HELP flask_http_request_total Multiprocess metric
# TYPE flask_http_request_total counter
flask_http_request_total{method="GET",status="200"} 1.0
# HELP flask_exporter_info Multiprocess metric
# TYPE flask_exporter_info gauge
flask_exporter_info{version="0.19.0"} 1.0
# HELP flask_http_request_duration_seconds Multiprocess metric
# TYPE flask_http_request_duration_seconds histogram
flask_http_request_duration_seconds_sum{method="GET",path="/api/v1/ping",status="200"} 0.0010065219999972896
flask_http_request_duration_seconds_bucket{le="0.005",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.01",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.025",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.05",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.075",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.1",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.25",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.5",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.75",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="1.0",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="2.5",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="5.0",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="7.5",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="10.0",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="+Inf",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_count{method="GET",path="/api/v1/ping",status="200"} 1.0
What do you get when you try to access them?
Below is the response I am getting:
GET /api/metrics HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.64.0
Accept: */*
The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.
My requirements.txt:
flask
prometheus_client
prometheus_flask_exporter
Are you just running your app with python3 app.py or through some app server?
Using "python app.py" only
That's odd, seems to work on my machine. 😕 Can you try on a different port, or after a restart? (maybe something else is on that port?) Also, what OS are you on? I tried your example on OS X.
I am running this in Docker; the base image is Debian GNU/Linux release 10.
When I cat one of the histogram files inside the /tmp folder I can see the metrics:
/tmp# cat histogram_248.db
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_sum", {"method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.005", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.01", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.025", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.05", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.075", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.1", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.25", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.5", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.75", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "1.0", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "2.5", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "5.0", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "7.5", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "10.0", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "+Inf", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
OK, I could reproduce it. I had a previous config from testing that set the PROMETHEUS_MULTIPROC_DIR environment variable outside the app, not through the main app file like you have, and that seems to make it work.
Try running your app like PROMETHEUS_MULTIPROC_DIR=/tmp python3 app.py - I'm not exactly sure why it makes a difference for the collector though.
I removed the PROMETHEUS_MULTIPROC_DIR environment variable from the main app file and ran the app like you mentioned: PROMETHEUS_MULTIPROC_DIR=/tmp python app.py
While running, I get the error below:
ValueError: one of env PROMETHEUS_MULTIPROC_DIR or env prometheus_multiproc_dir must be set and be a directory
Hm, that's super odd. Can you try printing out the env variables in Python? Looks like somehow it doesn't see it? 😕
print(os.getenv("PROMETHEUS_MULTIPROC_DIR"))
Getting None
OK, so the error seems legit, now you just need to figure out how your Python loses environment variables. :)
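A small startup check can surface this kind of problem before the app starts serving. The sketch below mirrors the condition described by the ValueError above (the variable must be set and must point to a directory); require_multiproc_dir is a hypothetical helper name, not part of prometheus_client or prometheus_flask_exporter.

```python
import os


def require_multiproc_dir(environ=os.environ):
    """Return the multiprocess directory, or raise if it is missing or unusable.

    Checks both spellings of the variable named in the error message above.
    """
    path = environ.get("PROMETHEUS_MULTIPROC_DIR") or environ.get("prometheus_multiproc_dir")
    if not path or not os.path.isdir(path):
        raise ValueError(
            "PROMETHEUS_MULTIPROC_DIR must be set and point to an existing "
            f"directory, got {path!r}"
        )
    return path


if __name__ == "__main__":
    # Run as: PROMETHEUS_MULTIPROC_DIR=/tmp python3 check_env.py
    print("multiprocess dir:", require_multiproc_dir())
```

Calling this at the very top of the main app file makes it obvious whether the process actually sees the variable, which is the question the print(os.getenv(...)) check above was answering.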
I added the environment variable in the Dockerfile: ENV PROMETHEUS_MULTIPROC_DIR=/tmp
When I run print(os.getenv("PROMETHEUS_MULTIPROC_DIR")) I now get /tmp.
My application is up and the ping API is working, but I still get a 404 for the metrics API.
Any chance you could push the example repo to GitHub, including the Dockerfile and relevant setup?
Do I need to invoke metrics.start_http_server() in the main app file?
No, you don't need it with the internal multiprocess extensions.
I am running multiple services on the same port, and nginx takes care of external traffic. For a single-process application I get a response for /metrics. Only the multiprocess application returns 404.
I think an example repo including your Docker and Nginx setup would help here, I don't see right now what could be the issue.
Found the root cause: in my config file DEBUG was set to True. When I change it to False, /metrics works.
Now I am facing a different issue: only metrics with the "flask_" prefix are exposed, e.g. flask_http_request_duration_seconds_bucket. Common metrics such as process_resident_memory_bytes and process_cpu_seconds_total are not available, although I do get them for a single-process service.
Can you please help me with this?
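One way to check which metric families a scrape actually contains is to read the "# TYPE" lines from the /metrics response, like the ones in the curl output earlier in this thread. The sketch below does that on plain text; metric_families and missing_prefixes are hypothetical helper names, and the prefixes are just the two mentioned above.

```python
def metric_families(exposition_text):
    """Return {metric_name: metric_type} parsed from '# TYPE' lines."""
    families = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        # A type line looks like: # TYPE flask_http_request_total counter
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            families[parts[2]] = parts[3]
    return families


def missing_prefixes(exposition_text, prefixes=("flask_", "process_")):
    """Return the prefixes for which no metric family is present."""
    names = metric_families(exposition_text)
    return [p for p in prefixes if not any(n.startswith(p) for n in names)]


if __name__ == "__main__":
    sample = (
        "# HELP flask_http_request_total Multiprocess metric\n"
        "# TYPE flask_http_request_total counter\n"
        'flask_http_request_total{method="GET",status="200"} 1.0\n'
    )
    print(missing_prefixes(sample))
```

Running it against the saved output of both the single-process and the multiprocess service shows exactly which families differ between the two.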
Good find! FYI the README has a bit of info on Debug mode: https://github.com/rycus86/prometheus_flask_exporter#debug-mode
I'll answer the other question on the new issue you opened.
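For reference, here is a rough sketch of the debug-mode condition as I understand it. It assumes that the exporter skips registering the metrics endpoint when the app runs under Werkzeug's reloader (signalled by WERKZEUG_RUN_MAIN=true) unless the DEBUG_METRICS environment variable is set. Please check the README's Debug mode section for the authoritative behaviour; metrics_endpoint_expected is a hypothetical helper, not a library function.

```python
import os


def metrics_endpoint_expected(environ=os.environ):
    """Guess whether the metrics endpoint should be registered.

    Assumption (not verified against the library source here): the endpoint
    is skipped under the Werkzeug reloader unless DEBUG_METRICS is set.
    """
    running_from_reloader = environ.get("WERKZEUG_RUN_MAIN") == "true"
    debug_metrics = bool(environ.get("DEBUG_METRICS"))
    return debug_metrics or not running_from_reloader


if __name__ == "__main__":
    print("metrics endpoint expected:", metrics_endpoint_expected())
```

Under that assumption, a 404 on the metrics path with DEBUG set to True, as in this thread, would be the expected outcome rather than a routing problem.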
Do we support metrics for a Flask application running on a multiprocess Werkzeug server? Is there an example of how to set up the metrics?
I am running the application like below: app.run(host='0.0.0.0', port=8080, processes=10, threaded=False)