rycus86 / prometheus_flask_exporter

Prometheus exporter for Flask applications
https://pypi.python.org/pypi/prometheus-flask-exporter
MIT License

PrometheusMetrics for werkzeug #121

Closed Velan987 closed 2 years ago

Velan987 commented 2 years ago

Do we support metrics for a Flask application running on a multi-process Werkzeug server? Is there an example of how to set up the metrics?

I am running the application like this: app.run(host='0.0.0.0', port=8080, processes=10, threaded=False)

rycus86 commented 2 years ago

We have some multiprocessing examples in https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples but none with processes=N, I think. Do you know how it is implemented? Maybe the multiprocessing extension already works for it?

Velan987 commented 2 years ago

Yes, I tried that in my Werkzeug application. The metrics endpoint is not exposed; I am getting a 404.

rycus86 commented 2 years ago

OK, it was working if you manually called register_endpoint on MultiprocessPrometheusMetrics, but I added a new MultiprocessInternalPrometheusMetrics as a shorthand to do that internally, and an example for it: https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples/flask-multi-processes

rycus86 commented 2 years ago

It should be available in 0.19.0 as soon as https://github.com/rycus86/prometheus_flask_exporter/actions/runs/1950370778 finishes.
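Since the new class is tied to a release, it may help to confirm which version is actually installed before debugging further. The sketch below uses only the standard library; the helper names are hypothetical, and the distribution name is assumed from the PyPI URL at the top of this page.

```python
import re
from importlib import metadata

# Release named in the comment above as the first one with the new class.
MINIMUM_VERSION = (0, 19, 0)


def parse_version(text):
    """Turn a string such as '0.19.0' into a tuple of integers."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def supports_internal_multiprocess(version_text):
    """True if the given version is at least 0.19.0."""
    return parse_version(version_text) >= MINIMUM_VERSION


def installed_exporter_version():
    """Return the installed exporter version, or None if it is not installed."""
    try:
        return metadata.version("prometheus-flask-exporter")
    except metadata.PackageNotFoundError:
        return None
```

An unpinned requirements.txt installs whatever release is current at build time, so a check like this is mostly useful on images that were built some time ago.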

Velan987 commented 2 years ago

I tried the above example locally. When I hit any API, separate Counter.db, gauge.db and histogram.db files are created inside the /tmp folder, but when I hit the metrics API I still get a 404. Can you please help me with this?

rycus86 commented 2 years ago

Do you have a minimal example to reproduce your issue? Also, have you checked the new example I added to this repo?

Velan987 commented 2 years ago

Yes, I referred to this example: https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples/flask-multi-processes. Below is the sample code.

import os

from prometheus_flask_exporter.multiprocess import MultiprocessInternalPrometheusMetrics
from flask import Flask, render_template, redirect, url_for, request, session, flash, jsonify, Response, make_response

os.environ["PROMETHEUS_MULTIPROC_DIR"] = '/tmp'

app = Flask(__name__)

metrics = MultiprocessInternalPrometheusMetrics(app, path="/api/metrics")

@app.route("/api/v1/ping", methods=["GET"])
def ping():
    """
    Simple ping api for health check
    :return:
    """
    ret = dict()
    ret['serviceName'] = 'test'
    ret['message'] = 'Healthy'
    return jsonify(ret)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, processes=10, threaded=False)

rycus86 commented 2 years ago

This example seems to work for me as expected.

% curl -sv localhost:8080/api/v1/ping
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /api/v1/ping HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: application/json
< Content-Length: 43
< Server: Werkzeug/2.0.3 Python/3.9.10
< Date: Wed, 09 Mar 2022 05:04:14 GMT
<
{"message":"Healthy","serviceName":"test"}

And the metrics endpoint:

% curl -sv localhost:8080/api/metrics
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /api/metrics HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Content-Type: text/plain; version=0.0.4; charset=utf-8
< Content-Length: 2145
< Server: Werkzeug/2.0.3 Python/3.9.10
< Date: Wed, 09 Mar 2022 05:04:29 GMT
<
# HELP flask_http_request_total Multiprocess metric
# TYPE flask_http_request_total counter
flask_http_request_total{method="GET",status="200"} 1.0
# HELP flask_exporter_info Multiprocess metric
# TYPE flask_exporter_info gauge
flask_exporter_info{version="0.19.0"} 1.0
# HELP flask_http_request_duration_seconds Multiprocess metric
# TYPE flask_http_request_duration_seconds histogram
flask_http_request_duration_seconds_sum{method="GET",path="/api/v1/ping",status="200"} 0.0010065219999972896
flask_http_request_duration_seconds_bucket{le="0.005",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.01",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.025",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.05",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.075",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.1",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.25",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.5",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.75",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="1.0",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="2.5",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="5.0",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="7.5",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="10.0",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="+Inf",method="GET",path="/api/v1/ping",status="200"} 1.0
flask_http_request_duration_seconds_count{method="GET",path="/api/v1/ping",status="200"} 1.0

What do you get when you try to access them?
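The two curl calls above can also be scripted, which makes it easier to compare results across machines or containers. This is a small standard-library sketch; the function name is hypothetical, and the host, port and paths are the ones used in this thread.

```python
import http.client


def endpoint_status(host, port, path, timeout=5.0):
    """Send a GET request and return the HTTP status code of the response."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()


def check_app(host="localhost", port=8080):
    """Report the status of the ping and metrics endpoints from this thread."""
    return {
        path: endpoint_status(host, port, path)
        for path in ("/api/v1/ping", "/api/metrics")
    }
```

Calling check_app() against a running instance returns a dictionary of path to status code, so a 200 for the ping route next to a 404 for the metrics route shows up at a glance.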

Velan987 commented 2 years ago

Below is the response I am getting

My requirements.txt:

flask
prometheus_client
prometheus_flask_exporter

rycus86 commented 2 years ago

Are you just running your app with python3 app.py or through some app server?

Velan987 commented 2 years ago

Using "python app.py" only

rycus86 commented 2 years ago

That's odd, seems to work on my machine. 😕 Can you try on a different port, or after a restart? (maybe something else is on that port?) Also, what OS are you on? I tried your example on OS X.

Velan987 commented 2 years ago

I am running this in Docker; the base image uses Debian GNU/Linux Release 10.

Velan987 commented 2 years ago

When I cat one of the histogram files inside the /tmp folder, I can see the metrics:

/tmp# cat histogram_248.db
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_sum", {"method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.005", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.01", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.025", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.05", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.075", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.1", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.25", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.5", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "0.75", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "1.0", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "2.5", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "5.0", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "7.5", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "10.0", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]
["flask_http_request_duration_seconds", "flask_http_request_duration_seconds_bucket", {"le": "+Inf", "method": "GET", "path": "/testreports/api/v1/ping", "status": "200"}]

rycus86 commented 2 years ago

OK, I could reproduce it. I had a previous config from testing that set the PROMETHEUS_MULTIPROC_DIR environment variable outside the app, not through the main app file like you have, and that seems to make it work. Try running your app like PROMETHEUS_MULTIPROC_DIR=/tmp python3 app.py. I'm not exactly sure why it makes a difference for the collector, though.

Velan987 commented 2 years ago

I removed the PROMETHEUS_MULTIPROC_DIR environment variable from the main app file and am running the app as you suggested: PROMETHEUS_MULTIPROC_DIR=/tmp python app.py

While running, I get the error below: ValueError: one of env PROMETHEUS_MULTIPROC_DIR or env prometheus_multiproc_dir must be set and be a directory
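The error message names two conditions: one of the two variables must be set, and its value must be a directory. A quick way to see which condition fails is a small pre-flight check run inside the same container. The sketch below is a standalone approximation based only on the wording of that message, not the library's own code, and the function name is hypothetical.

```python
import os


def multiproc_dir_problem(environ=None):
    """Return a short description of what is wrong, or None if all looks fine.

    `environ` defaults to the real process environment; a plain dict can be
    passed in to try out other values.
    """
    if environ is None:
        environ = os.environ
    # Both spellings appear in the error message, so both are checked here.
    path = environ.get("PROMETHEUS_MULTIPROC_DIR") or environ.get("prometheus_multiproc_dir")
    if not path:
        return "neither PROMETHEUS_MULTIPROC_DIR nor prometheus_multiproc_dir is set"
    if not os.path.isdir(path):
        return f"{path!r} is set but is not an existing directory"
    return None


if __name__ == "__main__":
    print(multiproc_dir_problem() or "multiprocess directory looks usable")
```

Running this with the same command prefix as the app (for example with PROMETHEUS_MULTIPROC_DIR=/tmp in front of the Python command) shows whether the variable reaches the interpreter at all.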

rycus86 commented 2 years ago

Hm, that's super odd. Can you try printing out the env variables in Python? Looks like somehow it doesn't see it? 😕

Velan987 commented 2 years ago

print(os.getenv("PROMETHEUS_MULTIPROC_DIR"))

Getting None
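When a variable goes missing like this, one way to narrow it down is to compare what a child Python process sees with and without the variable being passed explicitly. The following standard-library sketch does that; the function name is hypothetical, and it only demonstrates how environment variables are inherited, not where a particular shell or container setup drops them.

```python
import os
import subprocess
import sys


def value_seen_by_child(name, value):
    """Start a child Python process and return what os.getenv(name) prints there.

    If `value` is None, the variable is removed from the child's environment.
    """
    env = dict(os.environ)
    env.pop(name, None)
    if value is not None:
        env[name] = value
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.getenv({name!r}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(value_seen_by_child("PROMETHEUS_MULTIPROC_DIR", "/tmp"))
    print(value_seen_by_child("PROMETHEUS_MULTIPROC_DIR", None))
```

If the first call reports the value but the real launch command does not, the variable is probably being lost before Python starts, for example in a wrapper script or entrypoint.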

rycus86 commented 2 years ago

OK, so the error seems legit. Now you just need to figure out how your Python loses its environment variables. :)

Velan987 commented 2 years ago

I added the environment variable in the Dockerfile: ENV PROMETHEUS_MULTIPROC_DIR=/tmp

When I run print(os.getenv("PROMETHEUS_MULTIPROC_DIR")), I now get /tmp.

My application is up and the ping API is working, but I still get a 404 for the metrics API.

rycus86 commented 2 years ago

Any chance you could push the example repo to GitHub, including the Dockerfile and relevant setup?

Velan987 commented 2 years ago

Do I need to invoke metrics.start_http_server() in the main app file?

rycus86 commented 2 years ago

Do I need to invoke metrics.start_http_server() in the main app file?

No, you don't need it with the internal multiprocess extensions.

Velan987 commented 2 years ago

I am running multiple services on the same port, and nginx takes care of external traffic. For a single-process application, I get a response for /metrics. Only the multiprocess application returns a 404.

rycus86 commented 2 years ago

I think an example repo including your Docker and Nginx setup would help here, I don't see right now what could be the issue.

Velan987 commented 2 years ago

Found the root cause: in my config file, DEBUG was set to True. When I change it to False, /metrics works.

Now I am facing a different issue: only metrics with the "flask_" prefix are exposed, e.g. flask_http_request_duration_seconds_bucket. Common metrics such as process_resident_memory_bytes and process_cpu_seconds_total are not available, although I do get these metrics for a single-process service.

Can you please help me with this?

rycus86 commented 2 years ago

Found the root cause: in my config file, DEBUG was set to True. When I change it to False, /metrics works.

Good find! FYI the README has a bit of info on Debug mode: https://github.com/rycus86/prometheus_flask_exporter#debug-mode

I'll answer the other question on the new issue you opened.