rycus86 / prometheus_flask_exporter

Prometheus exporter for Flask applications
https://pypi.python.org/pypi/prometheus-flask-exporter
MIT License
646 stars 161 forks source link

Warning: Silently ignoring app.run() because the application is run from the flask command line executable #135

Open raxod502 opened 2 years ago

raxod502 commented 2 years ago

An awkward situation. Code:

metrics = prometheus_flask_exporter.PrometheusMetrics(app)
metrics.start_http_server(8081, host=config.METRICS_HOST)

Result:

website_1   |  * Serving Flask app 'src.app:app' (lazy loading)
website_1   |  * Environment: development
website_1   |  * Debug mode: on
website_1   | /usr/local/lib/python3.8/dist-packages/prometheus_flask_exporter/__init__.py:322: Warning: Silently ignoring app.run() because the application is run from the flask command line executable. Consider putting app.run() behind an if __name__ == "__main__" guard to silence this warning.

And the metrics server does not listen on port 8081. It seems that Flask notices that I have invoked the main application using flask run, and has elected to disable the ability to run new Flask applications, which obviously breaks start_http_server from prometheus_flask_exporter.

Not sure what the best way to handle this is, but wanted to flag.

rycus86 commented 2 years ago

Hey, thanks for flagging this! Do you have a more complete (but minimal) example I could turn into an integration test to ensure this works as expected?

raxod502 commented 2 years ago

Sure, I will put something together and get back to you soon.

raxod502 commented 2 years ago

Here's a working example: https://paste.sr.ht/~raxod502/e7527b5868cdee04160f457870299ff33ce0c1ed

% poetry run flask run             
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000 (Press CTRL+C to quit)
/home/raxod502/.cache/pypoetry/virtualenvs/prometheus-flask-exporter-issue-135-_O3CvxEA-py3.10/lib/python3.10/site-packages/prometheus_flask_exporter/__init__.py:322: Warning: Silently ignoring app.run() because the application is run from the flask command line executable. Consider putting app.run() behind an if __name__ == "__main__" guard to silence this warning.
  app.run(host=host, port=port)
raxod502 commented 2 years ago

Now that I've had a moment to review the source, I can confirm that simply adding this code before invoking metrics.start_http_server works around the problem, and allows the metrics server to be started without issue:

os.environ.pop("FLASK_RUN_FROM_CLI")
raxod502 commented 2 years ago

However, when running in development mode (with the live reloader) we encounter an additional problem, because the metrics Flask app tries to start its own live reloader from inside the first live reloader, which does not work since the live reloader establishes signal handlers and therefore can only run from the main thread:

% FLASK_ENV=development poetry run flask run
 * Environment: development
 * Debug mode: on
 * Serving Flask app 'prometheus-flask-exporter-8081' (lazy loading)
 * Environment: development
 * Debug mode: on
 * Running on http://127.0.0.1:8081 (Press CTRL+C to quit)
 * Restarting with stat
Exception in thread Thread-1 (run_app):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/home/raxod502/.cache/pypoetry/virtualenvs/prometheus-flask-exporter-issue-135-_O3CvxEA-py3.10/lib/python3.10/site-packages/prometheus_flask_exporter/__init__.py", line 322, in run_app
    app.run(host=host, port=port)
  File "/home/raxod502/.cache/pypoetry/virtualenvs/prometheus-flask-exporter-issue-135-_O3CvxEA-py3.10/lib/python3.10/site-packages/flask/app.py", line 920, in run
    run_simple(t.cast(str, host), port, self, **options)
  File "/home/raxod502/.cache/pypoetry/virtualenvs/prometheus-flask-exporter-issue-135-_O3CvxEA-py3.10/lib/python3.10/site-packages/werkzeug/serving.py", line 1083, in run_simple
    run_with_reloader(
  File "/home/raxod502/.cache/pypoetry/virtualenvs/prometheus-flask-exporter-issue-135-_O3CvxEA-py3.10/lib/python3.10/site-packages/werkzeug/_reloader.py", line 427, in run_with_reloader
    signal.signal(signal.SIGTERM, lambda *args: sys.exit(0))
  File "/usr/lib/python3.10/signal.py", line 56, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread of the main interpreter
 * Debugger is active!
 * Debugger PIN: 788-973-912
raxod502 commented 2 years ago

It appears to be possible to work around this issue by unsetting FLASK_ENV from Python as well, before starting the metrics server. However, this does have the disadvantage of producing relatively confusing log output:

% FLASK_ENV=development poetry run flask run
 * Environment: development
 * Debug mode: on
 * Serving Flask app 'prometheus-flask-exporter-8081' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:8081 (Press CTRL+C to quit)
 * Running on http://127.0.0.1:5000 (Press CTRL+C to quit)
 * Restarting with stat

Working code:

import os

import flask
import prometheus_flask_exporter

app = flask.Flask(__name__)

os.environ.pop("FLASK_RUN_FROM_CLI")
os.environ.pop("FLASK_ENV")

metrics = prometheus_flask_exporter.PrometheusMetrics(app)
metrics.start_http_server(8081, host="127.0.0.1")
rycus86 commented 2 years ago

Thanks a lot for the investigation @raxod502 ! 🙇 Do you have a proposed way forward based on that? I expect if we unset those environment variables then we'd interfere with other parts of Flask that would handle those?

raxod502 commented 2 years ago

Yeah... and environment variables are shared globally for the whole process, so messing with them is really nasty. Let's file an issue upstream: https://github.com/pallets/flask/issues/4616

If a keyword argument is added to app.run() as I suggest in that issue report, then we can set that keyword argument from prometheus_flask_exporter, and if we additionally set debug=False in prometheus_flask_exporter, that should fully resolve this problem.

davidism commented 2 years ago

This metrics server does not appear to be part of the application under development. It is its own, separate application. Therefore, it should be run using a production server, even locally. Production servers include Gunicorn, Waitress, uWSGI, mod_wsgi, Eventlet, and Gevent.

raxod502 commented 2 years ago

OK, sounds like the way to resolve this issue would be to modify prometheus_flask_exporter to use Gunicorn or an equivalent framework by default, since the official upstream recommendation is that prometheus_flask_exporter is not an appropriate use case for the Flask development server. Does that seem like an appropriate course of action to you, @rycus86?

rycus86 commented 2 years ago

I'll have a look at this more deeply when I get a chance. I wouldn't necessarily want Gunicorn or other dependencies to be mandatory, so I'll need to dig deeper into what am I doing wrong in the code that surfaces this problem (sounds like maybe I use the wrong app somewhere perhaps).

rycus86 commented 2 years ago

OK, so I had a chance now to look at the code. So flask run passes an environment variable to the app so that we don't actually start an app from within, because the cli machinery is going to set that development server up for us.

If this is intended for development mode only, I think it should be OK to just have the metrics endpoint exposed on the main Flask app, which happens by default.

import flask
import prometheus_flask_exporter

app = flask.Flask(__name__)
metrics = prometheus_flask_exporter.PrometheusMetrics(app)

@app.route('/test')
def index():
    return 'Hello world'

If you flask run this, then you'll get the app endpoints and the metrics endpoint exposed on the same app that belongs to the (single) development server, e.g.:

Does this work for you @raxod502 ? I think the best course of action from here is document this fact and/or make this library raise an exception when we're in flask run mode and someone calls the start_http_server that we (now) know won't work.

raxod502 commented 2 years ago

That works for development mode, but what about in production? It sounds like the Flask folks are saying it is not appropriate to use the Flask development server in production, yet it seems like prometheus_flask_exporter does not have an option to use anything other than the Flask development server.

What if we add an API to prometheus_flask_exporter that simply returns the Flask app for exposing metrics, so that the user may choose how they wish to serve that application (e.g. by adding a dependency on Gunicorn or a similar WGSI package)?

rycus86 commented 2 years ago

We do have examples on how to integrate with "production ready" servers, for example:

One thing to note: for the non-internal multiprocessing enabled Prometheus exporter wrappers, we use the prometheus_client.start_http_server function from the Prometheus client library, see https://github.com/rycus86/prometheus_flask_exporter/blob/master/prometheus_flask_exporter/multiprocess.py#L78

What if we add an API to prometheus_flask_exporter that simply returns the Flask app for exposing metrics, so that the user may choose how they wish to serve that application (e.g. by adding a dependency on Gunicorn or a similar WGSI package)?

We could perhaps expose the contents of https://github.com/rycus86/prometheus_flask_exporter/blob/master/prometheus_flask_exporter/__init__.py#L272 as a function that you can then call from an endpoint you supply in whatever way is appropriate - if that's desirable, we can look into that.

raxod502 commented 2 years ago

for the non-internal multiprocessing enabled Prometheus exporter wrappers, we use the prometheus_client.start_http_server function from the Prometheus client library

Ah, got it. That is indeed what I was looking for. I didn't read this part of the documentation at first because it was labeled for multiprocess-aware handling, which wasn't something I needed in my application. Should we migrate to using this function in all cases where the user is placing the metrics endpoint on a different port, so that we eliminate all use of the Flask development server except insofar as the user is running it themselves? I think that would resolve the issues that upstream has raised.

davidism commented 2 years ago

I just looked quickly at prometheus_client.start_http_server, and you should not use that either. It should be reported as an issue to prometheus_client, unless they've already indicated that that is for development only. It uses wsgiref.simple_server, which is based on http.server just like Werkzeug's development server is, so it is not secure and should not be used in production.

rycus86 commented 2 years ago

They already have different (more robust?) ways to expose metrics over HTTP, see https://github.com/prometheus/client_python#http

so it is not secure and should not be used in production

Maybe this is worth calling out in this project's README, and suggest some alternatives - though on that note I think if you run the metrics endpoint on an insecure server that is not available for external users, it may be something that is an acceptable tradeoff for some. I liked @raxod502 's suggestion to package up the metrics exposure code as a function so that people can bring their own (more secure) servers to serve it with, and perhaps here we can explain in the README that it's more recommended than the other alternative.

raxod502 commented 2 years ago

Yeah - although the development server is not "production ready", it may be less bad to use it for an internal metrics endpoint that will be seeing <0.1 request per second, and is not accessible to malicious outside traffic. But, if it's not production ready, we should still do our best to follow best practices and use something that is recommended instead.

rycus86 commented 2 years ago

OK, so I added PrometheusMetrics.generate_metrics() in https://github.com/rycus86/prometheus_flask_exporter/commit/3d29e0431975b010df04c789571500662ebb0fd8 but as I was about to write an integration test, I realized I'm not really sure how that's supposed to work. You'd only want to do this yourself if you want to expose the metrics endpoint on a different server or different HTTP port, and not sure what has support for this in the same Python process (couldn't find Gunicorn/uwsgi supporting this, maybe haven't looked too closely). If it's not the same Python process, would that even work? I suppose with multiprocessing writing data into PROMETHEUS_MULTIPROC_DIR directory, then the second webserver could maybe read stuff from there, but I'm not a 100% sure that works as expected. If we do think that's how it's supposed to work, I can look into adding an integration test somehow to replicate that, but I'm not absolutely convinced.

raxod502 commented 2 years ago

Yeah, I would assume that you would want to expose the second webserver from within the same process. Looking into the options in https://github.com/rycus86/prometheus_flask_exporter/issues/135#issuecomment-1143929020, it seems like Waitress, Eventlet, and Gevent are acceptable options.

Since the Flask project doesn't want to support usage of the Flask development server even for internal facing metrics, I don't think we have any option other than to use a pure-Python WSGI server implementation to produce metrics from in-process. Then of course in production we must also use the existing techniques in prometheus_flask_exporter to ensure metrics are shared between all worker processes.

Seems to me like it's either that or abandon the idea of serving metrics on a separate port.

reubano commented 2 years ago

You should alternatively be able to set app.run's use_reloader option to False like so, app.run(debug=True, use_reloader=False).

raxod502 commented 2 years ago

Well, sure, but only if we feel comfortable ignoring guidance from upstream that the Flask dev server (which is what gets run when you use app.run) should not be used in this situation - right?

reubano commented 2 years ago

My suggestion was an alternative to os.environ.pop("FLASK_ENV"). You should still use a Production server in production. But the Flask dev server is fine in development.

In my case, I'm doing the following

os.environ.pop("FLASK_RUN_FROM_CLI")
app.run(debug=False, use_reloader=False)
raxod502 commented 2 years ago

But the Flask dev server is fine in development

I agree with you, however, the comment from @davidism says this:

This metrics server does not appear to be part of the application under development. It is its own, separate application. Therefore, it should be run using a production server, even locally

Which I'm interpreting to mean that using the Flask development server for something that is not the main Flask app, even in development, is not a supported use case by upstream. I have to confess I am not sure why, but if that's what we have to work with, it's what we have to work with.

reubano commented 2 years ago

Gotcha. In that case, how hard would it be to make the metrics server "part of the application under development"? Maybe something like, Run Multiple Flask Applications from the Same Server. Granted, I haven't looked into this codebase to know how applicable that will be.

My own reason for encountering the error discussed in this issue is that I needed to spawn a temporary server on a random open port to check a user's authentication. This temp server can run independently of the Flask app itself. I.e., to run the service normally, flask run. And to kick off an authentication job (whether or not the normal service is running) flask auth.

Here's a trimmed-down version of my code for reference:

from flask import Flask
from signal import pthread_kill, SIGTSTP
from threading import Thread

import requests

app = Flask(__name__)

def find_free_port():
    # implementation removed for brevity
    return 5643

@app.cli.command("auth")
def auth(name):
    port = find_free_port()
    kwargs = {"port": port, "debug": False, "use_reloader": False}
    thread = Thread(target=app.run, kwargs=kwargs)
    thread.start()

    url = f"http://localhost:{port}/auth"
    r = requests.get(url)
    refresh_token = r.json()["token"].get("refresh_token"):
    print(f"refresh token is {refresh_token}")
    pthread_kill(thread.ident, SIGTSTP)
raxod502 commented 2 years ago

how hard would it be to make the metrics server "part of the application under development"?

I am not sure this is possible.

Maybe something like, Run Multiple Flask Applications from the Same Server

That is for doing routing within a single WSGI application to proxy to multiple sub-applications by URL path. What we want is to run multiple top-level WSGI applications separately.

As far as I can tell, what the Flask project is saying is you can't use ~Flask~ app.run for this.

yrro commented 1 year ago

how hard would it be to make the metrics server "part of the application under development"?

Does this help? https://github.com/yrro/hitron-exporter/blob/4add40236f52042c381343fb3a696a6dead6cab0/hitron_exporter/__init__.py#L41

(I'm investigating an issue whereby /metrics gives me a 404 error when I run my app via flask run and the above solution appears to work for me...)

[edit] to clarify, when I run flask routes I do see a route for /metrics even though actually accessing it gives me a 404 error.

rycus86 commented 1 year ago

I had a look at your code and checked it with a minimal example:

from flask import Flask
from prometheus_client import make_wsgi_app
from prometheus_flask_exporter import PrometheusMetrics

app = Flask(__name__)
metrics = PrometheusMetrics(app)
metrics_wsgi_app = make_wsgi_app()

@app.get('/info')
def info():
    import os
    return {'response': 'ok', 'env': dict(os.environ)}

@app.get('/metrics')
@metrics.do_not_track()
def metrics():
    return metrics_wsgi_app

if __name__ == '__main__':
    app.run('0.0.0.0', 5000, debug=True, use_reloader=True)

This works with flask run with any combinations of --debug and/or --reload being present. When I checked how to make it work by default, I noticed https://github.com/rycus86/prometheus_flask_exporter/issues/4 that has tried working the opposite way. 😅 I also checked that just having DEBUG_METRICS=true enabled has the same effect, we'll let Flask add the built-in metrics endpoint on the app under development (like your solution if I understand correctly).

edit: I've pushed the example code here: https://github.com/rycus86/prometheus_flask_exporter/tree/master/examples/flask-run-135