open-telemetry / opentelemetry-python-contrib

OpenTelemetry instrumentation for Python modules
https://opentelemetry.io
Apache License 2.0

Add integration for Gunicorn #171

Open toumorokoshi opened 4 years ago

toumorokoshi commented 4 years ago

Starting a discussion around what an integration for Gunicorn would look like, and whether it's needed at all.

Today, we already have integrations with web frameworks directly (e.g. flask) as well as wsgi:

https://github.com/open-telemetry/opentelemetry-python/tree/master/ext

I presume that if we support the major gateway interfaces (WSGI and ASGI), plus finer-grained support per framework, everything should work fine with Gunicorn once it gains ASGI support (https://github.com/benoitc/gunicorn/issues/1380).

But wanted to get further thoughts.

Oberon00 commented 4 years ago

The main thing I can think of that gunicorn could provide us with is the initial start time of the request in the WSGI environment. We could then use that when starting the span, to include the time it took to dispatch the request to the application.
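
For illustration, a rough sketch of how middleware could consume such a timestamp, assuming it arrives in the WSGI environ (the header name, its format, and the middleware itself are assumptions, not an existing API):

    from opentelemetry import trace

    tracer = trace.get_tracer(__name__)

    class RequestStartMiddleware:
        """Start the request span when the request entered the server (sketch)."""

        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            # Hypothetical upstream timestamp in epoch seconds, e.g. "1700000000.123"
            raw = environ.get("HTTP_X_REQUEST_START")
            start_ns = None
            if raw:
                try:
                    start_ns = int(float(raw) * 1e9)  # start_time takes epoch nanoseconds
                except ValueError:
                    pass
            span = tracer.start_span(
                environ.get("PATH_INFO", "/"),
                start_time=start_ns,  # None falls back to "now"
            )
            with trace.use_span(span, end_on_exit=True):
                return self.app(environ, start_response)

A real middleware would also need to keep the span open across streaming responses; this sketch ends it as soon as the app returns.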

disfluxly commented 4 years ago

I think this would be highly beneficial as well. It's a very common model to run Gunicorn -> Flask, and it's not uncommon to see errors when Gunicorn spins up the Flask application factory (especially in multi-process setups). Typically this happens before opentelemetry-ext-flask gets initialized, so this integration would provide some great out-of-the-box insight.

hacst commented 4 years ago

Making gunicorn just work would definitely be great. Things I stumbled over so far:

ocervell commented 4 years ago

This would be beneficial indeed. Gunicorn is widely used to run Flask apps, but the Flask integration won't work well with gunicorn since things like request_count get a separate value per process, and we have to aggregate across processes ourselves. Usually this results in duplicated timeseries and increased cost.

Custom metrics: A big problem right now is counters across processes; it seems the only way is to have one separate counter per process, but we would like one counter shared across all processes.

Any ideas for workarounds there? Maybe if we use the OT Collector and do the aggregation there, we could achieve something like this?

Framework metrics: gunicorn only supports native instrumentation with statsd, by passing the statsd-* flags. If we can have OT act as a statsd receiver that forwards metrics, we're solving that part. Currently I'm using statsd-exporter to Prometheus, but would like a native OT way to do this.
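
For reference, gunicorn's statsd instrumentation is enabled from the config file (or via the --statsd-host / --statsd-prefix CLI flags); whatever statsd-compatible listener you run (statsd-exporter today, or a collector-side statsd receiver if available) just needs to be reachable at that address. Address and prefix below are examples:

    # gunicorn.conf.py: point gunicorn's built-in statsd metrics at a listener
    statsd_host = "localhost:8125"
    statsd_prefix = "myapp"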

toumorokoshi commented 4 years ago

Framework metrics: gunicorn only supports native instrumentation with statsd, by passing the statsd-* flags. If we can have OT act as a statsd receiver that forwards metrics, we're solving that part. Currently I'm using statsd-exporter to Prometheus, but would like a native OT way to do this.

AFAIK metrics are still in a bit of flux, so waiting might be a good choice, unless @lzchen has some thoughts there. I guess it is possible today to use the API exclusively to record measures, so if you're OK with some potentially backwards-incompatible API changes down the road, we could add them.
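
With today's stable metrics API, recording a measure through the API alone looks roughly like this (instrument name and attributes are illustrative, not a fixed convention):

    from opentelemetry import metrics

    meter = metrics.get_meter("gunicorn.app")
    request_counter = meter.create_counter(
        "http.server.request_count",
        unit="1",
        description="Completed HTTP requests",
    )

    # Record one request; attribute keys/values are examples
    request_counter.add(1, {"http.route": "/", "http.status_code": 200})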

lzchen commented 4 years ago

@ocervell It would be great if you could bring this up as an issue in the metrics SIG on Thursdays.

ocervell commented 4 years ago

@lzchen I'm in EMEA, so the meeting is very late for me. Could you please bring it up? I think we can find a good reference implementation in the Flask Prometheus exporter, which supports gunicorn directly; see the multiprocess section in their docs.

aabmass commented 4 years ago

I can talk about this in the meeting :)

lzchen commented 4 years ago

@ocervell No worries. Just an FYI, the meetings alternate between two different times from week to week, so if you are interested, I believe next week's might be at a more reasonable time for you.

jmacd commented 4 years ago

Sorry that we didn't get to this topic today. It presents some interesting questions: the idea of aggregating per-process metric events could definitely work, and ideally this is something the collector could do. It's also possible to imagine repurposing the OTel-Python metrics SDK to perform the aggregation itself, I think, but that raises some questions I haven't worked through myself.

trallnag commented 4 years ago

@ocervell The Prometheus Flask Exporter you pointed to does not implement multiprocess functionality itself; it is just a few convenience wrappers that still use the Prometheus client library in the background.
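
For context, the underlying prometheus_client multiprocess mode that those wrappers delegate to looks roughly like this (it merges per-process metric files from a shared directory named by the PROMETHEUS_MULTIPROC_DIR environment variable):

    from prometheus_client import CollectorRegistry, generate_latest, multiprocess

    def metrics_endpoint():
        # Build a registry that merges the per-worker metric files on each scrape
        registry = CollectorRegistry()
        multiprocess.MultiProcessCollector(registry)
        return generate_latest(registry)

    # gunicorn.conf.py hook: drop a worker's metric files once it exits
    def child_exit(server, worker):
        multiprocess.mark_process_dead(worker.pid)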

ocervell commented 4 years ago

Are there plans to support this on the roadmap yet?

srikanthccv commented 3 years ago

uWSGI and Gunicorn are widely used for production deployment. It would be great to have them supported.

github-actions[bot] commented 3 years ago

This issue was marked stale due to lack of activity. It will be closed in 30 days.

tonycody commented 2 years ago
opentelemetry-instrument \
      --traces_exporter console \
      --service_name "test" \
      --logs_exporter console \
      gunicorn -c app/gunicorn.conf.py --preload app:app

When gunicorn is used, the tracing is broken

ocelotl commented 2 years ago
opentelemetry-instrument \
      --traces_exporter console \
      --service_name "test" \
      --logs_exporter console \
      gunicorn -c app/gunicorn.conf.py --preload app:app

When gunicorn is used, the tracing is broken

Can you give more context, please?

TBBle commented 2 years ago

Presumably @tonycody's issue is the tracer batch-processor lock across forks, i.e. https://opentelemetry-python.readthedocs.io/en/latest/examples/fork-process-model/README.html

This ticket seems to overlap with #676, since presumably whatever Gunicorn integration is done would also support auto-instrumentation? Or maybe not: the initial integration could be a one-liner to call in the post_fork hook (i.e. what https://github.com/open-telemetry/opentelemetry-python/pull/2046 was documenting, but with a non-internal API), and then #676 would be resolved once opentelemetry-instrument was able to set up that one-liner in the child process and not attempt to instrument tracing in the parent process.
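
Condensed, the pattern those fork-process-model docs describe is the following (the exporter choice here is illustrative):

    # gunicorn.conf.py: re-initialize the SDK in each worker after fork, so the
    # BatchSpanProcessor's background thread and lock are not inherited from the parent
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

    def post_fork(server, worker):
        server.log.info("Worker spawned (pid: %s)", worker.pid)
        provider = TracerProvider()
        provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
        trace.set_tracer_provider(provider)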

oldium commented 11 months ago

I tested this with Python 3.11, 4 Gunicorn workers, the Uvicorn worker class (uvicorn.workers.UvicornWorker), FastAPI, and traces output to the console, and I can see the traces even without a post_fork hook. Actually, no specific modification was necessary; maybe something changed? I used auto-instrumentation in a Docker image with CMD ["opentelemetry-instrument", "gunicorn", "my_app.app:app"] (number of workers set in gunicorn.conf.py) and the following requirements.txt: opentelemetry-distro, opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-logging, uvicorn[standard], gunicorn, fastapi, pydantic.

hasland commented 11 months ago

@oldium Is it possible for you to share your code for the working gunicorn server? I'm trying to run it on k8s with auto-instrumentation, but I'm getting a lot of errors, or traces are never sent.

oldium commented 11 months ago

@hasland I have created a simple app at https://github.com/oldium/gunicorn-opentelemetry-test; it simulates how the K8s operator works (with the PYTHONPATH trick and dependencies provided by a sidecar image) and works with the console exporter. I also added the opentelemetry-exporter-otlp-proto-http dependency, so it might also work with the otlp exporter and the http/protobuf protocol (or the short-hand otlp_proto_http exporter), though I haven't tested that.