getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License

Performance powered by OTel #2251

Open sentrivana opened 11 months ago

sentrivana commented 11 months ago

Milestone: Performance Powered by OTel

Problem Statement

At the time of Sentry's initial performance product development, OpenTelemetry was in the nascent stages of its lifecycle and was not yet optimized for our requirements. Nevertheless, we maintained similarities in our data models and paradigms with OpenTelemetry. Since then, OpenTelemetry has significantly matured, passed the test of time, and has been generally available (GA) for over a year. It now boasts an extensive ecosystem of integrations spanning multiple technologies, including databases, queues, and protocols.

This maturity means that now is the time for us to rework our Python Performance Monitoring to use OTel under the hood. This way, we can leverage all the functionality from the OTel ecosystem, and overall better align with the broader ecosystem.

Key goals of this project

Non-goals

For the initial work, we want to strive to optimize for easy setup & usage. Exposing OTel internals & providing more hooks etc. for users to manually add OTel stuff may come at a later point. The public API of the SDK will remain the same for the prototype.

Misc

Internally, this project is known as POTel (from performance powered by OTel).

References

sentrivana commented 10 months ago

We need to check whether the following works (repurposed from https://github.com/getsentry/opentelemetry-demo/issues/12):

Product Features

What works out of the box? What doesn't? What works, but differently?

Note: See https://www.notion.so/sentry/SDKs-Performance-powered-by-OTel-POTEL-7f1900c5c1b04870bdc0f2cc8dd4d929?pvs=4#50f3fec4bef3402087ec035688329efa (internal link) for more in-depth description.

sentrivana commented 2 months ago

As we're looking into continuing with the project again, there are a few questions that have come up since we last worked on this.

sentrivana commented 2 months ago

A couple of things came up while trying to test the feature gap. Documenting them here for future reference.

Getting Things to Install & Run

OTel provides a package for each instrumentation, so there's e.g. opentelemetry-instrumentation-flask, opentelemetry-instrumentation-django, etc. There's also a way to install all available instrumentation packages at once by installing opentelemetry-contrib-instrumentations. However, this is broken in recent versions due to packaging issues with one of the instrumentation packages involved. So if we want to use the latest versions of everything, we need to install all the individual packages manually.

And we do in principle want to install the newest versions of everything -- for instance, the psycopg instrumentation was only added in the last release.

Instrumenting

We have two basic options for instrumenting programmatically.

  1. We call the instrument() method of each of the Instrumentors (e.g., FlaskInstrumentor, DjangoInstrumentor, etc.). This has the advantage of being able to pass in optional arguments to tweak the instrumentation (e.g., adding DB spans for Django).
  2. We use the OTel auto-discovery mechanism to figure out what should be instrumented. It's unclear whether there's a way to provide additional arguments to the individual instrumentations, since by default it'll call instrument() without any args.

Note that whichever way we choose, classes imported before sentry_sdk.init will still be left uninstrumented. The solution is either to require folks to import sentry_sdk and init it before anything else has been imported, which is not great, or to seek out and post-patch the already-imported classes.
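
To illustrate why import order matters, here is a minimal stdlib-only sketch. The lib module, handler function, and instrument helper are made up for the example; real instrumentations patch in a similar spirit but with more machinery.

```python
import types

# A stand-in for a third-party library with something worth instrumenting.
lib = types.ModuleType("lib")


def handler():
    return "original"


lib.handler = handler

# User code runs `from lib import handler` before the SDK is initialized.
# That binds the name to the original function object.
early_ref = lib.handler


def instrument(module):
    """Toy instrumentation: replace module.handler with a wrapper."""
    original = module.handler

    def wrapped():
        return "instrumented:" + original()

    module.handler = wrapped


instrument(lib)

# User code that imports after instrumentation sees the wrapper.
late_ref = lib.handler

print(early_ref())  # original
print(late_ref())   # instrumented:original
```

The early reference keeps pointing at the unpatched function, which is why the patching either has to run first or has to go looking for references that were already handed out.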

antonpirker commented 1 month ago

One idea to tackle the "classes imported before sentry_sdk.init will be left uninstrumented" problem:

What opentelemetry-instrumentation does is set the PYTHONPATH env variable to '.../lib/python3.10/site-packages/opentelemetry/instrumentation/auto_instrumentation:/home/anton/my_project'. In .../lib/python3.10/site-packages/opentelemetry/instrumentation/auto_instrumentation there is a file called sitecustomize.py. When the Python interpreter starts, it picks up this file, and the instrumentation is set up in it. This is a feature of Python (that I did not know about): https://docs.python.org/3/library/site.html#module-site

Maybe we can do the same and put our own sentry_sdk.init in a sitecustomize.py? (Then we could tell people to set the SENTRY_DSN env var and start their program, and everything just works(tm).)
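
A minimal sketch of what such a sitecustomize.py could contain, assuming the directory holding it has been put on PYTHONPATH. The init_sentry_from_env function is a hypothetical name, and a real version would likely need more care (e.g. chaining to an existing sitecustomize.py).

```python
# sitecustomize.py (sketch): imported automatically at interpreter startup
# when its directory is on the import path.
import os


def init_sentry_from_env(environ=os.environ):
    """Hypothetical helper: init the SDK early if SENTRY_DSN is set.

    Returns True if sentry_sdk.init was called, False otherwise.
    """
    dsn = environ.get("SENTRY_DSN")
    if not dsn:
        return False  # nothing configured, leave the program untouched
    try:
        import sentry_sdk
    except ImportError:
        return False  # SDK not installed in this environment
    sentry_sdk.init(dsn=dsn)
    return True


# Runs before any user code, so later imports would already be covered.
init_sentry_from_env()
```

With something like this in place, users would only set SENTRY_DSN and start their program through a wrapper that adjusts PYTHONPATH.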