Open phillipuniverse opened 2 years ago
Slightly more concrete use case:
application = FastAPI(
title="My API"
)
async def unhandled_exception_handler(request: Request, exc: Exception) -> JSONResponse:
logger.exception("Unhandled exception")
return JSONResponse(
status_code=HTTP_500_INTERNAL_SERVER_ERROR,
content=jsonable_encoder({"title": "Unhandled error"),
)
application.add_exception_handler(Exception, unhandled_exception_handler)
In this case the add_exception_handler
is hooked up as the error handler for ServerErrorMiddleware
. Because of this and because the OpenTelemetry middleware is installed as "user middleware", the logger.exception
invocation within the unhandled_exception_handler
will not have a trace id or span id as the span has already ended.
The workaround for now is to add an opentelemetry_pre_instrument
entrypoint. Here is my patching function:
import opentelemetry
import typing
from starlette.types import ASGIApp
from opentelemetry.instrumentation.asgi import OpenTelemetryMiddleware
from opentelemetry.instrumentation.fastapi import _InstrumentedFastAPI
from opentelemetry.instrumentation.fastapi import _get_route_details
def patch_middleware_ordering():
"""
Replaces the default OpenTelemetry instrumentation class with one that is assured to always be added to the very
beginning of the middleware stack, even before ServerMiddleware. Necessary until https://github.com/open-telemetry/opentelemetry-python-contrib/issues/795
is resolved
"""
class OpenTelemetryFirstMiddlewareFastApi(_InstrumentedFastAPI):
def add_middleware(self, middleware_class: type, **options: typing.Any) -> None:
if middleware_class == OpenTelemetryMiddleware:
return
else:
super().add_middleware(middleware_class, **options)
def build_middleware_stack(self) -> ASGIApp:
app = super().build_middleware_stack()
return OpenTelemetryMiddleware(
app,
excluded_urls=_InstrumentedFastAPI._excluded_urls,
default_span_details=_get_route_details,
server_request_hook=_InstrumentedFastAPI._server_request_hook,
client_request_hook=_InstrumentedFastAPI._client_request_hook,
client_response_hook=_InstrumentedFastAPI._client_response_hook,
tracer_provider=_InstrumentedFastAPI._tracer_provider,
)
opentelemetry.instrumentation.fastapi._InstrumentedFastAPI = OpenTelemetryFirstMiddlewareFastApi
Then you need an entrypoint for fastapi
under opentelemtry_pre_instrument
. In my case I am using Poetry so my pyproject.toml
looks like this:
[tool.poetry.plugins."opentelemetry_pre_instrument"]
fastapi = "mylibrary.tracing.contrib.fastapi:patch_middleware_ordering"
With this change I am able to see trace/span ids in logger invocations from ServerErrorMiddleware
.
I would like to contribute this back but the problem is I don't think this will work for the non-auto-instrument case which does add_middleware
:
This whole process is a little shady anyway. It seems a little strange to me that Starlette/FastAPI does not give first-class support to fully control the final middleware ordering.
Also let me just say, I have been really impressed with all of the extension points that I have seen within this project. It was really nice to be able to self-serve and unblock myself in a not as nasty way. Great work!!
Hi, is it possible to monkey patch it without poetry, trying to do in multiple ways, but OpenTelemetryFirstMiddlewareFastApi class is not used :/
@lskrajny-marcura depends on how you are kicking off the instrumentation process. The poetry part above is not important per se, the important part is that you add the monkey patching as an entry_point.txt such that it runs prior to any default open telemetry instrumentation.
Have a simple app with main.py where i am first creating FastApi/Starlette app and then manually adding all opentelemetry instrumentations.
Having patching on top of main.py seems "too late"
@phillipuniverse Do you foresee any repercussions for adding the tracing middleware first? Would it conflict with other middlewares?
Do you foresee any repercussions for adding the tracing middleware first? Would it conflict with other middlewares?
@lzchen it has not in my experience. I have been running this patch in production since November 9, 2021 with no issues.
I did have to make one change in FastAPI 0.92/Starlette 0.24 with the change to lazily-instantiated middleware but it was pretty straightforward (I'll update my answer above to include the change, it's not there currently).
In my scenario this change is crucial for us to be able to be able to annotate unhandled exceptions in our application with a trace id in our external error reporting system.
FWIW I would consider the main problem to be a FastAPI problem where it limits full control over the entire middleware list.
@phillipuniverse Makes sense to me. Would you consider creating a PR for this?
No problem!
Hi, this is still an issue even in fastapi==0.103.2
.
I am using both app.add_middleware(MyCustomMW)
and custom exception handlers
Here are my debugging steps, if it will help anyone understand the problem better
In your asgi instrumentation's __init__.py
I added two print statements to show exactly when the span starts and whether it is recording (i.e. in the with ..use_span
and if ..is_recording
)
With auto instrumentation I get these logs
[request begins]
------- MyCustomMW: current span = None
---- OTelMW: started span
---- OTelMW: recording
[other logs]
------- MyCustomMW: cleanup step, span = None
---- OTelMW: finally clause reached
And after using a patch similar to the one posted earlier in this issue:
otel_mw = find_otel_mw()
cls, options = otel_mw
app.user_middleware.remove(otel_mw)
stack = app.build_middleware(stack)
app.middleware_stack = cls(app=stack, **options)
Then I get:
[request begins]
---- OTelMW: started span
---- OTelMW: recording
------- MyCustomMW: current span = _Span(name...)
[other logs]
------- MyCustomMW: cleanup step, span = _Span(name=...)
---- OTelMW: finally clause reached
And just to inspect the middleware stack creation, I added prints to the various __init__
functions:
original:
init OpenTelemetryInstrumentation middleware
init MyCustomMW middleware
patched:
init MyCustomMW middleware
init OpenTelemetryInstrumentation middleware
The
Starlette
application class (which is the parent of theFastAPI
application class) hardcodes theServerErrorMiddleware
to appear as the very first middleware. This means that if there is an error in some downstream middleware, handling the request withServerErrorMiddleware
will not have an active trace.Also, if you happen to do something like this:
The additional middleware added after the first
__init__
ofFastAPI
will also not have an active trace established.There is a relatively easy way around this to ensure that the OpenTelemetry middleware is always first:
FastAPI version - 0.70.0 Starlette version - 0.16.0 Python version - 3.9