Neoteroi / BlackSheep

Fast ASGI web framework for Python
https://www.neoteroi.dev/blacksheep/
MIT License
1.8k stars 75 forks source link

request.scope["root_path"] not set #490

Open ohait opened 3 months ago

ohait commented 3 months ago

Apologies if this is misleading or plain wrong, I'm still figuring out a lot about blacksheep. I am trying to generate some metrics on the requests coming to my service, and we use path templates like /get/:id. as of now, the prometheus middleware uses request.url.path.decode("utf-8") but given the above template, it will generate a lot of cardinality. Looking at ASGI docs, i expected request.scope["root_path"] to contains the original /get/:id path, instead of the url in the request, but it's always empty, while request.route_values is correctly set.

I tracked down the place where the routing take places here: https://github.com/Neoteroi/BlackSheep/blob/026897ff2796c2f2074cb368ff9c59a262dc28ad/blacksheep/baseapp.pyx#L78

is there a missing assignment for the request.scope? or am i completely off and there is an easier way?

ohait commented 2 months ago

we partially mitigate by checking the router again and using the matched route, but it's definitely suboptimal

RobertoPrevato commented 2 months ago

Hi @ohait BlackSheep does not alter the scope it receives from the ASGI server and never tries to set a root_path. AFAIK, the root_path in ASGI is thought for scenarios when we publish web applications behind proxies, when the proxy server directs requests under a specific "root path" towards a back-end that is not necessarily concerned with that URL fragment.

Sorry I read again your issue about cardinality and updated this comment. I need to think about this, how to log the path that caused the route match without looking for the matched route. I had the same issue in some of my closed source projects.

RobertoPrevato commented 2 months ago

Ok I found how I did it.

Disclaimer: I always avoid adding performance fees for specific situations, adding code that would execute for each web request in all cases, even when not needed - I follow an opt-in approach.

To keep track of the route that caused a match for logging purpose, I wrapped the get_match method of the application router, upon application start:

    def wrap_get_route_match(
        fn: Callable[[Request], Optional[RouteMatch]]
    ) -> Callable[[Request], Optional[RouteMatch]]:
        @wraps(fn)
        def get_route_match(request: Request) -> Optional[RouteMatch]:
            match = fn(request)
            request.route = match.pattern.decode()  if match else "Not Found"  # type: ignore
            return match

        return get_route_match

    app.router.get_match = wrap_get_route_match(app.router.get_match)  # type: ignore

And then organized logs by request.route.

If you don´t like wrapping methods in Python because you find it ugly, you can define a specific Router class and set it in the application.

from blacksheep import Application, Router
from blacksheep.messages import Request
from blacksheep.server.routing import RouteMatch

class TrackingRouter(Router):

    def get_match(self, request: Request) -> RouteMatch | None:
        match = super().get_match(request)
        request.route = match.pattern.decode() if match else "Not Found"  # type: ignore
        return match

app = Application(router=TrackingRouter())

@app.router.get("/*")
def home(request):
    return (
        f"Request path: {request.url.path.decode()}\n"
        + f"Request route path: {request.route}\n"
    )

PS. if you also find ugly to attach additional properties to the request object, I recommend to consider using a WeakKeyDictionary to store additional information about the request object, like in this example:

import weakref

from blacksheep import Application, Router
from blacksheep.messages import Request
from blacksheep.server.routing import RouteMatch

requests_routes = weakref.WeakKeyDictionary()

class TrackingRouter(Router):

    def get_match(self, request: Request) -> RouteMatch | None:
        match = super().get_match(request)
        requests_routes[request] = match.pattern.decode() if match else "Not Found"
        return match

app = Application(router=TrackingRouter())

@app.router.get("/*")
def home(request):
    return (
        f"Request path: {request.url.path.decode()}\n"
        + f"Request route path: {requests_routes[request]}\n"
    )

Additional care is needed if you use sub-routers.