falconry / falcon

The no-magic web data plane API and microservices framework for Python developers, with a focus on reliability, correctness, and performance at scale.
https://falcon.readthedocs.io/en/stable/
Apache License 2.0
9.51k stars 937 forks source link

feat: provide a .app attribute for generic middleware #1960

Open adriangb opened 2 years ago

adriangb commented 2 years ago

Other frameworks do this. Basically you add a level of indirection to __call__:

class App:
    def __init__(...):  self.app = self.default_call
    def default_call(...):
        # contents of current __call__
        ...
    def __call__(...):
        self.app(...)

This provides support for generic WSGI/ASGI middleware since users can just app.app = Middlware(app.app) as many times as they want, but still keep the rest of the falcon.App API.

vytas7 commented 2 years ago

Hi @adriangb ! Normally, Falcon is known to have a disdain for multiple levels of indirection. Could you provide a more elaborate example what usage scenarios this pattern would enable that are not possible using the classical WSGI/ASGI composition of callables?

adriangb commented 2 years ago

Hi, thanks for the quick reply.

I'm thinking of a situation like the OpenTelemetry instrumentation.

Currently, it monkey patches falcon.App and replaces it with a subclass that wraps __call__. This obviously has drawbacks. There was discussion around adding the ability to wrap an instance (which resolves most of the problems), and that is how they do it for Flask.

Presumably, this could be implemented by just doing app._original_call = app.__call__;app.__call__ = wrapper_factory(app.__call__, ...) but I find I'd find it a bit more tasteful if there was a documented public attribute for that purpose (the App.app attribute I'm proposing here).

vytas7 commented 2 years ago

Aha, I see, let us think about this.

FWIW, this couldn't even be implemented just by doing app.__call__ = something etc, because in Python, an instance's __call__ is not used when calling the object; only the respective class's __call__ is.

OTOH, if this is strictly for instrumentation, let us also discuss alternative ways to achieve the same effect. Maybe we're a bit too paranoid about microbenchmarks and tuning efficiency, but I would be reluctant to add an extra function call for users that don't need that. Linking that issue as well: https://github.com/falconry/falcon/issues/1828

CaselIT commented 2 years ago

I don't think there is a way of implementing this without adding a function call, because replacing __call__ at the instance level does not work

adriangb commented 2 years ago

I thought that patching would work on the instance level, maybe with some MethodType magic. But I don't like that solution anyways, so moving on.

I do think adding a single method call won't do any harm, even for microbenchmarks. Falcon is pretty usable as is, but IMO worrying about that level of optimization while using Python is pointless.

I think this is valuable beyond instrumentation. There's plenty of useful ASGI and WSGI middlewares out there that this would enable, while having basically no performance impact on other users.

vytas7 commented 2 years ago

I think this is valuable beyond instrumentation. There's plenty of useful ASGI and WSGI middlewares out there that this would enable, while having basically no performance impact on other users.

Well, the canonical way of applying WSGI middleware is by having it to recursively wrap the provided callable, see also: PEP 3333 Middleware: Components that Play Both Sides; I'm still not sure why indirection is needed for that.

adriangb commented 2 years ago

If you do this (which I think is the "cannonical way"):

app = App()
app = Middleware(app)

Now you lost the refence to your Falcon App since app is the middleware, which could even be a closure or something.

Of course you could just keep 2 references, but that's confusing and completely unergonomic for a library (instrument_app(app: falcon.App) -> Tuple[falcon.App, WSGIApp]). It can also only be done once, without making it even more unergonomic. The proposal here on the other hand can be done infinitely.

vytas7 commented 2 years ago

FWIW, Starlette doesn't seem to do this either, although it does add itself to the ASGI scope (== WSGI environ) as app (but that only happens after the outer middleware is called; and it has an indirection via a middleware stack, but that's in a sense comparable to Falcon's middleware stack.

adriangb commented 2 years ago

Yes, but they do provide a public API for adding generic ASGI middleware, so it's largely unecessary

vytas7 commented 2 years ago

I see. I'd still like to discuss various ways of implementing this, including having support for an external middleware stack like Starlette does it, also, by providing WSGI/ASGI utility functions to compose applications, as well as having an optional subclass like InstrumentedApp or w/e for the instrumentation case.

This is also related (but not limited to) to the following issues:

So when considering alternative patterns to improve the instrumentation case, we should also have the above interactions in mind.

adriangb commented 2 years ago

I do think it makes sense to think big picture.

But this can serve as the basis for an external middleware stack: you can (later on) add a method that does the iterative wrapping for users similar to how Starlette does things.

I'm not sure I understand how it relates to submounts, that seems like an orthogonal feature to me, but I may be missing something

CaselIT commented 2 years ago

I do think adding a single method call won't do any harm, even for microbenchmarks. Falcon is pretty usable as is, but IMO worrying about that level of optimization while using Python is pointless.

I was just stating a consideration.

Yes, but they do provide a public API for adding generic ASGI middleware, so it's largely unecessary

I'm unfamiliar with starlette, you mean that you could just call app.add_middleware() on the application with the 3rd party middleware?

Maybe we could have something similar also for falcon, but I guess the biggest issue would be how to differentiate if from falcon middleware, aka naming things is hard

adriangb commented 2 years ago

I was just stating a consideration.

Understood, sorry if I was a bit harsh. It's a valid consideration.

you could just call app.add_middleware() on the application with the 3rd party middleware?

Yes. Their middleware is generic ASGI middleware. They provide a thin wrapper to allow middleware writers (users or other libs) to write against a request response API: https://github.com/encode/starlette/blob/master/starlette/middleware/base.py

So add_middleware only ever accepts a generic ASGI middleware

vytas7 commented 2 years ago

(Semi off-topic)

I'm not sure I understand how it relates to submounts, that seems like an orthogonal feature to me, but I may be missing something

I meant that "mounting" can also be seen as composition of two or more WSGI (or ASGI) apps based on the URI, path, etc, not dissimilar to the WSGI recipe here: How do I split requests between my original app and the part I migrated to Falcon?

CaselIT commented 2 years ago

Maybe something like this, similar to starlette but requires replacing the app instance:

diff --git a/falcon/app.py b/falcon/app.py
index 193eed63..ae9aefaf 100644
--- a/falcon/app.py
+++ b/falcon/app.py
@@ -870,6 +870,11 @@ class App:

         self._serialize_error = serializer

+    def wrap_middleware(self, middleware_class, *arg, **kw):
+        middleware = middleware_class(self, *arg, **kw)
+        assert callable(middleware)
+        return ProxyApp(self, middleware)
+
     # ------------------------------------------------------------------------
     # Helpers that require self
     # ------------------------------------------------------------------------
@@ -1116,3 +1121,35 @@ class API(App):
     )
     def __init__(self, *args, **kwargs):
         super().__init__(*args, **kwargs)
+
+
+class ProxyApp:
+    __slots__ = ('_original_app', '_middleware_stack', '_call')
+
+    def __init__(self, original_app, initial_middleware):
+        self._original_app = original_app
+        self._middleware_stack = [initial_middleware]
+        self._call = initial_middleware
+
+    def __call__(self, env, start_response):
+        return self._call(env, start_response)
+
+    def wrap_middleware(self, middleware_class, *arg, **kw):
+        middleware = middleware_class(self._call, *arg, **kw)
+        assert callable(middleware)
+        self._middleware_stack.insert(9, middleware)
+        self._call = middleware
+        return self
+
+    def __getattr__(self, key):
+        if key in self.__slots__:
+            this = self
+        else:
+            this = self._original_app
+        return getattr(this, key)
+
+    def __setattr__(self, key, value):
+        if key in self.__slots__:
+            object.__setattr__(self, key, value)
+        else:
+            setattr(self._original_app, key, value)

example:

from dataclasses import dataclass
from falcon import App

class Res:
    def on_get(self, req, res):
        res.media = {'ok': True}

app = App()
app.add_route('/foo', Res())

@dataclass
class M:
    app: App
    name: str

    def __call__(self, env, start_response):
        print(f'{self.name} start')
        v = self.app(env, start_response)
        print(f'{self.name} done')
        return v

app = app.wrap_middleware(M, 'm1').wrap_middleware(M, 'm2')
app.add_route('/bar', Res())

the main issue is that it fails the isinstance call

adriangb commented 2 years ago

Yeah something like that. I may be missing something, but can't it be as simple as:

class App:
    def __init__(self) -> None:
        self.app = self.handle
    def handle(*args) -> None:
        # current __call__
    def __call__(*args) -> None
        self.app(*args)
    def add_middleware(self, middleware) -> None:
        self.app = middleware(self.app)
CaselIT commented 2 years ago

I may be missing something, but can't it be as simple as:

I was trying to provide a zero cost support when it's not used

adriangb commented 2 years ago

I was trying to provide a zero cost support when it's not used

Ah I see, what you have is pretty slick then. I guess performance (theoretically) would then be worse if the feature is used, but maybe that doesn't matter.

I still do feel that 1 extra method call is not going to move the needle on performance and I think it would be less surprising for users, type checkers & devs.

CaselIT commented 2 years ago

The main thinking was to avoid changing the current version that we know is working, more than performance.

I think it would be less surprising for users, type checkers & devs.

on this I agree, the proxing as proposed above is sub-optimal, it also requires replacing the app, but I guess that's not the works thing.

CaselIT commented 2 years ago

Note that the one above is just a 5 minutes test, it may well be that the use of a subcall is the best solution in the end