tornadoweb / tornado

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed.
http://www.tornadoweb.org/
Apache License 2.0
21.77k stars 5.51k forks source link

Misleading docstring in RequestHandler.prepare() #3430

Closed pawciobiel closed 1 month ago

pawciobiel commented 1 month ago

I think RequestHandler.prepare() docstring is misleading because it doesn't mention that the method has to be in SUPPORTED_METHODS - otherwise prepare() won't be called.

I suggest we change:

        Override this method to perform common initialization regardless of the request method.

to:

        Override this method to perform common initialization for request method.                                                                                     
        If the request method is not in ``SUPPORTED_METHODS`` this will not be called. 
pawciobiel commented 1 month ago

https://github.com/tornadoweb/tornado/pull/3431

bdarnell commented 1 month ago

It's true that the docs don't match the behavior here, but it's also worth looking at whether the current behavior is what we want. Why exactly was this an issue for you? Was it because your on_finish was expecting prepare to have been called? That's a reasonable expectation; I've said before (in #2711 and #517) that the intended rule was "if prepare() is called, then exactly one of on_finish() or on_connection_close() will be called".

There are some exceptions to that rule today. SUPPORTED_METHODS is one, but errors from decode_argument and check_xsrf_cookie do the same thing. (and in the other direction, the fact that some errors call on_connection_close instead of on_finish can be confusing). Other errors short-circuit the process earlier, often in subtle ways: decoding the URL query string happens in two phases (first as bytes, and then as characters), and an error in the second phase will call on_error while an error in the first phase will not.

So some things we could do are:

My inclination is to not provide too many specifics about prepare and errors, and to only call on_finish (and on_connection_close) if prepare was called, but I'd like to hear more about what kind of problem you encountered that prompted this issue.

pawciobiel commented 1 month ago

Hello @bdarnell, thank you for prompt reply.

Was it because your on_finish was expecting prepare to have been called?

yes , precisely that. I have some additional metrics setup in prepare and I use it and do some cleanup in on_finish.

The code I work on has been written a while ago and changed by a number of people along the way and as a result the OO structure / inheritance of handlers is in a bad shape. Occasionally I need to restrict access to request methods in a child Handler. It's convenient to use SUPPORTED_METHODS for that.

Document when prepare() will and will not be called. I think that giving the full details of which errors are handled which way is too much minutiae and I'd rather just give a weaker guarantee like "if an HTTP method is called, prepare will be called before it. On errors, prepare is not guaranteed to be called".

I agree, documenting all the side effects is just impossible...

Call prepare in more cases, for example even when the SUPPORTED_METHODS check fails. This seems kind of silly to me since we know at this point that the request is going to fail, so why go through a prepare/on_finish cycle?

sure, it makes no sense

Call on_finish in fewer cases, so that it is called if and only if prepare has been called.

I know that I have code that really on on_finish to clean up things that are setup in initialize additionally to cleanup things that are setup in prepare. And in the past I had problems with memory leaks if things weren't clean up in on_finish (or on_finish was not called).

Another option could be to remove prepare completely (I guess doing all the setup in initialize may not be ideal for some cases...) or alternatively add another method - an antagonist to prepare (Less code is usually less problems and I don't seem to like any of antonyms proposed by thesaurus... ;-))

Thanks a lot for the clarification and all of your work on tornado!

bdarnell commented 1 month ago

I know that I have code that really on on_finish to clean up things that are setup in initialize additionally to cleanup things that are setup in prepare.

OK. Cleanup from initialize() is not something we have a solution to right now. initialize() is roughly equivalent to __init__(), so cleanup (if any) is handled by __del__() (with all its drawbacks). (Why does the initialize method exist? Just because it's annoying to remember the application, request arguments to __init__ and to be sure to call super(). I've been tempted to define def initialize(self, **kwargs): [self.setattr(k, v) for (k, v) in kwargs.items] and then discourage overriding it).

Another option could be to remove prepare completely (I guess doing all the setup in initialize may not be ideal for some cases...)

initialize() is not a substitute for prepare() because it happens too soon, before we can cleanly send a response. prepare can return a 403 error (for example), but any errors in initialize just get a generic error response. We're closer to removing initialize than prepare, I think (although I wouldn't actually remove either).

Our most complete documentation on this is at https://www.tornadoweb.org/en/stable/guide/structure.html#overriding-requesthandler-methods, which is probably not the best location, and it's also incomplete (this is what #2711 is about). But I think the best approach is to ensure that this page is complete and increase its visibility with links from all the relevant methods.

pawciobiel commented 1 month ago

ok, thanks again for the clarification.