Open njsmith opened 6 years ago
An additional benefit here is that if people switch around from serve_tcp
, to serve_ssl_over_tcp
, to serve_listeners
, I imagine it's not uncommon to accidentally send the wrong parameters into the wrong function. Presumably when that happens now, you either get a too many/too few parameter exception or I'm assuming duck typing takes over and you get some weird error when you try to use an integer as a listener or something like that.
However, with mandatory kwargs, you get much better error messages, something like: TypeError: serve_listeners() got an unexpected keyword argument 'port'
instead. That seems like an additional win to me.
I agree that this change would be beneficial. … which reminds me that I should build a new release …
But I do think we can do a regular deprecation warning
By this, do you mean this?
1) first, make a release that issues a DeprecationWarning when serve_* are called without kwargs 2) at some point, make a release with only mandatory kwargs 3) only after that point add positional args
Or do you want to add positional args straight away, with the positional confusion, and in the same release document serve_* other args to be keyword only and issue DeprecationWarnings?
The ideal deprecation cycle is: there should be some period of time where the old thing works the same way it used to, and the new thing works the way it will work in the future, and the old thing gives a warning.
So in this case, we'd want to have some time where serve_tcp(handler, port)
works the same way it does now while issuing a warning, and serve_tcp(handler, port=port)
also works, but doesn't issue a warning. Implementing this is a little awkward, but definitely doable.
I guess we could add positional args straight away, with the proviso that they only work if port=
is also passed. So serve_tcp(handler, blah, port=80)
means serve on port 80 and then call handler(stream, blah)
, but serve_tcp(handler, blah)
issues a warning and then does serve_tcp(handler, port=blah)
. Something like:
_not_given = object()
async def serve_tcp(handler, *args, *, port=_not_given, ...):
if port is _not_given:
if len(args) != 1:
raise TypeError
warn_deprecated(...)
(port,) = args
args = ()
# Now we can act like we're in the future case
...
Thanks for the clarification! Makes perfect sense to me. I support the change.
Any other part of the API that still doesn’t use the same pattern? I don’t remember any.
Here's a mild annoyance that happens if we make port
and friends become mandatory kwargs instead of positional args – it becomes impossible to start a background server without using partial
:
# Before
await nursery.start(serve_tcp, handler, port)
# After
await nursery.start(partial(serve_tcp, handler, port=port))
This isn't a huge showstopper or anything, but I missed it before, and it is a bit annoying given how commonly these two operations are combined. And that the original reason @dstufft suggested making this change was that he wanted to avoid using partial
:-). In fact for those actually using the new *args
support, I guess the relevant comparison might be:
# Before
await nursery.start(serve_tcp, partial(handler, *handler_args), port)
# After
await nursery.start(partial(serve_tcp, handler, *handler_args, port=port))
FWIW I really really like keyword-only arguments, and would be happy to see them basically anywhere in Trio. They're harder to accidentally misuse, easier to compatibly change later, and clearer when reading calling code! If it wasn't for the whole "invalid on Python 2" thing, I'd use them for practically everything - IMO two positional arguments is a sensible limit 😄
I've also argued for more use of partial(...)
before, so that doesn't bother me and I find the after version more elegant and explicit in the second case.
Looking at this again, I noticed another piece of potential awkwardness: if we make it so serve_tcp(handler, *args)
ends up calling
await handler(stream, *args)
...then that means that serve_tcp(handler, *args)
will have a different meaning than serve_tcp(partial(handler, *args))
, because the latter will eventually run
await handler(*args, stream)
OTOH, @dstufft's point above is still pretty compelling – that if you have a wrapper, you can write async def handler_wrapper(stream, *args)
, but async def handler_wrapper(*args, stream)
does not do what you want.
Options to consider:
stream
first, *args
second: convenient for wrappers, inconsistent with partial
*args
first, stream
second: consistent with partial, but inconvenient for wrappersstream
as a kwarg: tempting! but... I've been wondering if we should make Listener
a generic class, so you can have Listener[Stream]
, Listener[Channel]
, Listener[WebSocket]
, etc. And it's hard to pick a single name that's going to feel natural across all of these. conn
? incoming
? peer
?stream
first, *args
second, AND if the fn
is a functools.partial
object, then inject the stream
at the front of the saved *args
: pretty abstraction-breaking*args
, make people use partial
if they want *args
OR **kwargs
: this is what we do now. It works.I remain very fond of partial
- it always works, doesn't break abstractions, and is back-, forward-, and sideways-compatible.
Couple of thoughts:
I want to do foo(port)
or foo(port=80)
. I prefer not to do foo(port=port)
. I would probably push back against foo(80)
in a code review.
When it's a variable named something clearly self-descriptive like port
, passing it positionally is on balance a little better - when it is a number, passing it keywordally is much better.
So if I had to sacrifice one to enforce the other, I would side with keyword-only.
But in practice, almost all code I've ever written that dealt with ports ends up with the port in a variable named port
or {{ something }}_port
. The only exceptions are off-the-cuff scripting situations and scripts so tiny that they just use argv
values directly without running them through argparse
(or docopt
, which is much nicer to use but not in the standard library).
So arguably making it keyword-only would make it both worse in the common case where the code is going to be read again, and worse in the common case where the code is not going to be read again.
Don't worry about "inconsistent with partial
", that's not as bad as it seems.
It's worth thinking about when you actually want to call a handler with a signature stream, *args
or *args, stream
. Usually your handler will have a signature stream, foo, bar
or foo, bar, stream
- the order is not a big deal in that case.
So my intuition is that most cases actually end up in code where all arguments can be individually pulled apart by name, and that the overwhelming majority of cases can be written that way.
About this:
# Before await nursery.start(serve_tcp, partial(handler, *handler_args), port) # After await nursery.start(partial(serve_tcp, handler, *handler_args, port=port))
Both of these take me enough mental effort to parse from a cold read (not warmed up to the patterns common in Trio or these functions specifically) that my reflex would be to push back against either.
Here's what I would want to do instead:
handler_ = partial(handler, *handler_args)
serve_tcp_ = partial(serve_tcp, handler_, port)
await nursery.start(serve_tcp_)
Yeah, it's more annoying to write. But it's very lightweight to read!
Yeah it is visually more stuff. But it is simpler stuff that requires less mental load and precise visual work to rigorously parse. Yeah it might feel more complex in that you have to parse and combine two partials in your mind. But that complexity is objectively there and we need to mentally drag it out of the other forms too to rigorously step through what exactly the code does in our mind. This form just forces it to be explicit.
There is a composable way that people can use to skip writing partial
at every call site!
def handler(...):
...
handler = partial(partial, handler)
# or to name the pattern:
def partialize(function):
return partial(partial, function)
# then you can use it as a decorator too
@partialize
def handler(...):
...
Then your example use case can look like this:
await nursery.start(serve_tcp, handler(*handler_args), port))
You can also use curry
from a library like functools
or funcy
, but this "partialize" thing does not require any dependencies, is a one/two -liner, partializing does not struggle with variadic functions like currying does, and does not require people to wrap their heads around currying.
Oh I totally missed replying to this one:
- Pass the stream first, args second, AND if the fn is a functools.partial object, then inject the stream at the front of the saved args: pretty abstraction-breaking
Understatement. You probably already see what I am about to say, but for anyone who doesn't:
This would break partial
in a profound way, completely violating what anyone used to partial application would expect.
I literally would never have thought to worry about a function doing this to me if I passed a partially applied function in, until I read this. Now I won't be able to sleep at night. "Nothing is certain... Nothing can be trusted..." I mutter as I sit curled up in the corner, rocking back and forth, terror in my wide and unblinking eyes.
If I ever hit a function like this in the wild I would immediately code my own implementation of partial
just so that I could always know no other code could be written to detect and modify my arguments out from under me.
This would be like if we overloaded a substraction operator so that a - b
actually computed b - a
!
By the way consider:
reverse/from-the-right partial application (see funcy.rpartial
; in other languages I have also usually seen it called partialRight
or the like), so the caller can just do:
handler_ = rpartial(handler, *handler_args)
serve_tcp_ = partial(serve_tcp, handler_, port)
await nursery.start(serve_tcp_)
composable ways of reversing/flipping the signature of a callable such as:
def flip(function):
def flipped(*args, **kwargs):
function(*reversed(args), **kwargs)
return flipped
handler_ = partial(flip(handler), *handler_args)
serve_tcp_ = partial(serve_tcp, handler_, port)
await nursery.start(serve_tcp_)
Notice the elegant separation of concerns: you just worry about calling the callable with exactly the arguments you want to pass, in the positions/keywords it makes sense to pass them in, your caller worries about how their functions actually get those arguments, and a few simple composable functions take care of doing just that one job well.
TL;DR: Modifying the order of passed in partial
objects' arguments harms the very benefits that partial
brings.
Wait I just realized! You can just write a generic wrapper that lets you give all ~Trio~ functions basically this behavior!
def with_args_to_handler(function):
def wrapper(handler, *args, **kwargs):
return function(partial(handler, *args), **kwargs)
return wrapper
Then you can just do:
serve_tcp = with_args_to_handler(trio.serve_tcp)
And then call it as you wanted:
serve_tcp(handler, foo, bar, port=80)
You can change the with_args_to_handler
wrapper to use an rpartial
implementation if you want the *args
to go in the back, but really I don't think it actually matters... So what if your signature is def handler(foo, bar, stream)
instead of def handler(stream, foo, bar)
? And I don't think it is ever necessary or even ever best to have a handler with the signature def handler(stream, *args)
.
Taking that last idea further: you could even provide such wrappers with Trio.
You could even provide a file within Trio that people could import, which has all the Trio functions wrapped like this.
I mean it needs a little more work for class methods or whatever, but you could generalize this to the point that Trio's code never needs to pass through *args
or **kwargs
, and yet that behavior is within reach for those who want it.
Then instead of a bunch of case-by-case design decisions, having to try to balance the wants of people who want to pass through *args
and people who want to pass through **kwargs
and people who want to not type partial
, and the need to have arguments that you don't pass through, and how to order them, you can just have a couple of wrappers that implement a few of these generic, sensible variants. And optionally provide them out of the box, or just document them in a "recipes" section somewhere.
Even if you leave places that already follow the "*args
for function, **kwargs
for our options" as-is because of convenience or inertia, it's a general strategy you can start using to solve this problem elsewhere.
And you can use the same strategy to solve the "but I wanna pass **kwargs
without typing partial
!" too:
def with_kwargs_to_handler(function):
def wrapper(handler, *args, **kwargs):
return function(partial(handler, **kwargs), *args)
return wrapper
def with_args_and_kwargs_to_handler(function):
def wrapper(handler, *args, **kwargs):
return function(partial(handler, *args, **kwargs))
return wrapper
But probably give them nicer shorter names, like pass_args
, pass_kwargs
, pass_args_kwargs
.
Edit: Note that you can't do these wrappers as generically and composably in the other direction! For every *arg
you have to convert into a **kwarg
and every **kwarg
you have to convert into an *arg
you have to specify the arguments by name. But if the base function just takes a callable and just the arguments it does not pass through, then these wrappers can be perfectly general.
Now a concise final reply to the main question: I slightly recommend against, because
Accepting port
as a positional argument reduces stuttering when the port is in a properly named variable, which it often ought to be.
p = functools.partial
is a thing people can do to reduce typing in cases where it matters enough.
Partializing or currying is trivial to do in Python and also eliminates partial
at the call sites.
Separate explicit partial applications scale better in most respects other than writing them.
You can just write a generic wrapper that gives all Trio functions this behavior composably.
Apparently @dstufft finds it annoying that you can't pass positional arguments through
serve_tcp
. He has a point! It is weird, when basically every other function like this in trio does allow you to pass-through positional arguments.There are two issues. First, say we pass
handler
and*args
toserve_tcp
. Does this ultimately call:or does it call
? @dstufft suggests that it should call
handler(stream, *args)
, because this allows you to easily write code like:which is much more awkward with the
handler(*args, stream)
form. That seems fairly convincing, though I'm still a bit nervous that people will be confused. (Even if one option is better once you think about it a lot, that doesn't mean it's obvious to a new user!)The other issue is that if we added this right now, we'd have calls like:
It's weird how the the final call gets assembled by taking the first argument, and then skipping some arguments, and then taking the rest.
Here @dstufft also points out that we could make these kwargs, so it'd be:
This is... pretty compelling actually, even on its own merit. Trio's general rule is that in
runner(callable, *args, **kwargs)
, the positional args belong tocallable
, and the kwargs belong torunner
. Movingserve_*
's configuration arguments to kwargs would make them much more consistent with the general rule. Plus writingserve_tcp(handler, 80)
has always kind of bothered me (what's that random magic constant?). Butserve_tcp(handler, port=80)
looks quite nice. I'm actually not sure why I didn't do it this way in the first place. Maybe because I'm still stuck in the past and forgot we can actually use mandatory kwargs now?If we do implement both of these changes then it'd be a kind of disruptive transition, where every
serve_tcp
call needs to convert fromserve_tcp(handler, port)
→serve_tcp(handler, port=port)
. But I do think we can do a regular deprecation warning, so it's not wildly out of line with other API adjustments we've made.So I'm... tentatively kind of thinking this might be a good idea? Not entirely sure yet. What do y'all think?