Open kennethreitz opened 6 years ago
Hmm, this has come up a few times, and it's a totally reasonable question – I think we need some sort of FAQ entry for this... maybe in the tutorial? Will think about it.
Anyway, here's the problem: a function like run has to take the thing to run, and it also has to take its own arguments to configure how it runs it. So we need some way to distinguish which arguments are for which, but all the possible solutions involve some kind of awkwardness.
BTW, in ordinary usage you wouldn't use trio.run to call request, that wouldn't make much sense – you normally call trio.run once at the very top of your program, so it's usually something like trio.run(main). (I guess you're looking at making synchronous shim wrappers or something?) But there are a bunch of functions in trio that have this same problem: nursery.start_soon, run_sync_in_worker_thread, etc. etc., so we do need a general solution.
What we do is recommend people do from functools import partial and then trio.run(partial(request, method, url, pool=my_pool)) or whatever. I'm not a huge fan, but all the alternatives seem worse. Some advantages of this are: (a) it works uniformly for async and sync callables, (b) it's ordinary standard Python, so once you learn it you can use it everywhere, with or without trio, (c) partial objects support introspection, so e.g. if you start a task like this and then later use trio's debugging interfaces to print out the task tree, then you'll see that trio has figured out that this task is running a function named request, not one named partial or <lambda> or something, (d) it allows extra arguments to be added, which is kind of a niche thing but important for nursery.start (note: different from nursery.start_soon), (e) the syntax really isn't that bad, and in particular supports all of Python's normal funcall syntax like partial(request, *args, extra_arg, something=whatever, **my_kwargs). The left edge looks a little funny, but otherwise it's using function call syntax to represent a function call.
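A minimal sketch of the pattern and of the introspection point in (c), using only the standard library. The run function, its clock option, and request below are hypothetical stand-ins for illustration, not trio's real API.

```python
from functools import partial


def run(fn, *args, clock=None):
    """Hypothetical stand-in for a trio-style runner: positional args go to fn,
    keyword args configure the runner itself."""
    # Unwrap partial objects to recover the underlying function's name,
    # similar in spirit to what a task tree printer would want to do.
    target = fn
    while isinstance(target, partial):
        target = target.func
    return target.__name__, fn(*args)


def request(method, url, pool=None):
    # Hypothetical function we want to run with a keyword argument.
    return (method, url, pool)


task_name, result = run(partial(request, "GET", "https://example.org", pool="my_pool"))
print(task_name)  # request
print(result)
```

The partial object keeps the wrapped function and its arguments available as .func, .args and .keywords, which is what makes the name lookup possible; a lambda would hide all of that.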
Rejected alternatives include:
YOLO it up: run takes its arguments, any unrecognized arguments are passed on to the called function. But if we did this, then we could never add any new kwargs to run (or similar functions) without breaking backcompat, which isn't workable in a foundational library like trio that needs to (eventually) be super careful about backcompat.
Reserve underscored kwarg names as being for trio.run, and the rest get passed on to the function being called: this is kinda gross, and doesn't really solve the problem, because there's nothing that says an arbitrary function can't take kwargs that start with underscores. In particular, by implementing this we would be defining a dozen such functions! And some of them you might want to pass to each other in unusual-but-they-happen situations.
Use a special argument for arguments to run, like def run(fn, *args, run_args={}, **kwargs): this is super awkward to use and document (the signature no longer tells you what arguments you can pass in run_args), plus still has the issue of potential kwarg name collisions (what if fn has a run_args argument).
Use a special argument for kwargs for the function, like def run(fn, *args, fn_kwargs={}, ...): this is uglier to use and read than partial (since now you have to write kwargs using dict syntax, while partial uses call syntax), and it's still an idiosyncratic special trick we'd be forcing people to learn. If we're going to make people learn an idiosyncratic trick anyway, partial is much more generally useful.
Allow raw coroutine objects, like run(fn(*args, **kwargs), ...): it doesn't solve the problem for variants that take sync functions like run_sync_in_worker_thread, it doesn't work for more sophisticated nursery-like objects that need to control how they run the function passed to nursery.start_soon (think: Erlang-style supervisors that can restart services after a crash), and it would make it impossible for us to ever fix the super-annoying wart that new users always run into where they forget a mandatory await and Python doesn't give them a proper error.
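To make the backcompat objection to the first alternative concrete, here is a small sketch. run_v1, run_v2 and simulate are hypothetical names; the point is only that adding an option to a pass-everything-through runner changes the meaning of existing calls.

```python
def run_v1(fn, *args, **kwargs):
    # Hypothetical "YOLO" version 1: every keyword argument is passed through.
    return fn(*args, **kwargs)


def run_v2(fn, *args, clock=None, **kwargs):
    # Hypothetical version 2 adds a `clock` option of its own. Any caller that
    # relied on `clock` reaching fn is now silently broken.
    return fn(*args, **kwargs)


def simulate(steps, clock="wall"):
    # User function that happens to have its own `clock` parameter.
    return f"{steps} steps on {clock} clock"


print(run_v1(simulate, 3, clock="fake"))  # 3 steps on fake clock
print(run_v2(simulate, 3, clock="fake"))  # 3 steps on wall clock
```

Nothing raises in the second call, which is what makes this kind of breakage hard to notice.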
If you've got a better idea definitely do let me know but... what we do now really is the best option available, AFAICT :-).
a function like run has to take the thing to run, and it also has to take its own arguments to configure how it runs it.
Here's another option I don't see listed. What about not supporting passing configuration options directly to run() at all? Instead, if non-default behavior is desired, the caller can instantiate, say, a TrioRunner object with those options and use runner.run(). It looks like you're almost doing that here already anyways, but with a Runner class that has a slightly different responsibility, and perhaps not exposed as part of the public API.
There's something to be said for a clean function signature that doesn't mix arguments for async_fn with arguments for run. It also provides more options for configuring run() and may age better, too, because you won't have to add more and more kwargs over time. (One of them is already called restrict_keyboard_interrupt_to_checkpoints :), which might be better as an attribute.)
What about not supporting passing configuration options directly to run() at all? Instead, if non-default behavior is desired, the caller can instantiate, say, a TrioRunner object with those options and use runner.run().
Interesting idea! And for trio.run alone, this would probably work. But, the same problem happens for:
trio.run
nursery.start_soon
nursery.start
trio.run_sync_in_worker_thread
trio.BlockingTrioPortal.run
trio.BlockingTrioPortal.run_sync
trio.hazmat.spawn_system_task
trio.hazmat.TrioToken.run_sync_soon
And then trio-asyncio has the same problem for its functions to call trio-from-asyncio and vice-versa, and I'm sure this will keep cropping up around the ecosystem.
Adding a config object interface for all of these seems really cumbersome? And I'd rather have just one solution to this problem that gets used everywhere, so you only have to learn it once, rather than two different ones so then you have to learn both solutions, and which solution is used where.
Okay, good to know. There's yet another hybrid pattern / approach you could use for all of these. Namely, rather than calling, say, runner.run() and putting the config options in the runner object, you could have a single options argument that houses all of the config options and optionally pass that to run().
More generally, using this pattern means that you would have at most one reserved keyword argument name per function. The argument name could potentially be the same across all of the functions, though the class could vary. When you want to add more options to some function, you could add to the appropriate class without needing to touch any of the function signatures. Maybe you could even do something clever with an optional positional argument, so you wouldn't need to choose and reserve a name.
Ah, yeah, that's similar to the run_args={} idea I mentioned above, except using a class instead of a dict to pass the options. That has the advantage that the class could use funcall syntax to specify the options (options=trio.RunOptions(a=1, b=2) vs. options={"a": 1, "b": 2}). You do have to define the class (though maybe just using a single Options class that just saves its kwargs would be enough for everyone?), and with one reserved kwarg you still have the possibility for collisions.
Maybe you could even do something clever with an optional positional argument, so you wouldn't need to choose and reserve a name.
That's true, you could do something like:
def run(*args, **kwargs):
    # Real code would need more fiddly error-checking but this should give the idea
    args = list(args)  # *args arrives as a tuple, which has no pop()
    options = None
    if isinstance(args[0], Options):
        options = args.pop(0)
    fn = args.pop(0)
    ...
...and then it'd be used like:
run(fn, foo, bar)
# or
run(Options(clock=whatever), fn, foo, bar)
This feels very surprising though in the Python context.
Again, these can work but... the only real downside to partial is that it forces people to learn one quirky convention, but everything else does that too.
(FWIW, I stole the partial convention from asyncio, so I think that means it also has the Guido Seal Of Approval ;-).)
Maybe the tutorial should just get a short discussion of partial + a link to this thread for those who are curious about the details.
Thanks for summarizing. You stated it well. Yeah, I'm familiar with the functools.partial approach required by asyncio. But it always feels unsatisfying and never natural. For example, each time you're faced with the decision of whether to pass the *args in the partial() call, or pass only the **kwargs and leave the *args for run(). The latter spreads the arguments across more than one invocation, which isn't ideal for code comprehension. And if the former should be used, then why not insist that partial() always be used? That way there'd be only "one way." (Actually, some parts of asyncio's API do require partial, like Future.add_done_callback().)
That has the advantage that the class could use funcall syntax to specify the options (options=trio.RunOptions(a=1, b=2) vs. options={"a": 1, "b": 2}).
It might be worth pointing out that you can also use funcall syntax for dicts, e.g. options=dict(a=1, b=2), though it wouldn't do argument name checking like a class would provide. I often do this for cases where the dict value will wind up as kwargs for some function.
Incidentally, if you'll indulge me, I can't help mentioning a vague, not-fully-formed idea I've had for some time, which is for Python to expose an object that could be used to encapsulate *args and **kwargs in a single object (e.g. pkargs = Args(*args, **kwargs)). It would be some marriage of a tuple and a dict and allow passing *args and **kwargs to a function using a single value (e.g. func(***pkargs) or a similar syntax). Do you know if something like this has ever been proposed?
For example, each time you're faced with the decision of whether to pass the *args in the partial() call, or pass only the **kwargs and leave the *args for run(). The latter spreads the arguments across more than one invocation, which isn't ideal for code comprehension.
I guess my convention is to always use either run(fn, arg1, arg2, run_arg=...) or run(partial(fn, arg1, arg2, kwarg=something), run_arg=...). I agree that run(partial(fn, kwarg=something), arg1, arg2) would be very weird style.
And if the former should be used, then why not insist that partial() always be used? That way there'd be only "one way."
Yeah, it really comes down to taste. On balance I think the advantages of being able to write run(fn, arg1, arg2) are enough to make it worthwhile, but if you'd rather write run(partial(fn, arg1, arg2)) then go for it I guess :-).
(Actually, some parts of asyncio's API do require partial, like Future.add_done_callback().)
¯\_(ツ)_/¯
It would be some marriage of a tuple and a dict and allow passing *args and **kwargs to a function using a single value (e.g. func(***pkargs) or a similar syntax). Do you know if something like this has ever been proposed?
Huh, no, I haven't seen that idea before.
Personally I'd like to see Python add syntax-level support for partial application, so that the equivalent of run(partial(fn, arg1, arg2)) is always what happens.
@buhman if you can convince python-dev of that then we can certainly revisit this :-).
Duplicate of #88, similar to python-attrs/attrs#22.
(FWIW I'm very strongly in favor of partial instead of adding **kwargs)
Aliasing partial (e.g., as P) can reduce the pain and visual clutter:
from functools import partial as P
trio.run(P(func, a, b, c=42))
I have suggested, half in jest, that trio should add trio.withargs or something as an alias for functools.partial, to make it feel less weird and scary.
It might actually be worth considering seriously.
There was a substantial discussion/brainstorming session about this in chat yesterday, starting around here: https://gitter.im/python-trio/general?at=5c19957cb8760c21bbed7ee9
I think there are 3 reasons there's ongoing tension here: (1) intuitively it seems like people want to pass through kwargs to their function a lot more than they want to use the relatively obscure configuration arguments like start_soon(..., name="myname") or run_sync_in_worker_thread(..., cancellable=True), so it feels like passing through kwargs should be the shorter/easier to read version, and it isn't. This is something we could check by looking at real code. (2) this creates pressure to prefer positional arguments in new APIs, which on the margin means sacrificing other considerations, like readability, (3) partial is jargony and obscure and requires an extra import, which are bad things in a common operation.
Some ideas from the thread:
nursery.start_soon(fn, args(posarg, kw=whatever)): the One Way to call things while passing args.
nursery.start_soon[options(name="thing")](fn, posarg, kw=whatever). Seems weird and syntax is similar to how PEP 484 does generic constructors (e.g. l = List[int]()).
nursery.start_soon.add_options(name="thing")(fn, posarg, kw=whatever)
nursery.start_soon(fn, posarg, kw=whatever, start_soon_name="thing") (or trio_name or something, maybe, though then we'd lose the ability to error out if someone passes an unrecognized kwarg from the future)
nursery.with_options(name="thing").start_soon(fn, posarg, kw=whatever)
nursery.start_soon and nursery.start_soon_configured where the former only accepts arguments to fn, and the latter looks like what we have now.
Occasionally there are rumblings about adding some kind of macro/quoting syntax to Python. If that happens I guess we could have nursery.start_soon!(fn(posarg, kw=whatever), name="thing"), which would look lovely. But it would require stack introspection on every call, which might be too expensive for this.
None of these strike me as obviously superior to what we're doing now, but there are some new ideas I wanted to capture.
I kinda like nursery.with_options(...).start_soon, actually.
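For anyone who wants to play with the with_options idea, here is a rough sketch. The Nursery class below is a hypothetical stand-in that only records what would be scheduled; it is not trio's nursery, and the shared-list design is just one way such a configured view could work.

```python
from functools import partial


class Nursery:
    """Hypothetical stand-in for a nursery; it only records what was scheduled."""

    def __init__(self, options=None, scheduled=None):
        self._options = dict(options or {})
        # Configured views share one list, so they schedule into the same nursery.
        self.scheduled = [] if scheduled is None else scheduled

    def with_options(self, **options):
        # Return a configured view instead of mutating this nursery.
        return Nursery({**self._options, **options}, self.scheduled)

    def start_soon(self, fn, *args, **kwargs):
        # Every argument after fn belongs to fn; options come from with_options().
        name = self._options.get("name", fn.__name__)
        self.scheduled.append((name, partial(fn, *args, **kwargs)))


def worker(x, name="default"):
    # Takes its own `name` kwarg to show that it cannot collide with the option.
    return f"{name}:{x}"


nursery = Nursery()
nursery.start_soon(worker, 2)
nursery.with_options(name="thing").start_soon(worker, 3, name="kw")

print([task_name for task_name, _ in nursery.scheduled])  # ['worker', 'thing']
print(nursery.scheduled[1][1]())                          # kw:3
```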
More brainstorming in chat: https://gitter.im/python-trio/general?at=5c4a4449f780a1521f638f59
Put my 2¢ in @ https://gist.github.com/njsmith/ad4fc82578239646ccdf986ae3ca07c1
Curious if anyone else thinks that's a good api?
s/config/options/with_options/add_options
depending on taste.
Another idea, see https://gitter.im/python-trio/general?at=5c4b3e3e20b78635b67fa78e
Basically this would allow you to call .start_soon(fn, *args, **kwargs)
or .start_soon(**start_soon_options)(fn, *args, **kwargs)
, which is syntactically as minimal as you can get
It's certainly possible to switch behaviour on whether or not fn
is passed in but that's probably too much black-magic for my tastes.
The verbosity of options/config/with_options does grate a little but it has the benefit of being pretty explicit and obvious.
Another api I was considering was forcing users to use a method to call the func, thereby leaving the __call__
free to configure the options - e.g.
from functools import partial

class takes_callable(object):
    def __init__(self, wrapped_fn, **options):
        self.wrapped_fn = wrapped_fn
        self.options = options

    def call(self, fn, *args, **kwargs):
        return self.wrapped_fn(
            partial(fn, *args, **kwargs),
            **self.options
        )

    def __call__(self, **options):
        return takes_callable(self.wrapped_fn, **options)

In [41]: @takes_callable
    ...: def run(fn, **options):
    ...:     print(f"options = {options!r}")
    ...:     return fn()

In [42]: def func(*args, **kwargs):
    ...:     print(args)
    ...:     print(kwargs)

In [43]: run.call(func, 1, 2, q=3)
options = {}
(1, 2)
{'q': 3}

In [44]: run(clock=None).call(func, 1, 2, q=3)
options = {'clock': None}
(1, 2)
{'q': 3}
I think this might be a cleaner api but also less obvious/discoverable
IMO the run*()/start*()/spawn*() methods all need to transform a function call (function + parameters) into a task that will be run/started/spawned sooner or later with possibly some additional options on how to do that.
Splitting the spec of the function call over multiple parameters of the run*()/start*()/spawn*() methods is not a good idea, as this is something that belongs tightly together. That is what partial(fn, *args, **kw) does: it stores the parameters in a single object, though the name is suboptimal for trio's purpose.
As trio does not necessarily need the __call__ functionality it could just use partial() for storing the info and take fn, *args and **kw from that object when creating the underlying Task() so that partial() is not on the call stack.
trio could import partial under a different name (or declare a class that just stores its arguments in its __init__(); unfortunately Task() is already taken and FuncCall() is a bit long, perhaps FC?).
E.g. start_soon(async_fc, name=None) with async_fc = FC(async_fn, *args, **kw) would allow the run/start/spawn* methods to be extended with other positional or keyword parameters.
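A quick sketch of what that could look like. FC, start_soon and greet here are hypothetical and synchronous; a real version would create a task rather than call the function directly.

```python
class FC:
    """Hypothetical 'function call' record: stores its arguments, nothing more."""

    def __init__(self, fn, *args, **kw):
        self.fn = fn
        self.args = args
        self.kw = kw


def start_soon(fc, name=None):
    # Hypothetical runner: everything except `fc` is an option for the runner,
    # so new options can be added later without any risk of collision.
    # It calls fc.fn directly, so no wrapper frame ends up on the call stack.
    task_name = name if name is not None else fc.fn.__name__
    return task_name, fc.fn(*fc.args, **fc.kw)


def greet(greeting, name="world"):
    return f"{greeting}, {name}"


print(start_soon(FC(greet, "hello", name="trio")))  # ('greet', 'hello, trio')
print(start_soon(FC(greet, "hi"), name="custom"))   # ('custom', 'hi, world')
```

Note that both greet and start_soon take a name keyword here without interfering with each other, because the two sets of arguments never share a call.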
Doing some paid work tonight, I'm being annoyed again by having to use functools.partial
every time I call run_sync_in_worker_thread
.
I think at this point we can be fairly confident we've exhausted all the different combinations of .with_options
and variants, and they're all pretty awful.
I'm thinking about revisiting one of the ideas we rejected early on: of using underscores to mark configuration-kwargs, versus no-underscore to mark passthrough-kwargs. So example usage would be:
trio.run(main, _clock=MyClock)
nursery.start_soon(async_fn, arg1, arg2, kwarg=foo, otherkwarg=bar, _name="something")
await trio.run_sync_in_worker_thread(container.logs, stdout=True, stderr=False, _limiter=my_limiter)
await serve_tcp(handler, _port=80)
await serve_ssl_over_tcp(handler, _port=443, _ssl_context=my_context)
One downside is that it's weird-looking. But I guess we'd get used to it? I think serve_tcp
and serve_ssl_over_tcp
are the only two functions where we want to pass both an arbitrary callable + mandatory kwargs (see #563), so they're the most affected. It does seem easier to explain and to use than all the .with_options(...)
ideas proposed above. I'm not super-excited about the looks, but I can imagine we might decide that using partial
all the time is annoying enough to overcome this downside.
The more technical potential downside is lack of compositionality/universality: what if we want to pass through a kwarg that starts with an underscore? For example, nursery.start_soon(trio.run_sync_in_worker_thread, sync_fn, _cancellable=True, _name=sync_fn)
– notice that _cancellable
is supposed to go to run_sync_in_worker_thread
, while _name
is supposed to go to start_soon
. I think our full set of options are:
Pass through unrecognized kwargs, even if they start with underscore: this is definitely not workable, because it would mean that if we ever added any new kwargs to these functions it would be a breaking change.
Use some sort of quoting rule: for example, any kwarg with double-underscores gets one underscore stripped off, and then is passed through. So our example becomes:
nursery.start_soon(
trio.run_sync_in_worker_thread, sync_fn, __cancellable=True, _name=sync_fn,
)
Or, slightly more complicated but maybe nicer to look at, we could look at the kwargs, and of them, all the ones with the maximum number of underscores get taken by the top-level fn, and the rest get passed through. So our example would become:
nursery.start_soon(
trio.run_sync_in_worker_thread, sync_fn, _cancellable=True, __name=sync_fn,
)
Document that if they really want to pass-through underscored kwargs, they should use partial
:
nursery.start_soon(
    partial(trio.run_sync_in_worker_thread, sync_fn, _cancellable=True),
    _name=sync_fn,
)
Given how rarely this situation arises, handling it with partial
doesn't seem like a huge deal. Also, if we started with this option, then we'd have the option to later extend it to one of the quoting options above, without breaking backcompat. So if we're going to do this at all, then this seems like the variant to start with.
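For reference, the first quoting rule above (a double-underscore kwarg loses one underscore and is passed through) could be sketched roughly like this. split_kwargs is a hypothetical helper name, and this is only one possible reading of the rule.

```python
def split_kwargs(kwargs):
    """Split kwargs under a hypothetical quoting rule.

    Returns (own_options, passthrough): single-underscore names are options for
    the outer function; double-underscore names lose one underscore and are
    passed through; everything else is passed through unchanged.
    """
    own, passthrough = {}, {}
    for key, value in kwargs.items():
        if key.startswith("__"):
            passthrough[key[1:]] = value
        elif key.startswith("_"):
            own[key] = value
        else:
            passthrough[key] = value
    return own, passthrough


own, passthrough = split_kwargs({"__cancellable": True, "_name": "sync_fn", "x": 1})
print(own)          # {'_name': 'sync_fn'}
print(passthrough)  # {'_cancellable': True, 'x': 1}
```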
The implementation would be extremely simple. E.g.:
async def run(async_fn, *args, _clock=None, _instruments=(), **kwargs):
    check_no_underscores(kwargs)  # raises an error if any keys start with _
    ...
No magic decorators, so it's totally understandable by naive readers, sphinx, mypy, and pylint. The one wrinkle is that mypy/pylint wouldn't know to flag trio.run(..., _clokc=...)
as an error, but that doesn't seem like a huge issue. (And if we really want to then we could fix it with plugins. It seems like we'll have mypy/pylint plugins in any case, to handle things like trio.open_file
and missing-await detection – see #671 – so adding this feature wouldn't be a huge extra cost.)
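A runnable sketch of that check, written as a synchronous stand-in so it needs nothing beyond the standard library. The error message and the use of TypeError are assumptions; only the name check_no_underscores comes from the thread.

```python
def check_no_underscores(kwargs):
    # Reject passthrough kwargs that start with "_", since that prefix is
    # reserved for the runner's own options (assumed behavior).
    bad = sorted(key for key in kwargs if key.startswith("_"))
    if bad:
        raise TypeError(f"unrecognized option(s): {', '.join(bad)}")


def run(fn, *args, _clock=None, _instruments=(), **kwargs):
    # Synchronous stand-in for the async signature discussed in the thread.
    check_no_underscores(kwargs)
    return fn(*args, **kwargs)


def main(x, y=0):
    return x + y


print(run(main, 1, y=2, _clock="fake clock"))  # 3

try:
    run(main, 1, _clokc="typo")
except TypeError as exc:
    print(exc)  # unrecognized option(s): _clokc
```

So a misspelled option still fails at runtime, even though a static checker would not flag it.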
Oh, wait, there's a second kind of compositionality problem. There are places where trio generates a kwarg: in particular, nursery.start
automagically passing task_status=...
. And I guess serve
functions might end up passing through the stream as a kwarg too (#563), like handler(peer=stream_obj)
. I think these are the only two places where this comes up.
But, it's not a theoretical problem: the serve
functions both take both a callable and also support task_status
. So if we say that all their kwargs have to start with _
, AND ALSO say that they have to take a kwarg called task_status
, then that's a problem!
What options do we have available?
The serve
functions could go ahead and take task_status
, even though it's not underscored. It would be a bit awkward to explain why we were breaking our own rule, but not necessarily a big deal? task_status
is already effectively a reserved kwarg name in trio; it doesn't really make sense to have an incoming connection handler that takes task_status
.
We could switch the start
protocol to use _task_status
as the name of its magic kwarg. This would be an awkward transition b/c there's no good way to deprecate the old way. (I guess our options would be (a) rename start
, (b) do a two-round deprecation where we force everyone to add a new_style=True
switch and then force them to remove it again.) But eh, given that it's effectively a magic reserved name anyway, the underscore wouldn't look that weird. This would help the serve
functions. Are there any cases that go the other way, where you want to pass a function-taking-a-function to start
, and have task_status
passed through? Of course, we don't support that now either...
Are we worried about similar issues for the peer=
/client=
/whatever-we-call it arg from #563? I'm having trouble thinking of any situation where you'd want to pass a function-that-takes-a-callable to serve
. Maybe run_sync_in_worker_thread
, but then you'd want the peer=
to be passed through. Probably someone can come up with something, though?
I don't feel like I've fully wrapped my head around the issues here.
Meh. I still think that nursery.start_soon.mod(name="foo")(fn, *args, **kwargs)
would work best ("mod" being the shortest name I could think of that still means something reasonable) given that nursery.start_soon(name="foo")(fn, *args, **kwargs)
seems to be too magical for some people's taste. Modifiers on fn
can be achieved very easily, by using fn.mod(bar='baz')
.
I don't like magic underscores; "internal use only" is not the same as "strip the under and feed me to the next part". I'd hate to be required to again resort to functools.partial
except now in a different set of circumstances.
I'd leave task_status
alone. Its default value wants to be renamed, but that's a different issue.
IMHO having _
or __
prefixed kwargs is a horrible kludge and think that using partial is preferable to that - i.e. the cure is worse than the disease!
I personally like @smurfix's solution the best (maybe s/mod/config/
). The verbosity doesn't really bother me and, to me, is greatly preferable to _
prefix kwargs being implicitly passed as config.
The discussion around PEP 570 made me realize an interesting corner case... even if we have a signature like:
async def run(async_fn, *args, **kwargs):
...then technically it's not quite true that you can pass through any kwargs. Specifically, you cannot pass through a kwarg named async_fn :-)
This may not really matter in practice (in the unlikely case that someone hits it they can use partial), but I found it surprising.
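A small demonstration of the corner case, plus the positional-only marker from PEP 570 (Python 3.8+) that avoids it. The functions here are synchronous, hypothetical stand-ins.

```python
def run(async_fn, *args, **kwargs):
    # Hypothetical runner whose first parameter can also be passed by keyword.
    return async_fn(*args, **kwargs)


def run_posonly(async_fn, /, *args, **kwargs):
    # Same runner with a positional-only first parameter (PEP 570, Python 3.8+).
    return async_fn(*args, **kwargs)


def describe(**kwargs):
    return sorted(kwargs)


try:
    run(describe, async_fn="collides")
except TypeError as exc:
    print(exc)  # complains about multiple values for argument 'async_fn'

print(run_posonly(describe, async_fn="passed through"))  # ['async_fn']
```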
Here is a perspective that I don't rigidly or fully stand by but which is important and I haven't seen said here:
The learning curve for partial
could be massively reduced if:
every single function like trio.run()
just stopped taking *args
to pass to the async function as well,
the documentation and examples as a result showed partial
being used all over the place,
(if needed) the documentation linked to a small and really intuitive summary of what partial
is and why it is good.
Every time you force people to use partial
you make the world a little better, because:
partial
is amazingly versatile and uses for it tend to turn up in many unexpected places,
developers who "think with" partial
are thus more versatile developers,
passing arguments along with a callable is a distinct code feature, extra functionality, and partial
properly pulls and abstracts just that feature into its own composable form, making it unnecessary for people to keep reimplementing it,
partial
makes it easier to
with less mental overhead, without having to jump around to other parts of the code or documentation to verify what happens to the arguments being passed through.
Personally, both in my own code and at work,
I intentionally avoid, change, and push back against any interface that accepts arguments along with callables anywhere that partial
will do, and
I push for partial
over functools.partial
, even as I advocate for everything else to be properly namespaced, because the latter is a pain to write and because partial application is a fundamental operation, like +
.
And I have been happier ever since, enjoying in particular a greater smoothness of thought due to no longer having to do a depth-first search of the possible future effects and uses of this kind of argument passthrough.
I do get frustrated having to write calls to partial
sometimes, but then I remember that code is read more than it is written, and that my insistence on partial
-only argument passthrough has significant return on investment in people being able to
with less overhead. And that makes the upfront nuisance of writing partial
not feel so bad.
TL;DR: forcing people to always use partial
is good in some significant, permeating ways that are really worth weighing here.
I thought of yet another possibility recently: Say that if you want to pass Trio options, you need to pass the function to call through a kwarg too (target=
?), and use partial
if it takes arguments. If you don't want to pass Trio options, you can pass the function and its positional and keyword arguments without a layer of partial
. We distinguish the two cases based on whether we got any positional arguments.
So you'd have nursery.start_soon(fn, arg1, kwarg=value)
but nursery.start_soon(target=partial(fn, arg1, kwarg=value), name="hi")
.
This can be represented nicely in type stubs using @overload
for the benefit of static checkers, and is at least reasonably amenable to a clean deprecation.
Downside: it doesn't play very nicely with #563. :-(
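Roughly, the dispatch could look like the sketch below. This is a synchronous, hypothetical start_soon that returns a (task name, result) pair instead of scheduling anything; the error messages are made up.

```python
from functools import partial


def start_soon(*args, **kwargs):
    if args:
        # Positional mode: every keyword argument belongs to fn.
        fn, *fn_args = args
        return fn.__name__, fn(*fn_args, **kwargs)
    # Keyword-only mode: kwargs are runner options, and the callable is target=.
    options = dict(kwargs)
    try:
        target = options.pop("target")
    except KeyError:
        raise TypeError("start_soon() needs a function or target=") from None
    name = options.pop("name", None)
    if options:
        raise TypeError(f"unrecognized options: {sorted(options)}")
    return name, target()


def fn(arg1, kwarg=None, name="fn's own name"):
    return arg1, kwarg, name


print(start_soon(fn, 1, kwarg=2, name="goes to fn"))
print(start_soon(target=partial(fn, 1, kwarg=2), name="hi"))
```

In the first call name reaches fn; in the second it is treated as an option, because there are no positional arguments at all.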
@oremanj That's a pretty neat idea actually, although based on my experience my immediate intuition is that this feels rather likely to trip people up, especially because it's harder to implement it in combination with #563 (speaking of, I wrote some thoughts on that issue that are relevant here, thanks for pointing that issue out).
Like I'm picturing a situation where someone has nursery.start_soon(fn, arg, kwarg=kwarg) and decides "don't need to pass arg anymore" and deletes it and moves on - because that's the cognition flow that normally works when we no longer need to pass an argument. So to get it right everyone would have to train an additional thought habit: "oh wait does this function completely change how it interprets any of its other arguments based on the presence of a different argument?"
But developing and diligently using such mental habits is costly, and maintaining that mental habit as a general case would be an uphill battle because almost all of the time the answer would be "no" and it would be time wasted, so the mind would naturally try to unlearn it. And if you had to special case it to certain functions that would be additional mental cost, and would require more proactive, out-of-the-way mental rehearsals of the memory/"cognition snag" that would trigger that mental check in response to it being one of the functions that it applies to.
And that's the kind of situation shape that in my experience leads people to make code changes that they think are so obviously correct that they don't notice were actually wrong until some time/confusion/debugging/deployments later.
Type checkers that can handle overloads would catch it, but in many situations people either don't use them, or only use them at the end of a lot of different modifications, or frequently but still manually, and in any case it would be a jarring "wait... what? why?" experience.
So after further thought I would advise against, but it was good creative thinking.
Looking at this again after a long absence (partly motivated by https://github.com/python-trio/trio/issues/1104#issuecomment-630000695), I'm wondering why we never considered borrowing Python's dunder convention:
trio.run(f, x=1, __clock__=custom_clock)
nursery.start_soon(f, x=1, __name__="custom name")
await trio.to_thread.run_sync(f, x=1, __limiter__=custom_limiter)
A key thing here is that every one of these kwargs is an exotic feature that beginners will never see. AFAIK there are only two exceptions to that, which are the port
argument to serve_tcp
and the port
and ssl_context
arguments to serve_ssl_over_tcp
(and tbh the latter is already a semi-advanced topic, because configuring SSLContext
s is not for the faint of heart). OTOH serve_tcp
does show up in your classic 10-line echo server that you put on the top of your README.
But we can solve that by making it:
await trio.serve_tcp(port, handler, *args, **kwargs)
(i.e., flipping the position of port
and handler
compared to what we have today).
The one obnoxious bit is backcompat, but I guess this is actually not too bad, because port
is always an integer and handler
is never an integer, so you can use a runtime type check to detect when someone has the arguments backwards, issue a warning, and flip them the right way around.
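The runtime check could be as small as the sketch below. This serve_tcp is a hypothetical stand-in that only returns what it would serve (no sockets are opened), and the warning text and category are assumptions.

```python
import warnings


def serve_tcp(port, handler, *args, **kwargs):
    """Hypothetical stand-in for the reordered signature."""
    if callable(port) and isinstance(handler, int):
        # Old-style call: serve_tcp(handler, port). Warn and swap.
        warnings.warn(
            "serve_tcp(handler, port) is deprecated; use serve_tcp(port, handler)",
            DeprecationWarning,
            stacklevel=2,
        )
        port, handler = handler, port
    return port, handler, args, kwargs


def echo_handler(stream, greeting="hi"):
    return greeting


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_style = serve_tcp(8000, echo_handler, greeting="hello")
    old_style = serve_tcp(echo_handler, 8000)

print(new_style[0], old_style[0])  # 8000 8000
print(len(caught))                 # 1
```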
Ooh, I like using dunders a lot better than just leading underscores.
Friendly amendment: can we use _sunder_
names rather than __dunder__
names? There's precedent (enum
in the standard library), it's a little less noisy/magical-seeming, and it's still very unlikely to clash with a normal function kwarg.
The reordered serve_*
arguments actually read better than the original to me: "serve_tcp
on port using handler". We could similarly do "serve_ssl_
with context over_tcp
on port using handler", and "serve_listeners
these using handler".
@mentalisttraceur
Like I'm picturing a situation where someone has nursery.start_soon(fn, arg, kwarg=kwarg) and decides "don't need to pass arg anymore" and deletes it and moves on
This would still fall into the "pass all arguments through to fn" case though, because start_soon
still has one positional argument: the function itself! I'm proposing that if you want to specify options for start_soon()
, you have to wrap the function and its arguments up in a single callable (probably a partial()) and pass that via a kwarg (target=
) too, so that the start_soon() call has no positional arguments at all.
I agree that the version that doesn't let you easily call "fn with only kwargs" would be super confusing and non-orthogonal.
@njsmith I'm curious for your thoughts on this approach. I still think I prefer it a bit versus using _sunder_
names, but I could go for either.
Either __dunder__
or _sunder_
kwargs seems IMHO like a very ugly solution :(
I would personally prefer the status-quo over either. Using partial
isn't that great of a burden - I think this is a case of the cure being worse than the cold.
A key thing here is that every one of these kwargs is an exotic feature that beginners will never see.
I think that this comment points to the fact that it isn't a great solution - i.e. it's really ugly, but no one will have to see it anyway. If it's not a great solution for your average user, I don't think we should be foisting it on even expert users - if we can get away without doing so.
A further red-flag for this design is the fact you pointed out that it will actually be exposed in one of the most common usages - specifying a port in serve_tcp
. Whilst you point out that this could be avoided by changing the order of the arguments from what would normally be expected, I think at this point we should really consider whether it's actually providing any benefits:
trio.run(partial(f, x=1), clock=custom_clock)
trio.run(f, x=1, __clock__=custom_clock)
Out of the two APIs above I vastly prefer the former (current) - but maybe that's just me?
If we're bound and determined to get away from using partial
I'll put in one more plug for @smurfix's solution above (https://github.com/python-trio/trio/issues/470#issuecomment-473648115) of using a custom .mod
(or .cfg
) method:
trio.run(f, x=1)
trio.run.cfg(clock=custom_clock)(f, x=1)
nursery.start_soon(f, x=1)
nursery.start_soon.cfg(name="custom name")(f, x=1)
await trio.to_thread.run_sync(f, x=1)
await trio.to_thread.run_sync.cfg(limiter=custom_limiter)(f, x=1)
await trio.serve_tcp(handler, *args, **kwargs)
await trio.serve_tcp.cfg(port=port)(handler, *args, **kwargs)
@oremanj In general I'm not a big fan of APIs where there are two disjoint modes you have to learn, and then learn which one to use when. Try writing out the docs to explain how to use your system – can you make it shorter/clearer than this?
trio.run(async_fn, *args, **kwargs, __clock__=None, ...)
Runs await async_fn(*args, **kwargs).
Arguments used to control trio.run itself have __double_underscores__ around them, to avoid colliding with arguments passed to async_fn.
Args:
__clock__ (trio.abc.Clock or None): This argument is used to...
We'd need a bit more text somewhere to explain how to define your own args+kwargs functions, and mentioning partial
as a workaround for the unlikely case where you actually need to pass a __dunder__
argument to the inner function, but that could be buried in the reference docs; these are issues that 99% of users will never need to think about.
Re: _sunders_
versus __dunders__
: I'm not an immediate fan of sunders, mostly because I'd never heard of them until your post :-). Looking around, it looks like a semi-obscure thing that only enum
uses? or maybe there are some obscure IPython-related APIs that use it too? Googling python dunder
gets me lots of detailed posts, while python sunder
gets me stuff like this and this :-).
As far as visual noise goes, I don't think the difference between sunders and dunders is very significant. Once you have underscores at all, you've committed to having a big ol' visual flag that there's something magical going on here; the exact size of the flag is a pretty minor detail. And the nice thing about dunders is that Python programmers are already trained to think of them as a flag meaning "this thing has special-case magic semantics". OTOH if I saw a sunder I'd just be like "that's... weird-looking, is that like a dunder? is it a typo? what is that?".
When you see def __add__(self, other): ...
, do you think "ugh, that's so much noise, I wish there was only one underscore", or do you think "okay, yeah, this class is doing something exotic, let's see how it works".
@dhirschfeld
I'm not sure that partial
is actually more comprehensible to the median new Trio user as compared to __dunder__
s. And the problem with partial
is that you need to pass kwargs constantly, while you hardly ever use a custom clock. So for 99% of users, the relevant comparison is
trio.run(partial(f, x=1))
trio.run(f, x=1)
Also, re:
await trio.serve_tcp(handler, *args, **kwargs)
await trio.serve_tcp.cfg(port=port)(handler, *args, **kwargs)
port
is a mandatory argument, so that first line doesn't make sense. And I'm not putting await trio.serve_tcp.cfg(12345)(handler)
at the top of our README :-). So even if we did go with .cfg
, then we'd still want to make it serve_tcp(port, handler, ...)
.
I was assuming port
had a default. If you were going to specify it in a cfg
method I'd be inclined to require it to be a kwarg so that it's self documenting:
await trio.serve_tcp.cfg(port=12345)(handler)
...apparently my tastes tend towards the more verbose 😄
Try writing out the docs to explain how to use your system – can you make it shorter/clearer than this?
This is a great exercise, thank you for the suggestion! I realized that target
-as-kwarg is indeed hard to communicate, but I came across another approach that I think can be communicated quite nicely:
trio.run(async_fn, *args, **kwargs)
trio.run(trio.call(async_fn, *args, **kwargs), clock=None, ...)
Runs await async_fn(*args, **kwargs).
The second form allows you to customize the behavior of trio.run() by passing any of the keyword arguments described below. (The trio.call() wrapper lets Trio distinguish the arguments for async_fn from the options for trio.run().) The first form is shorthand for when the default behavior of trio.run() is fine, which it usually is.
Args:
clock (trio.abc.Clock or None): This argument is used to...
trio.call
is a thin subclass of functools.partial
.
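To make the two call forms concrete, here is a toy sketch of the dispatch rule. `call` and `run` below are stand-ins written for this illustration (synchronous, no event loop) and are not Trio's API; the only thing shown is how an `isinstance` check on a `functools.partial` subclass can separate the target's arguments from the runner's options.

```python
import functools


class call(functools.partial):
    """Stand-in for the proposed trio.call: a partial that marks a bundled call."""


def run(fn, *args, **kwargs):
    """Toy synchronous stand-in for trio.run, showing only the dispatch rule."""
    if isinstance(fn, call):
        # Second form: fn already carries its arguments, so kwargs are options.
        if args:
            raise TypeError("positional arguments belong inside call(...)")
        return fn(), kwargs
    # First form: everything is passed through to fn; no options are set.
    return fn(*args, **kwargs), {}


def greet(name, clock="wall"):
    return f"{name}@{clock}"


print(run(greet, "a", clock="mine"))         # clock goes to greet()
print(run(call(greet, "a"), clock="mock"))   # clock is an option for run()
```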
Why a new name and not just use partial()?
If partial() were sufficiently discoverable and obvious, our users would be using it and we wouldn't need this issue. It's already linked in the Trio documentation.
trio.call in the synopsis of each function that uses this approach can be a hyperlink.
Why a subclass and not just an alias?
async_fn in a decorator that returns a partial object.
trio.call should be invocable with just the listed arguments, while a functools.partial might need more arguments to be specified. (task_status confounds this a little bit, but dealing with that one special case is much nicer than dealing with arbitrary possible additional arguments.)
__del__ that prints a warning if the object wasn't passed to an appropriate Trio function, so people don't get too confused thinking trio.call is a verb rather than a noun.
Why a subclass and not a wrapper?
partial() is implemented in C on recent CPythons. Subclassing imposes less of a performance penalty for common operations.
partial() is already relatively subclassing-friendly; for example, its __repr__ uses its actual type name rather than the string "partial".
Can someone remind me of where the wisdom for not taking already initialized coroutine objects is documented?
Because every time this discussion comes up I can't help but notice that if trio did take pre-created coroutine objects, all of this would be a non-issue because they just bundle in their own arguments:
trio.run(async_fn(foo, bar, qux=0), ...)
Trio strives to be usable by folks who don't know anything about async/await besides "async functions are like regular functions, except you need await to call them", so we treat async and non-async functions the same.
This makes it easier for non-experts to understand the code, and also for static analysis tools to catch common mistakes like missing await.
Also, some of the APIs we're talking about actually take non-async functions, like trio.to_thread.run_sync, so we can't just rely on coroutine objects, even if we wanted to :-)
On Tue, May 19, 2020, 16:05 mentalisttraceur notifications@github.com wrote:
Can someone remind me of where the wisdom for not taking already initialized coroutine objects is documented?
Because every time this discussion comes up I can't help but notice that if trio did take pre-created coroutine objects, all of this would be a non-issue because they just bundle in their own arguments:
trio.run(async_fn(foo, bar, qux=0), ...)
Restating a concept I mentioned in #563 that feels very relevant here:
Imagine trio
provides the following decorators:
from functools import partial

def pass_args(wrapped):
    def wrapper(function, *args, **kwargs):
        return wrapped(partial(function, *args), **kwargs)
    return wrapper

def pass_kwargs(wrapped):
    def wrapper(function, *args, **kwargs):
        return wrapped(partial(function, **kwargs), *args)
    return wrapper

def pass_args_kwargs(wrapped):
    def wrapper(function, *args, **kwargs):
        return wrapped(partial(function, *args, **kwargs))
    return wrapper
Then the use-case in this issue is solved with one line:
run = trio.pass_kwargs(trio.run)
Boom. User wants kwarg passthrough? One line to "enable" it, and now they have it.
P.S. Re: @njsmith replying to my earlier comment: Okay, I see, it keeps a symmetry between sync and async functions - that makes sense. Hopefully I'll remember that this time. Thanks!
Ohh, right, @njsmith even summarizes the problems with taking coroutine objects up at the top here - and that includes not only what he just said in reply to me but also support for things like supervisors, which want to be able to re-run the function they were given. Okay, yeah, it's all coming back now.
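The re-run point can be illustrated with a small sketch. It uses `asyncio` from the standard library purely as a stand-in event loop (an assumption made so the example is self-contained; the same language-level behaviour applies under Trio), and `supervise` is a hypothetical toy, not a real supervisor API.

```python
import asyncio
from functools import partial


async def fetch(x, scale=1):
    return x * scale


async def supervise(make_coro, attempts=2):
    """Toy supervisor: calls the callable afresh on every attempt."""
    return [await make_coro() for _ in range(attempts)]


async def main():
    # A callable plus its arguments can be invoked as often as needed.
    results = await supervise(partial(fetch, 2, scale=3))

    # A coroutine object is single-use: awaiting it again raises RuntimeError.
    coro = fetch(2, scale=3)
    await coro
    try:
        await coro
    except RuntimeError as exc:
        reuse_error = str(exc)
    return results, reuse_error


results, reuse_error = asyncio.run(main())
print(results)
print(reuse_error)
```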
Another idea:
Redefine trio.run
to take no options - pure passthrough of both *args
and **kwargs
.
Define a trio.run_with_options
, which does no argument passthrough - just takes the callable and options for itself.
I think this covers the common case in a very user-friendly way, and the advanced/unusual cases in a simple way. Both very regular: no special cases.
This approach can also be done with nursery.start_soon
, and maybe other Trio APIs.
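A rough sketch of that split, under stated assumptions: `asyncio.run` stands in for the real event loop, `clock` is the only option modelled, and it is merely reported rather than installed. The names `run` and `run_with_options` come from the comment above; the bodies are illustrative only.

```python
import asyncio
from functools import partial


def run_with_options(async_fn, /, *, clock=None):
    """No pass-through: takes a zero-argument async callable plus options."""
    # A real implementation would install `clock`; this toy just reports it.
    print(f"running with clock={clock!r}")
    return asyncio.run(async_fn())


def run(async_fn, /, *args, **kwargs):
    """Pure pass-through: every argument goes to async_fn, no options."""
    return run_with_options(partial(async_fn, *args, **kwargs))


async def add(a, b, clock=0):
    return a + b + clock


# Here clock=10 is an argument of add(), not an option of the runner.
print(run(add, 1, 2, clock=10))

# Here clock is an option of the runner; add() keeps its default.
print(run_with_options(partial(add, 1, 2), clock="mock-clock"))
```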
Oh, a relevant reply to this old comment in this issue by @njsmith :
There is a workaround for making sure truly any keyword argument is passed through:
def run(*args, **kwargs):
    async_fn, args = args[0], args[1:]
    ...
Not ideal for any documentation/introspection that relies on the signature, but it does solve the issue robustly on older Python versions that don't have the positional-only arguments feature.
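For illustration, here is a synchronous toy version of that workaround next to a naive signature. Both `run` and `naive_run` are stand-ins written for this sketch; the point is only that a named first parameter can collide with a pass-through keyword argument, while unpacking from `*args` cannot.

```python
def run(*args, **kwargs):
    # Pull the callable out of *args so no parameter name can collide with kwargs.
    if not args:
        raise TypeError("run() missing required argument: the function to run")
    fn, args = args[0], args[1:]
    return fn(*args, **kwargs)


def naive_run(fn, *args, **kwargs):
    # The parameter name `fn` is reserved here, so fn=... cannot be passed through.
    return fn(*args, **kwargs)


def show(**kwargs):
    return sorted(kwargs)


print(run(show, fn=1, args=2))

try:
    naive_run(show, fn=1)
except TypeError as exc:
    print("naive_run failed:", exc)
```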
@oremanj
This is a great exercise, thank you for the suggestion!
Huh, is that really the first time I mentioned it? It's one of the major heuristics I have running in the back of my head all the time :-).
trio.run(async_fn, *args, **kwargs) trio.run(trio.call(async_fn, *args, **kwargs), clock=None, ...)
Hmm. I'll think about it more, but some initial impressions:
It is nice that it keeps the common case of no-arguments nice and clean.
Getting docs to look like that will require a custom sphinx extension. Writing and maintaining sphinx extensions is not my favorite activity :-).
It definitely feels weird to say that in these two calls, the clock=
argument is completely different and unrelated, even though it's being passed to the same function:
trio.run(f, clock=1)
trio.run(trio.call(f), clock=MockClock())
In fact, the type of the clock=
argument depends on the type of the first argument. Doesn't this make it harder to do mypy-style typechecking (and maybe other kinds of static analysis too)? Maybe you can do something with @overload
? Having a single __clock__
kwarg that always means the same thing and always has the same type seems like it would avoid some complexity here.
Getting docs to look like that will require a custom sphinx extension. Writing and maintaining sphinx extensions is not my favorite activity :-).
I agree that figuring out how to represent this nicely in the Sphinx output is an important precondition for having it be the approach we take. I think I can figure it out, and am offering to do that work if we think it's the best way forward in other respects.
In fact, the type of the clock= argument depends on the type of the first argument. Doesn't this make it harder to mypy-style typechecking (and maybe other kinds of static analysis too)?
This can be represented to mypy just fine with an @overload
(once mypy understands "this takes args and kwargs that match its argument" at all, but that's required for any of the proposed options -- it's PEP 612, and work appears to be underway on supporting it in mypy).
It definitely feels weird to say that in these two calls, the clock= argument is completely different and unrelated, even though it's being passed to the same function:
I think we should raise a warning in the trio.run(f, clock=1)
case, because it is confusing and you can always work around it using trio.call()
. People will need to avoid these sorts of constructs for a few releases no matter what approach we choose, because trio.run(f, clock=1)
is currently valid and sets the clock for trio.run() rather than the argument for f().
I think we should raise a warning in the
trio.run(f, clock=1)
case, because it is confusing and you can always work around it using trio.call()
. People will need to avoid these sorts of constructs for a few releases no matter what approach we choose, because trio.run(f, clock=1)
is currently valid and sets the clock for trio.run() rather than the argument for f().
Hmm, this could be a problem. So obviously, we'll need some temporary transition plan – probably phase 1, continue to accept clock=
and friends as now but with a warning, phase 2, don't accept the old-style kwargs at all, phase 3, start accepting pass-through kwargs. But this should be a temporary thing.
If you think that trio.run(f, clock=1)
should raise a warning indefinitely because it's confusing, then that pretty much nixes this approach: the point of having a general rule for matching kwargs with the target function is to make sure we can add new kwargs later without disrupting existing code. But if adding new kwargs means that old usages become confusing and need to start emitting warnings, then we'll have failed at that.
If you think that
trio.run(f, clock=1)
should raise a warning indefinitely because it's confusing, then that pretty much nixes this approach: the point of having a general rule for matching kwargs with the target function is to make sure we can add new kwargs later without disrupting existing code. But if adding new kwargs means that old usages become confusing and need to start emitting warnings, then we'll have failed at that.
I'm not especially attached to "raise the warning indefinitely", but yes, that was my initial thought. Reasoning: People who wrote trio.run(f, newarg=1)
to mean f(newarg=1)
before trio.run()
accepted a newarg=
option will still have their code do the same thing as before. They get a warning when they upgrade, the fix for which is clear, and their code will never stop doing what they originally meant even if they ignore it. The target audience for the warning is people who write trio.run(f, newarg=1)
after trio.run() starts accepting newarg
. It encourages them to think about whether they mean trio.run(trio.call(f, newarg=1))
or trio.run(trio.call(f), newarg=1)
. While it's important that adding new options not break existing code, I think imposing a small burden on the minority of users calling a function with a kwarg that collides with the new option is an acceptable tradeoff in order to improve the experience for new users. But if you disagree, I think the overall proposal stands fine without the warning plan.
Alternative approach that makes the options case even more verbose but avoids the theoretical ambiguity:
trio.run(f, 42, kw="hi")
trio.run_options(trio.call(f, 42, kw="hi"), clock=MockClock())
More options for bikeshedding:
trio.run(trio.call(f, 42, kw="hi"), trio.options(clock=MockClock()))
trio.run(f, 42, kw="hi", trio_options=dict(clock=MockClock()))
Comparison of a more complex example:
# dunders (optionally replace trio.call with partial)
nursery.start_soon(
    trio.call(
        trio.to_thread.run_sync,
        trio.call(token.run_sync_soon, foo, __idempotent__=True),
        __cancellable__=True,
    ),
    __name__="foo",
)
# my proposal
nursery.start_soon(
    trio.call(
        trio.to_thread.run_sync,
        trio.call(token.run_sync_soon, trio.call(foo), idempotent=True),
        cancellable=True,
    ),
    name="foo",
)
# functions take trio.call and trio.options
nursery.start_soon(
    trio.call(
        trio.to_thread.run_sync,
        trio.call(token.run_sync_soon_options, trio.call(foo), trio.options(idempotent=True)),
        trio.options(cancellable=True),
    ),
    trio.options(name="foo"),
)
# with _options on the function name
nursery.start_soon_options(
    trio.call(
        trio.to_thread.run_sync_options,
        trio.call(token.run_sync_soon_options, trio.call(foo), idempotent=True),
        cancellable=True,
    ),
    name="foo",
)
# with options bundled into a trio_options= kwarg
nursery.start_soon(
    trio.call(
        trio.to_thread.run_sync,
        trio.call(token.run_sync_soon, foo, trio_options=dict(idempotent=True)),
        trio_options=dict(cancellable=True),
    ),
    trio_options=dict(name="foo"),
)
I agree that the dunder approach is in some ways less confusing, but I think an approach that a user finds ugly will grate on them just as much as will an approach that they find mildly confusing, and possibly moreso long-term (confusion fades as understanding increases, ugliness not so much).
I think the example @oremanj used in that last comment does a good job of showing why my pass_kwargs
idea was not very good for consistently and generally solving this recurring problem throughout Trio's APIs.
Because as soon as we try to do that example with pass_kwargs
, the actual options have to be partially applied to the underlying Trio function first, then that partial object has to be run through pass_kwargs
.
Granted, the pass_*
functions were meant to just cover the common case and the case where you are calling the same API enough to want shorthand for your common cases, but I think the weakness versus the other proposals here for examples like the one just given is worth putting into focus.
I think having seen and considered this, I probably/tentatively think my pass_*
suggestion is not a sufficient solution by itself for what users are wanting here, even if it is used as part of the solution.
Some discussion on typing-sig about PEP 612, and what it will/won't let us do for typing functions like trio.run
:
https://mail.python.org/archives/list/typing-sig@python.org/thread/AP3PQVD2QT2WB7XIKG5RFZ3DO6OXL5ZH/
Remember the idea someone else mentioned to add an .options
attribute onto every Trio method or function that takes the options, and returns a callable which just does that operation with those options applied?
I can't help but notice that with that approach, the nested example given by @oremanj looks like this (unless I misunderstood what that nested example does, but if so then I think that is a big point against those other methods):
nursery.start_soon.options(name="foo")(
trio.to_thread.run_sync.options(cancellable=True),
token.run_sync_soon.options(idempotent=True),
foo,
)
Which I think is just much nicer than any of the other alternatives presented. In particular, it's very clear and impossible to make unclear that the options are options and which thing they are options to.
The implementation can look like this:
# Using PEP-570 positional-only parameters:
def _run(options, async_fn, /, *args, **kwargs):
    ...
# Or portably (probably okay for internal functions?):
def _run(*args, **kwargs):
    options, async_fn, args = args[0], args[1], args[2:]
    ...
def run(async_fn, *args, **kwargs):
    return _run(_default_options, async_fn, *args, **kwargs)
def _run_options(clock=None, instruments=(), restrict_keyboard_interrupt_to_checkpoints=False):
    options = ...  # however this is done
    return functools.partial(_run, options)
run.options = _run_options
Options should probably be an attrs
class. Doesn't really matter, but having it as a well-defined class would also allow for a nice internal decorator so that all of the above could be reduced to:
@attr.s
class _RunOptions:
    """docstring that we want on ``run.options``"""
    ...
@_options(_RunOptions)
def run(options, async_fn, /, *args, **kwargs):
    """docstring that we want on ``run``"""
    ...
(I can draft up an implementation of _options
if there is interest in going this route.)
@oremanj
I agree that the dunder approach is in some ways less confusing, but I think an approach that a user finds ugly will grate on them just as much as will an approach that they find mildly confusing, and possibly moreso long-term (confusion fades as understanding increases, ugliness not so much).
Do you find that writing def __add__(self, other): ...
or referencing obj.__name__
, etc., grates on you and makes you dislike using python?
@mentalisttraceur If we do go that way then I think we can make the implementation simpler by having the decorator package up the callable+args into a single argument for the underlying function:
@attr.s(frozen=True, slots=True)
class takes_callable:
    wrapped = attr.ib()
    def __call__(self, fn, *args, **kwargs):
        return self.wrapped(partial(fn, *args, **kwargs))
    def options(self, **options):
        return lambda fn, *args, **kwargs: self.wrapped(partial(fn, *args, **kwargs), **options)
@takes_callable
async def run(async_fn, /, option1=None, option2=None):
    ...
(This might even be typable using the PEP 612 draft notation? I'm not 100% sure.)
It still has the downsides of being hard to document, and I think the use of )(
may be highly polarizing, i.e. some people think it's totally fine and others will experience a brain reboot when they try to parse it. (A lot of people find higher-order functions unintuitive and awkward to think about. Also I'm pretty sure I saw Guido somewhere saying that he found anything that used )(
intolerable.) The )(
issue could be somewhat mitigated by making it foo.options(...).call(...)
, though I dunno what the best name for the intercalary method would be.
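As a sketch of the `foo.options(...).call(...)` variant, the class below replaces the returned lambda with a small object carrying a named method. The method name `call`, the helper class `_Configured`, and the synchronous toy `run` are all placeholders chosen for this illustration, since the thread does not settle on names.

```python
from functools import partial


class takes_callable:
    """Wraps a function that expects one zero-argument callable plus options."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __call__(self, fn, /, *args, **kwargs):
        # Plain call: all arguments are bundled up for fn; no options.
        return self.wrapped(partial(fn, *args, **kwargs))

    def options(self, **options):
        return _Configured(self.wrapped, options)


class _Configured:
    """Holds chosen options until the target is supplied via .call(...)."""

    def __init__(self, wrapped, options):
        self._wrapped = wrapped
        self._options = options

    def call(self, fn, /, *args, **kwargs):
        return self._wrapped(partial(fn, *args, **kwargs), **self._options)


@takes_callable
def run(fn, /, *, clock=None):
    """Toy synchronous runner standing in for an event loop."""
    return fn(), clock


def f(x, clock=None):
    return (x, clock)


print(run(f, 1, clock="for f"))
print(run.options(clock="mock").call(f, 1))
```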
this would greatly enhance usability — especially for function declarations like:
async def request(method, url, *, pool=None, preload_content=False):