Closed njsmith closed 1 year ago
That's a fair argument against such a naming approach, but I think it needs to be balanced against the desire to implicitly communicate that these are new fundamental concepts that must be grokked.
Here's one:
async with trio.summon_waiter() as waiter:
    await waiter.start(keep_drinks_full)
    sip()
    waiter.start_soon(make_poutine)
It seems to work both literally and metaphorically, hopefully in self-explanatory ways.
Should this issue still be open? If so, I have a drive-by suggestion borrowed from Inform 7: "scene".
The idea is to emphasize that the object stands for the time period in which tasks (actors?) do their work. Actors enter and exit, but the scene is not done until all the actors have performed their roles.
Other possibilities might be "shift" (short for work shift) or "meeting".
None of these are a perfect fit, but it might be good to explore other names for an interval of time.
I'd go with TaskGroup.
Two reasons. (a), anyio uses that name and asyncio IIRC is moving towards that also. Plus you get to use tg as the variable to store a taskgroup in, which is distinctive enough that it works. ("n" or "nu" doesn't work, and "nursery" as a variable or attribute name is just too plain long for my taste.)
More seriously, however, (b) yes this is a novel concept, but at the same time it is not, or at least should not be. Other languages start picking up on the Structured Concurrency concepts, and frankly this is how async code should have worked from the beginning.
Nurseries/taskgroups no longer are novel concepts, they {are | should be | are going to be} the "new normal". You get to name normal everyday ideas with normal everyday names (not cute new names that need to be explained); these concepts are not / no longer special. The names describe them and provide a way to think about them, and that's all there is to it.
Hey, I like "scene"! It's a name that any non-native English speaker like me easily understands and it conveys the time connotation pretty well together with that of "a space where part of the action (of the whole story) takes place", like in this definition (emphasis mine):
A subdivision of an act in a dramatic presentation in which the setting is fixed and the time continuous.
The only limitation is that "actors" outlast the scene... so it's not a perfect fit. But that's a problem that "nursery" already has/had with "children".
There's another name that, I now see with surprise, has not been suggested, and which has both the lifetime and the "confinement" connotations. It's easy to remember and it's perfectly short: "cell"!
What do you think?
Mmmm, maybe it's time to make a poll in the Trio forum... or it's already "too late"?
Well, there is a topic for this. Not sure if it's better here or there. Anyway, I'm just passing by and this is up to someone else.
https://trio.discourse.group/t/the-terminology-bikeshed-thread/95
One complication for scene is that it suggests that tasks are actors... and actors already refer to a specific and very well known approach to concurrent programming :-). As far as I can tell, you have actors when:
Actors are commonly contrasted with CSP, which makes exactly the opposite choices (task lifetimes are bound to their parent's lifetime, tasks are anonymous, message queues are bounded). And on all of these points, Trio sides with CSP... so using "actors" to refer to Trio tasks creates a lot of potential for confusion :-).
At some point I need to make a pronouncement here. But pronouncements take a lot of energy, which I've been short on lately, and I think there are a lot of higher priority decisions to make (like finishing nailing down our APIs for processes/channels/listeners/kwargs/etc.). So I'm probably going to let this ride for a while longer. (And hey, maybe the broader structured concurrency community will converge on something in the mean time.)
Having let this simmer for a while, I like TaskScope and trio.open_task_scope(). I have this slight nagging feeling that the analogy to lexical variable scoping isn't perfect, but I can't figure out why, so it might just be perfectly normal paranoia.
@scottjmaddox Because “nurseries” implement dynamic scoping, not lexical = static scoping?
Hmm, I would say that the lexical/dynamic distinction doesn't apply to nurseries at all. That distinction is all about which scopes you consider when resolving an implicit scope lookup, but nurseries are never accessed implicitly, only through explicit references.
@njsmith Oh. You’re right. I think I first thought (and said, somewhere around here) that nurseries implemented dynamic scoping months ago, and during all that time it didn’t occur to me that they don’t, in fact, scope stuff.
I’d like to withdraw my suggestion of SomethingoptScope, then.
But unless I’m gravely mistaken (again), nurseries do implement nested task lifetimes, so I’d like to propose SomethingoptLifetime or, if that sounds too Rusty, SomethingoptRegion. (But Lifetime feels clearer, given that we’re not talking about literal memory regions here.)
I like nurseries. The cuteness is moderately (subjectively) appealing, but this may lend itself to the larger appeal below, and it keeps the analogy that it contains children (although that breaks a bit, as the child coro's entire life is in the nursery: death in nurseries! :)). The biggest appeal is that it doesn't have a namespace collision with the abstract mental stack devs have. E.g. 'process' is a bad name, as it already has a vague, abstract definition, and devs could already have other uses for it in their mental and design model, in and out of programming: I have to process this file as input, this munge process, these sequential steps are a process, design process, etc. Or worse, if they're developing at a factory and the design spec has 'process' throughout, and they're writing multi (posix) process code. Every time a dev types, reads or thinks 'process', they have to disambiguate. A small extra mental tax.
Whereas 'thread' is wonderful (thanks Victor A. Vyssotsky!): it has a decent analogy ('interwoven', 'light', 'thin', 'many together create something larger'), but when you read 'thread' in code you know exactly what it is! It was co-opted for email; sorry for the devs of multi-threaded email apps.
Guess what the first threads were called? "TASKS" (OS/360 MVT), another terrible name that would be causing a collision with some of the suggestions already placed here, and if it had stuck, asyncio would perhaps have avoided using the name 'Task' to avoid confusion with what are now threads.
Working on a project with data streams and io streams, I needed a name for a higher order 'data flow pipeline'. At first I co-opted and reused the name 'stream'. Halfway through I renamed it to 'creek'. I didn't like that it felt 'small and unperformant', but it removed a mound of confusion.
For this reason, create names that are unique and concrete first, and have a fitting analogy second. Nurseries are good in this way. Yes, it's completely foreign at first, but as soon as you learn its definition, which is necessary as a central concept of Trio's async block model, you know it's that Trio block thing, never any confusion.
As the nursery model is not specific to Trio, perhaps a better fitting and widely adoptable name is out there to be found. Perhaps nurseries are fine, and just have to stick. Perhaps 'thread' was considered cute or unfitting and had opposition in 1966, but I'm glad it stuck. Likewise for port/stream/stack/queue. All of these were co-opted from something physical and are concrete. It's nice when the analogy is very well aligned, but once it's adopted it feels correct.
I wish that abstract names like process, job, and task were left out of the stack of OS's/languages/frameworks so devs could use them without collisions.
@njsmith, you speculated one year ago that a name change could become practically impossible within 6 months or so. Is there still any merit in debating the best/most appropriate name? (If so, do you have an updated estimate on how quickly it would have to happen, if it does?). I know you mentioned two months ago letting it ride "a bit longer" while you figure other things out, but I agree more with your initial reasoning that such a central part of trio should be stabilized ASAP
My thought is that this issue has been open for a year, nothing has come up in that time that's obviously better than nursery (and better enough to justify the cost of switching) and I feel it's increasingly unlikely any such name will be found (I personally favor taskgroup but not enough to argue strongly for it). On the flip side the cost to switch is only increasing.
We might be better off closing this issue and devoting our collective mental energy to areas with a better chance of a return on that investment.
To be honest I was a bit irritated at first by the name open_nursery but now I am totally fine with it. Anyway, I believe in good names and if you are going to rename this you should IMHO use "split_control" since it is what the context manager does:
async with trio.split_control() as split:
    split.start_soon(sum_numbers, 0, 50000000)
    split.start_soon(sum_numbers, 50000000, 100000000)
This discussion has probably run its course.
But after reading over the thread this morning I had an original idea that I think communicates a lot of things at the same time, and also remains true to the whimsy of nursery:
asylum
Similar to a task in an asynchronous part of a program: you generally don't leave the asylum, and a lot of crrraaaazy things go on inside.
Should this still be open?
If so, I have four things to say:
When I first encountered Trio a year or so ago, I found the name awkward to fit for like a literal second, but then I just internalized it and moved on.
For me it was a mentally cheap operation to mutate the meaning of "nursery" in Trio's context from "somewhere that children live" to += " and die".
(Actually I did at least two parallel ways of making it make sense in my mind. The other one that I currently remember is: a child coroutine is always a child, so it can never leave the nursery.)
I maybe experienced subtle brief background irritation at the choice to use a name that was more "cutesy" than it was self-explaining, but once I understood it I could see it as "close enough" to self-explaining and felt fine with it.
My first exposure to structured concurrency was Sustrik's libdill and blog articles, not Trio, so that probably influenced something, but I do not remember debug traces of that cognition.
I am trying my best to empathize that some people actually feel the extra effort or tedium of typing "long" identifiers starting around seven or eight characters (which is just mind-boggling to me, but I'm trying to stay mindful that experience is relative).
However, I have learned to not consider it even acceptable - let alone good - to optimize code for writing convenience, unless uncompromising priority is given to optimizing for reading.
Code must be optimized at every turn to guide as many future readers as possible to a correct decode. This includes not only its big-picture intent and its business logic but also of every assumption, expectation, and implementation detail which is at all relevant to it working right.
That is not to say that short identifiers are bad - all else being equal, shorter identifiers are strictly better - in fact more readable - than longer ones.
But those are still optimizations for reading. They are emphatically not optimizations for writing.
Rarely have I seen optimization for shortness that did not neglect some aspect of what goes into making code readable - including being informative, self-describing, and easy to search, navigate, and hop around.
Optimization for writing convenience was almost always a key cognition flow causing that neglect.
If you want or need to type short names for writing convenience, I truly wish you the best experience with that, but please don't inflict code optimized for your writing experience onto other people to read.
That cost should be borne by automated tooling, once, when the code is written or saved or committed, not by every reader of the code who comes later.
Thankfully I'm fairly confident that @njsmith will not trade readability/clarity/self-explanation/guiding-to-correct-interpretation for writability, based on what I've seen of his work so far.
fork?
I haven't decided if this is actually better, but I like it, and no one has suggested it:
async with trio.open_fork() as fork:
    fork.start_soon(foo, ...)
    # or maybe even just:
    fork.soon(foo, ...)
    ...
The biggest advantage is that the code that a nursery covers is a fork in the execution paths - the spot where a linear call flow forks into a call tree.
The biggest disadvantage is that the word "fork" is already used in Unix-y programming for forking a new process.
Now I have rarely needed to refer to that as "the fork" or "a fork" - in those cases I usually say "child", "forked child", or "forked process", or naturally use the verb "fork" in a phrase like "the process forks" or "forks a process" - so I don't think the collision is too bad. Even in a codebase that does both, the terminology naturally lends itself to disambiguation: "forks a coroutine", "the coroutine forks a new process", "forks a coroutine or forks a process?", etc.
It's also not ideal for searchability or for signaling that there is a new concept, but a good point was made that as structured concurrency gains ground, it will just be a standard concept.
"Fork" does not imply the boundedness that structured concurrency makes its central tenet - forked processes outlive their parents all the time. But if unstructured concurrency goes the way of the goto, we won't need to disambiguate that. The "for loop" term does not try to encode or hint at the exclusion of unstructured jumps into or out of it - it is simply a feature of a language to make that impossible.
Anyway, I like start_soon a lot already, but I can defend fork.soon, because as a verb, "fork" already implies starting the execution path, and "soon" retains the important reminder that it will be scheduled later, once permitted.
(I also considered: fork.branch but people above mentioned that "branch" already has a strong and nearly universal association in programming to mutually exclusive execution paths; fork.prong but it loses the "soon" reminder; fork.tooth but I think people are less likely to recognize it than "prong"; and fork.tine but that's just even more obscure.)
Also, the physical object that maps to the word, a fork, has a clearly defined end. A fork's prongs stay attached to its body - it takes some effort and force to break a fork's prong off its body, and it's an obviously bad idea which leaves you with a broken fork.
We could even start calling unstructured concurrency "fork breaking": like
"wow that code was so hard to follow, it had broken-off forks everywhere" (that just sounds bad even if you don't know what it is, which is a good property for stuff like this), or
"hey Bob, good code, just one thing to fix before we can merge: you broke off one of the ends of the fork over here, let's think of a way to do this with structured concurrency instead."
Honestly, if you just left it as "nursery", I wouldn't be bothered.
A key advantage that I haven't seen explicitly mentioned is that it is searchable - that nursery does not collide with any existing programming term is not just helpful to human cognition: it also helps searching, whether in some code with a dumb text find or in a search engine, etc.
I mean sure, you can add "structured concurrency" to a search engine to narrow things down, but that's only if you already know what to look for. Imagine an unfamiliar newbie trying to look up "programming nursery" after hearing about the concept vs "programming {split,fork,tree,task group}" or whatever.
I was thinking more about the above fork suggestion, and I've updated my last comment with some of these thoughts, but I want to elaborate on some of those additions:
"Fork" does not imply the "boundedness" that structured concurrency makes its central tenet - forked processes outlive their parents all the time. But if unstructured concurrency goes the way of the goto, we won't need to disambiguate that. The "for loop" term does not try to encode or hint at the exclusion of unstructured jumps into or out of it - it is simply a feature of the language that such a thing is impossible.
I think the key thing here: "nursery" is optimized for a world where structured concurrency is new, a world where it needs to be discovered and searched for and distinguished because it is beset on all sides by unstructured concurrency.
What would we name it in a world where structured concurrency is the way? If unstructured concurrency was simply unavailable or even inconceivable to most programmers, even treated with knee-jerk bigotry by some in its more limited forms, like the modern goto?
I'm reminded of some of the discussions I've seen on @njsmith's "Notes On Structured Concurrency". I've seen at least one person respond to it by saying "well, a function call is still a jump to anywhere", not seeing that it's a distinctly more limited jump that can be expected to go only to well-defined entry-points and to always come back out of the same spot. Presumably because they've never had enough reason to mentally step through code able to violate that expectation, so their mind simply does not automatically conceptualize such code flows vividly and clearly enough for them to fully connect and feel the weight of the comparison.
If the world was similar with structured concurrency, does that change the way we think about what the best name should be?
Would my idea about optimizing for searchability make as big of a difference? Would the need for a distinctly new term make as big of a difference? Probably not.
At that point I wouldn't even worry about "'fork' does not disclaim the association with how a forked child process can outlive its parent", because by then operating systems and every library would provide robust primitives that prevented child processes from outliving their parents too, and "oh actually a forked process/thread/coroutine can outlive its parent if you [old deprecated API]... it's a backwards compatibility thing from back in the day, in a modern system you'll never run into it" would be the only way a typical person even learns that it is possible.
(Aside: I think we can actually implement structured concurrency at the OS process level already, using things like process groups and sessions, which have existed since old UNIX days on every UNIX-y system. It's just that there are no libraries or CLI interfaces in widespread usage that do this, as far as I know.)
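As a rough illustration of the aside above, here is a minimal POSIX-only sketch of a scope whose child processes cannot outlive it. The names process_scope and start are hypothetical, and terminating (rather than waiting for) children that are still running at block exit is an assumption of this sketch, not something the thread prescribes. It relies on each child being started as the leader of its own session and process group, so that a signal sent to the group also reaches anything the child itself spawned in that group.

```python
import contextlib
import os
import signal
import subprocess
import sys


@contextlib.contextmanager
def process_scope():
    """Hypothetical helper: children started via start() never outlive the block."""
    children = []

    def start(argv):
        # start_new_session=True runs setsid() in the child, making it the
        # leader of a new session and process group (group id == child pid).
        proc = subprocess.Popen(argv, start_new_session=True)
        children.append(proc)
        return proc

    try:
        yield start
    finally:
        for proc in children:
            if proc.poll() is None:
                try:
                    # Signal the whole process group, not just the child.
                    os.killpg(proc.pid, signal.SIGTERM)
                except ProcessLookupError:
                    pass  # the group vanished between poll() and killpg()
            proc.wait()  # reap the child so no zombie is left behind


with process_scope() as start:
    child = start([sys.executable, "-c", "import time; time.sleep(60)"])

# Once the block has exited, the child is guaranteed to be finished.
```

A grandchild that deliberately moves itself into yet another session would still escape this sketch, which is one reason it is only an approximation of the guarantee a nursery gives.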
Anyway.
When we get to that point, "nursery" will stand solely on its association to the concept of children, and from that word the association to the concept of child coroutines.
Let's look at it this way:
What do you call the block of code inside the async with trio.whatever() as whatever2?
Now does that term still work if it was a built-in language feature, with dedicated syntax, like whatever { /* my coroutine forking code here */ }?
One of the reasons why I think I'm drawn to my "fork" suggestion is that it generalizes. I spend enough time dealing with multiple programming languages that I've gotten used to looking for terms and wording habits I can reuse regardless of language.
If I needed to refer to that spot of code verbally, without referring to a specific implementation's word choices, I think I'd call that the "fork block" or even just the "fork" - if I needed more context I'd say something more verbose or specific once per conversation to pin it down, and then thereafter use "fork".
Why? Because I could keep reusing that term no matter what language I switched to, or even if the underlying way of concurrency was coroutines or threads or processes.
Now maybe that's a rare and irrelevant use case. I usually do use the terminology of the language/library/whatever I'm dealing with in context.
Also, I just realized (or more like remembered or even more like consciously pinned down): even in the absence of a culture where structured concurrency is the norm, a "fork" brings up a nice physical analogy, which I've gone ahead and edited into my prior post but which I'll quote here for linear reading flow:
Also, the physical object that maps to the word, a fork, has a clearly defined end. A fork's prongs stay attached to its body - it takes some effort and force to break a fork's prong off its body, and it's an obviously bad idea which leaves you with a broken fork.
We could even start calling unstructured concurrency "fork breaking": like
"wow that code was so hard to follow, it had broken-off forks everywhere" (that just sounds bad even if you don't know what it is, which is a good property for stuff like this), or
"hey Bob, good code, just one thing to fix before we can merge: you broke off one of the ends of the fork over here, let's think of a way to do this with structured concurrency instead."
That analogy will be weaker for people who deal enough with Unix-y programming to strongly associate "fork" with splitting processes, but the terminology is still unambiguous per my earlier remarks, and Windows-y programmers or programmers who stick to higher level APIs will be untainted and free to benefit from the association/analogy/visual.
Anyway, returning to my point:
When we get to the point that structured concurrency is the norm, if "nursery" is not enough of a stable optimum, it will simply get evolutionarily outcompeted. People will start calling it something else, whatever is more mentally easy and accessible, until eventually one of those alternatives catches on to the point that its usage hits critical mass.
For example, we already do use a general language-neutral term, not for this exactly, but in the general vicinity of it, a more general concurrency concept: call tree.
It also implies boundedness, the same way that "call stack" does - a child function can't just re-parent itself to a different spot in the call tree just like it can't re-parent itself up the call stack (semantically, anyway - tail call optimization or trampolines may avoid traditional call stacks/trees at the implementation level).
But it doesn't map as well to Trio's API "shape"... best I can think of right now is something like:
async with trio.call_tree() as call_tree:
    call_tree.branch_soon(foo, ...)
    # or maybe
    call_tree.fork_soon(foo, ...)
    # or maybe even `as call` above and then:
    call.soon(foo, ...)
And that's not ideal. Furthermore each nursery only represents a branch point or... ahem, fork... in a program's call tree, not a full actual call tree, because it does not contain or represent any of the nurseries opened below/within it.
Of course "fork" was my attempt to extract the same meaning that "branch" gives in the context of talking about call trees into an independently stable form.
So we can already see the process in action, which once again finally brings me back to my original point:
At some point, a new term will catch on, if "nursery" isn't "good enough".
The term nursery can live on past that point for a while in code which keeps using Trio or other libraries which have the term wired into their API, and those working with such code will get used to the exchange "what's a nursery?" "oh it's just this library's term for {task group,fork,split,call tree}", until enough pull requests or issues or forks or alternative libraries build up expressing humanity's want for a different term.
And that's probably okay. It's okay to just wait for that, wait for "humanity to decide", wait for natural selection to bring the most currently human-friendly term to the most widespread usage.
It might not get us the best terms, but it'll get us terms, whether or not we try to come up with our own. And usually the terms are alright. Usable enough. Precise enough and unambiguous enough in context, etc.
On the other hand, trying to talk about it is us participating in that process - picking a better word is influencing the process. In theory, a convincing argument here, or a change in naming in Trio, could help spread a given terminology that needs just that edge to win.
Which is fine - but the question is:
Are we at that point yet? Because if we're past the point where structured concurrency itself has reached memetic and cognetic critical mass, then we need terminology that works best then. Until then, we might as well stick with nurseries, because I think it has clear advantages in searchability and unfamiliarity and so on, and just wait and see what catches on. Once something does, it will eventually be obvious, assuming Trio is still flourishing and relevant to the wider programming community.
And if the change has to happen later, backwards compatibility is not exactly hard - for example if open_nursery gets renamed to fork, just open_nursery = fork at the bottom of the top-level trio import, and if we don't want to carry that around indefinitely (I know, it is a heavy burden) we have SemVer for that, and Python is so dynamic that someone maintaining a legacy codebase (but for some reason not just pinning their pip dependencies to specific versions even though they're unwilling to go through and update their code) can just do trio.open_nursery = trio.fork just under their import trio.
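To make the aliasing idea above concrete, here is a small self-contained sketch. It uses a stand-in module object named lib rather than Trio itself, and fork here is a hypothetical placeholder function, not a real Trio API. The first option is the plain assignment described above; the second, which emits a DeprecationWarning, is my own assumption about how a library might nudge users toward the new name.

```python
import types
import warnings

# Hypothetical stand-in for a library's top-level module; this is not Trio.
lib = types.ModuleType("lib")


def fork():
    """Hypothetical new name for the factory (returns a placeholder here)."""
    return "fork object"


lib.fork = fork

# Option 1: a plain alias, so the old name keeps working unchanged.
lib.open_nursery = lib.fork


# Option 2 (an assumption, not from the thread): warn when the old name is used.
def deprecated_alias(new_func, old_name):
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


lib.open_nursery_with_warning = deprecated_alias(lib.fork, "open_nursery")
```

The same one-line assignment works from the user's side too, since module attributes in Python can be set after import.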
In the meantime, I think where it really shakes out is conversational usage - how easy it is to say, understand, remember, etc, and in so far as misunderstandings matter, how resistant it is to nuance decay, double illusion of transparency, etc.
Now that I've thought of "fork" and "fork breaking" and so on, I'm going to be test-driving those when talking specifically about structured concurrency or code spots that use it or fail to use it (as opposed to places where "call tree" is sufficient), and seeing if I run into problems or misunderstandings with it.
I encourage others to do the same. I'd say "may the best memes win", but the "best" memes will win, no may about it.
On further thought: any terminology needs to fit nicely with the "you have to pass the nursery as an argument to the coroutine to create children that outlive it".
Nursery is fairly nice for that:
"nursery" is exclusively a noun - unlike "fork" it doesn't pull double duty as a verb.
Compare: "the nursery", "the fork", "the fork object".
(Poor "split" pulls triple duty as an adjective, so even adding a noun to it to make it unambiguously a noun doesn't save it from sounding weird: "the split object".)
More importantly: you can put things into a nursery - namely children.
Why does this matter so much?
Because a key feature of the nursery primitive is passing a nursery through code to let that code put child tasks into that nursery.
I think per the usage point above, we should check all alternative names against that pattern. For example, these seem to flow well:
"oh you need to pass the {nursery,lifetime,scope,task group} you want the task to live under through to where you want to start it from", or
"the reason we want to explicitly pass a longer-lasting {nursery,lifetime,scope,task group} instead of implicitly is so we can know when we might be creating something that has effects or leaks beyond the scope of the block".
Notably, my earlier suggestion of "fork" really gets clumsy around here, and I think this reveals it as deficient in a way that those other alternatives aren't.
Edit:
I mean I guess "you need to pass the fork object you want the new coroutine to fork from" and "we want to explicitly pass an outer fork object so we can make code that breaks off forks externally visible" kinda works, but it took me more effort to proactively think it up, which isn't a good sign.
The word "branch" should not be reserved for non-concurrent branches, no matter how strongly associated it is with that right now.
A branch is a branch is a branch, regardless of whether it is traversed in parallel or exclusively or not at all.
The call tree visual/metaphor is very valuable for thinking about structured concurrency - and trees have branches!
We should fight for an equal share of that word.
So there are exclusive branches, and there are concurrent branches.
Structured programming brought order to exclusive branches. Structured concurrency brings order to concurrent branches.
The more I think about it, the more "split point" feels right as the term for what a nursery is, maybe not for Trio but definitely for structured concurrency as a whole.
It ties really well with the call tree metaphor and visual:
If I'm teaching the idea of structured concurrency to someone, or at least the passing nurseries part, I want to set up the idea and visual of a call tree anyway - this terminology comes for free.
So "split" is the action, "split point" is the object.
The addition of "point" does a great job of making it a noun: it is natural once we introduce split point objects as references to a specific point where the call tree splits, existing for the purpose of letting other code split more concurrent branches from the same point.
It's short. "split" is one syllable, five phonemes, five characters. (Any complaints about "while" being too long?)
It's free. As far as I know, the word "split" is not used for any one established programming concept, and no mainstream language uses "split" as a language keyword. Certainly the term "split point" is not in use (except for drill bits, apparently).
Yes "split" is generic (but "split point" isn't, and that compensates), but I'm imagining what this should be named in a world where it is a fully established fundamental programming concept, equal with if and while, taught in introductory classes, and in that world, it should be a generic word, because if we do this right, soon enough breaking off a concurrent branch will be just as inconceivable as a goto from inside one function into the middle of another.
I am more excited about this as a generic structured concurrency term proposal than I am for it in Trio - don't care if Trio just keeps calling them nurseries (see my first comment), but if I had to try to apply this to Trio, I imagine something like this isn't too bad:
async with trio.split() as split_point:
    split_point.start_soon(foo, ...)
    # People who want short could do `as sp`!
Also in languages like C or Go, a split point can be represented by a pointer to an object of a type called "split", and that feels like the right alignment of naming between implementation details and abstract concept.
P.S. The entire time I was writing this comment, I kept thinking "I feel like maybe someone said this before and maybe I was subconsciously influenced by it?" - so just before hitting submit I just searched the thread and apparently I've just reinvented the terminology that @njsmith first proposed in the very first comment two years ago. But I had totally forgotten, and arrived at it again from a completely different direction: while thinking about how I would add structured concurrency as a language feature to something like C. So I think there is something more universally true to it.
Thing is, when I first read @njsmith's "split point" suggestion I didn't like it either, and did not think it fit Trio's API shape well.
To the best of my ability to remember.
I clearly didn't like it enough, because I completely forgot about it consciously.
But this also reveals that I did not dislike it strongly enough, and did not identify any bad enough problem with the term itself, because I would have remembered that by now.
@smurfix mentioned it not signaling a new, unfamiliar concept strongly enough, and I agree with that as a circumstantial reason, but I think as structured concurrency gets going, that rapidly goes from being a downside to being an upside.
When structured concurrency was new and unfamiliar, a new and unfamiliar and opaque term was good. Once it is established and normal, a simple term that implies it is the simplest and most familiar and most intuitive concept in the world, like "if", is more fitting.
Anyway, I think it is very notable that I did not like or remember the terms "split" and "split point" initially, and did not think they were a particularly great fit for Trio's API shape, but then rediscovered them while thinking about the best terms for a completely different language.
(Reminder: I'm not trying to push for removing the name nursery right now - see my first comment, and also my previous one, and also remember that it is trivial and not necessarily a bad thing for Trio to keep the nursery name for backwards compatibility when it does decide to adopt "split" and "split point".)
I had occasion to try using both "split point" and "fork breaking" in conversation the other day, when trying to talk about structured concurrency, with a call tree diagram on a whiteboard.
Two very noticeable observations:
"branch point" (which I think @njsmith also suggested in the same early post he suggested "split point" in) is better than "split point", unless someone else has anecdata to the contrary:
I keep slipping into saying (and typing) "switch point" instead of "split point", and don't even notice it at first.
Not sure if there is a more fundamental reason beyond the term just not fitting into the patterns of words my brain is used to, but I misspoke so consistently I just announced I was switching to "branch point" mid-conversation, and it went smoothly from there, without loss of clarity.
"branch breaking" was much less "out-of-the-way" and less awkward to say than "fork breaking".
It felt like introducing "fork breaking" spontaneously into the speech flow was adding a new term, and one which due to unexpectedness might not even immediately phonetically parse to someone.
I could tell that in order to mitigate this, I would have to say something like "because alternatively we could call a branch a fork", which in context just seemed like it would be an arbitrary and unjustified word-hop from "branch" to "fork" for no good reason to the audience, and thus not particularly memorable, since memory is so dependent on things systematically sensibly fitting in together.
So after trying to use "fork breaking" once, I just accepted the natural flow of statements like "or in unstructured concurrency, the branch can also be broken off".
I've also found that the distinction and terminology of "concurrent branch" vs "exclusive branch" takes me all of one sentence to establish and was accepted as self-evident without any resistance or confusion the one time it came up in conversation where I could observe the reaction.
The moment I accepted "branch" as the more natural term than "split" to use, I started also finding myself wanting to contemplate whether or not there is some deeper value that could be extracted by reifying the branch point of an exclusive branch (the moment an `if` or `while` or `switch` statement opens) the same way that reifying the branch point of a concurrent branch into a nursery has produced value.
This is getting a bit too esoteric, some weird language design theory direction abstracting beyond structured concurrency as a whole even, but I find myself wondering what we can make of this.
Because it's very interesting that structured programming manages to make do without reifying either the branch execution or the branch point, and yet every structured concurrency design I like reifies either the branch point (Trio's nursery) or the branch execution (libdill's coroutine handle).
One of these days I'm going to just make an account on the structured concurrency forum @njsmith started and relocate all these thoughts to there, but for now streaming them here is just much more accessible of a workflow for me.
The above having been said, I think `split` is still the more right language keyword than `branch`, the same way that `if` and `switch` are better language keywords than `branch`.
Structured Concurrency's nursery is Structured Programming's `jmp_buf`.
This analogy might not stretch far enough to be useful, but here's what I mean by:
Structured Concurrency's nursery is Structured Programming's `jmp_buf`.
I said in one of my last comments that since modern structured concurrency had reified the concurrent branch point (nursery), it was interesting to think about if there was any value to reifying structured programming's exclusive branch points.
After I thought that, I kept trying to figure out why structured programming managed to make do without any explicit reification of branch points.
I think this is why. It did exist, at first, as a partial controlled escape hatch in languages like C. Like a nursery, a C `jmp_buf` has to be set up higher up in the call tree, and it has to be passed to the code that wants to branch off of it.
Of course, we could implement everything (`if`, `while`, `for`, `switch`, `break`, `continue`, `return`, `raise`, `yield`, etc.) with some clever combination of `longjmp`, function pointers, and mutable state.
But we discovered these narrower, more limited, and common variants of code jumps, and built them into our languages. Probably something like that exists for structured concurrency. Certainly there is a "shape" difference between the commonest case of creating a new nursery to branch off some concurrent stuff at that spot of the call tree, and the less common case of passing around a nursery from elsewhere so that code in one part of the call tree can branch off of a point elsewhere on the tree.
The place where this analogy might break down to the point of uselessness is that maybe the difference between the reified concurrent branch point (nursery) and primitives like `split wait` and `split race` and whatever else we discover might not be big enough to justify an almost-vanishing of the reified branch point, but we already empirically know that the difference between the reified exclusive branch point (`jmp_buf`) and primitives discovered for structured programming is big enough.
Anyway, that's just what I've been mulling on, and it connects to the question of "should we rename nurseries" because if this really is a `jmp_buf` analogue, then the exact name won't matter at all in the historical big picture that's coming, because a few less flexible primitives will fill most use cases.
I think the important way to check if this is where structured concurrency is going is to look really, really hard at the cases where the "pass a nursery around" pattern actually occurs, and think really, really persistently about how it might be done elegantly with only more limited local-only concurrency primitives like `wait` and `race`, or what primitives would be needed to do it.
I think this is important to really try to do, because when we can see the low-level commonality, the fundamental principle, it is easy to keep mentally reaching back to that as "well obviously this one thing covers this too". From the vantage point of seeing the one elegantly unifying thing, it can be a lot harder to find or even look for the few less unifying things that are good enough for almost every case. `wait` and `race` are definitely not enough, but they seem like they cover about as much ground as `for` and `if` - maybe structured concurrency just needs a couple more structure-breaking primitives like `return`/`continue`/`break` or `raise`/`except` or `yield`.
Conversely, it still seems worthwhile to keep doing the opposite - looking for if there is some way for a reified exclusive branch point to elegantly and usefully replace all of those structured programming primitives. If we find such a way, it might inform the nursery discussion in the other direction, by revealing that nurseries really are here to stay as the primitive of structured concurrency.
(By the way, if anyone wishes to quote or discuss the comments I make here in other places like the Structured Concurrency Trio subforum even if I am not there to participate - feel free, I encourage it.)
Well, this thread has been going since mid 2018, so I doubt nurseries are ever going to change name.
If it ever does, I'll add my voice to the crowd asking for something succinct and boring like `trio.scope`, which is clear enough, short, has no surprises, and is already widely used and known for communicating boundaries and lifetime.
I like the `nursery` term. Modulo the GC ambiguity, it is unique, which is what you want for a new programming concept.
I also like the `split` term. It conveys the semantics in the name itself. Something that `nursery` doesn't quite achieve IMHO. I concur with the others about `split` being a bit overloaded already.
I propose `rift`. I find that it both:
`split` does.
Anyhow, maybe it's a moot point at this point in time. `nursery` works.
That's my two cents.
How about `mutual` as a name?
Edit: I have no horse in this race, nursery is fine.
Edit 2: I think it's not a bad word, as cancellation is mutual between them.
They have a mutual agreement to continue until any one is cancelled then all are cancelled.
Or it can be short for "mutual cancellation" or similar.
At this point, is there any reason to keep this open? Nursery seems fine and there is no ambiguity there. Also, over 2.5 years after the announcement, I think it will be extremely difficult to change this now. @njsmith
Pretty much more for the 'group' or 'band' naming (after all it's about a 'trio').
Can be ungrouped or disbanded... A 'session' or 'event' can be canceled...
Not as fancy as 'nursery' but sounding more professional imho.
A little over a year later, my position remains:
@njsmith I recommend that you simply make an executive decision, close this issue, and move on. I would very much like to see Trio move forward and become widely adopted. Instead, it seems to have stalled, with non-critical issues like this hanging around and with no decisions in sight. Please don't let the perfect (perfect decision, or perfect consensus) be the enemy of the good.
+1 to @efiring's argument.
I'm a newcomer to structured programming/trio (I just learned about it last week) and wanted to share my thoughts.
After recently pair programming and quickly going through the docs, I didn't understand the significance of a "nursery". In my mind, I just thought of it as some quirky structure and weird object name trio required; and after getting used to Twisted's peanut-butter-jelly-reactors, I didn't pay much thought to the weird name.
Only after reading the blog post and seeing this image, did the significance finally click:
(aside: is this three-pronged graphic the inspiration behind the name 'Trio'??)
When referring to the point in time (regardless of which name we use for it "branch", "fork", "split", etc.) the novel idea Structured Concurrency brings is that this point in time is necessarily bounded with a corresponding end/join point. (And with Python's context managers, this represents a concurrent block.) So when teaching these concepts, it makes sense to say things like:
`fork` syscall doesn't enforce children are joined, so it can be considered unbound, unlike in structured concurrency".
Upon reflecting on these previous messages:
Collections of threadlike things bound together at both ends are...
"Fork" does not imply the boundedness that structured concurrency makes its central tenet.
We can simply rely on the innovation inherent in structured programming and use "bound" as an adjective. Therefore I would amend what @mentalisttraceur said with:
For what it's worth, now in 2022:
`TaskGroup`
`StructuredTaskScope`
Neither of these names is very special, and that's probably the point.
It's like @smurfix said:
Nurseries/taskgroups no longer are novel concepts, they {are | should be | are going to be} the "new normal". You get to name normal everyday ideas with normal everyday names.
As humanity coalesces on a new name and Structured Concurrency becomes mainstream, at the very least, it'd be good to document that a nursery is Trio's name for a Task{Group,Scope}.
Python 3.11 uses the name TaskGroup: https://docs.python.org/3.11/library/asyncio-task.html#asyncio.TaskGroup
If we rename at all I think we should use this name.
It's now been almost five years since njs said:
Should we rename "nurseries"? It's something that keeps coming up, and while it would be somewhat painful to change now, it'll be pretty much impossible within the next 6 months.
and, well, we haven't changed it. I'm going to close this issue, because I expect any eventual renaming to be a much more focussed discussion of "(should / how do) we rename to match asyncio and anyio?"... but I'm not in any hurry.
OK, let's do this.
Question: Should we rename "nurseries"? It's something that keeps coming up, and while it would be somewhat painful to change now, it'll be pretty much impossible within the next 6 months... so I guess now is the time to have a discussion, and if we decide not to change it then this can at least become the canonical issue that we link people to when they ask about it.
The most common version of this complaint is something like: "ugh, nursery is so cutesy. What about `TaskGroup`, or `WaitGroup`, or `TaskScope`, or something like that?" I don't find this version of the complaint compelling, so let me start by explaining why that is.
Names like `TaskGroup` or `TaskScope` are excellent names for ordinary classes, the kinds of name we invent every day. For that kind of class, you want something that's descriptive, because you have a lot of them and people don't spend much time with any given one. And of course you want to follow the conventions, so the camelcase itself is informative: even if you aren't familiar with the particular class in question, you at least can tell at a glance that it is a class.
Nurseries, though, are a different kind of thing entirely. They're a new fundamental concept: you have to learn them up front, like you have to learn about `for` loops, and functions, and the difference between an object attribute and a local variable. Once you've used them for a few days, they recede into the background as part of your basic vocabulary. And they're just different from the concepts most people already know, so trying to give them a familiar descriptive name is actually misleading: you have to look them up and learn them. So, I want a name that feels like a primitive, like "for" or "function" or "variable" – something short and lowercase. And if anything, something that's a bit opaque and unfamiliar is probably better, because it signals "hey, new concept here". You can't have a lot of concepts like that, but nurseries are carrying an entire programming paradigm on their shoulders, so I think 1 new word is within our budget.
And regarding it being "cutesy": This doesn't really bother me, given that (1) this is a domain where "reaping orphan zombie children" is already considered totally ordinary technical language, and (2) after you use nurseries for a few minutes the name stops feeling cutesy, just like you stopped noticing that "trees" have nothing to do with botany, "functions" often don't function (little debugging joke there), "threads" have nothing to do with textiles, etc. So like... ok, it's a bit cutesy but whatever, people will get over it. And being cutesy actually has some upside: the whole joke about "a nursery is where child tasks live" probably does help it stick in people's minds. OTOH, I'm not like super attached to having a cutesy name either; I think it's mostly a neutral attribute.
So that's why I'm convinced that "nursery" is better than `TaskGroup` or similar.
But...... when talking to @graydon about this today, I did have some doubts about "nursery", for two reasons. The first is, he pointed out that "nursery" is also used as jargon for generational garbage collectors (to refer to the youngest generation, which often gets special handling). So a language implementer might ask "is this task in that nursery?", and "is this `Task` in the nursery?" and those are two completely unrelated questions. Fortunately it's mostly implementers who encounter the GC version of nursery, but still... this is a legitimate collision.
And the other is, I realized that the name is actually a bit less apropos since #375 landed and we tweaked the nursery semantics. Originally, I really wanted to emphasize the idea of "supervision", because we had the whole "parenting is a full time job" rule where the parent task had to park itself at the end of the nursery block or exceptions wouldn't propagate and stuff. Now the supervision part has dissolved into becoming an implicit part of the runtime, and doesn't really exist as a concept at all anymore – it's just, like, exceptions propagate themselves, what is there to talk about. The code inside the nursery block is just a slightly-special child. So that part of the metaphor has become a bit weird.
So here's another idea: maybe we could use a name that emphasizes that this is a place where your call stack splits/branches/words like that. Like, `async with trio.open_split()`, or `open_branchpoint`, or something like that? With the metaphor being that first you make a note "here's a point in my call stack where child tasks might split off", and then calling `start_soon` actually attaches a new branch there. I'm imagining like a hinge or a hook or something.
Anyway, that's just one idea – if y'all have other ideas I'd be interested to hear them, and ditto any thoughts on how the "nursery" name worked for you as either a newcomer to Trio or as you've become more experienced.