nmlorg opened this issue 2 days ago
Initial implementation of the first approach (deduping in `process_awaitables`), running:
```python
async def async_fetch(delay):
    await SleepToken(delay)
    sock = socket.create_connection(('httpbin.org', 80))
    sock.sendall(b'GET /status/200 HTTP/1.0\r\n\r\n')
    await ReadToken(sock)
    return f'async_fetch({delay})'


async def async_read():
    create_task(async_fetch(.1))
    create_task(async_fetch(.2))
    return 'async_read done!'


ret = sync_await(async_read())
log('ret =', ret)
```
using:
```python
class GatherWrapper(_BaseWrapper):
    def add(self, awaitable):
        task = _BaseWrapper.wrap(self, awaitable)
        self._awaitables.append(task)
        return task
```
finalizes as:
```
0.758 [finalize:88] self = <__main__.CoroutineWrapper object at 0x7881ad16d690> value = async_fetch(0.2)
0.758 [step:206] self = <__main__.GatherWrapper object at 0x7881ad16d600>
0.758 [finalize:88] self = <__main__.GatherWrapper object at 0x7881ad16d600> value = ['async_fetch(0.1)', 'async_fetch(0.2)', 'async_read done!']
0.758 [main:285] ret = async_fetch(0.1)
```
because `_BaseWrapper.wrap(self, coroutine)` calls `CoroutineWrapper(self, coroutine)`:
https://github.com/nmlorg/nsync/blob/9c88731e19283789f01708afd2ea4445c4a69bf7/examples/async6.py#L153-L160
That initial manual step (`self._send(None)`) causes the two tasks to be scheduled before the actual top-level coroutine is added to `top_level`.
This initially also caused a problem in `sync_await`, where:
```python
def sync_await(coroutine):
    dummy_gather = GatherToken(coroutine)
    top_level = _BaseWrapper.wrap(None, dummy_gather)
    create_task.add = top_level.add
```
didn't work, because `GatherWrapper` wrapped `coroutine`, and `CoroutineWrapper` tried to immediately step it, so `create_task(async_fetch(.1))` was evaluated before `create_task.add` was assigned. I ended up using:
```python
def sync_await(coroutine):
    top_level = GatherWrapper(None, ())
    create_task.add = top_level.add
    create_task(coroutine)
```
which ensures `coroutine` is not wrapped (and so not initially stepped) before the `create_task` machinery is hooked into `top_level`, but I don't see an obvious analog for the misordering problem :/
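The hazard is easy to reproduce outside nsync with a throwaway toy (the names below are stand-ins, not the real classes): if "wrapping" steps the coroutine immediately, any `create_task()` call in the coroutine's first segment runs before `create_task.add` exists.

```python
# Toy reproduction of the ordering bug (stand-in names, not nsync's classes):
# if "wrapping" steps the coroutine immediately, any create_task() calls in
# its first segment run before create_task.add has been assigned.

order = []

def create_task(awaitable):
    order.append('add hooked: %s' % hasattr(create_task, 'add'))

async def main():
    create_task('token')
    return 'done'

def broken_sync_await(coroutine):
    # Analogous to _BaseWrapper.wrap() having the wrapper's __init__ step
    # the coroutine right away:
    try:
        coroutine.send(None)
    except StopIteration:
        pass
    create_task.add = lambda awaitable: None  # hooked up too late

broken_sync_await(main())
# order is now ['add hooked: False'] -- create_task ran before the hook.
```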
For what it's worth:
```python
async def async_read():
    create_task(async_fetch(.1))
    create_task(async_fetch(.2))
    await SleepToken(.05)
    create_task(async_fetch(.3))
    return 'async_read done!'
```
finalizes as `['async_fetch(0.1)', 'async_fetch(0.2)', 'async_read done!', 'async_fetch(0.3)']`, so it's not just a matter of changing `return top_level.value[0]` to `return top_level.value[-1]` or anything.
I suspect it's going to end up being that we need to not step in `CoroutineWrapper.__init__`, and instead leave that initial step to the first pass of the `process_awaitables` loop. It's just... right now it feels like we're doing something special (stepping with a dummy value rather than the finalized value of an awaitable) in a special place (wrapper initialization), but moving that into the loop would be doing a one-time special case in a repeated area, which... feels wrong :slightly_frowning_face:.
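A hedged sketch of that idea, with my own stand-in names rather than nsync's actual classes: the wrapper records that it hasn't started, and the scheduler loop performs the one-time dummy step on its first pass.

```python
# Sketch (stand-in names, not nsync's code) of deferring the initial step out
# of the wrapper's __init__ and into the first pass of the scheduler loop.

class LazyCoroutineWrapper:
    def __init__(self, coro):
        self.coro = coro
        self.started = False   # __init__ no longer steps the coroutine
        self.result = None

    def step(self, value):
        try:
            self.coro.send(value)
        except StopIteration as exc:
            self.result = exc.value

def process_awaitables(wrappers):
    for wrapper in wrappers:
        if not wrapper.started:
            wrapper.started = True
            wrapper.step(None)   # the one-time "dummy" step, now in the loop

async def main():
    return 'done'

wrapper = LazyCoroutineWrapper(main())
assert wrapper.result is None    # nothing was stepped at construction time
process_awaitables([wrapper])
```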
(For what it's worth, I haven't come up with any way to make `create_task` work without some kind of global state, which gets undone/redone every time the executor is instantiated (so running multiple executors in different threads of the same interpreter would step on each other). This really defies a part of Python I really liked: `self` was an explicit argument to method calls; there was no magic `this` in the global namespace that meant something different in different places.
Maybe that dummy step should actually have been used to pass the executor into the running coroutine, i.e. `coro.step(None)` should have been `coro.step(loop)`, where that value was made available to the async function somehow, and all the functions like `asyncio.create_task` would be methods on whatever was passed in. So:
```python
async def mycoro(a, b):
    asyncio.create_task(othercoro(a, b))

mycoro(1, 2).send(None)
```
would have been:
```python
async def mycoro(loop, a, b):
    loop.create_task(othercoro(a, b))

mycoro(1, 2).send(asyncio)
```
or maybe:
```python
async def mycoro(a, b, loop=None):
    loop.create_task(othercoro(a, b))

mycoro(1, 2).send(asyncio)
```
or even (since it is using `coro.send`):
```python
async def mycoro(a, b):
    loop = await None
    loop.create_task(othercoro(a, b))

mycoro(1, 2).send(asyncio)
```
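That last variant isn't legal Python as written (`await None` raises `TypeError`, and a coroutine's first `send` must be `None`), but the idea is expressible today with a custom awaitable whose result is delivered by a later `send`. A sketch, with `_GetLoop` and `FakeLoop` being my own stand-ins:

```python
# Hedged sketch: deliver the "loop" as the result of the coroutine's first
# await, via a custom awaitable. _GetLoop and FakeLoop are stand-in names.

class _GetLoop:
    def __await__(self):
        loop = yield          # the value passed to the next send() lands here
        return loop

tasks = []

class FakeLoop:
    def create_task(self, awaitable):
        tasks.append(awaitable)

async def mycoro(a, b):
    loop = await _GetLoop()
    loop.create_task((a, b))  # a tuple stands in for othercoro(a, b)
    return 'ok'

coro = mycoro(1, 2)
coro.send(None)               # run to the first suspension point
try:
    coro.send(FakeLoop())     # the await's result is the loop we send in
except StopIteration as exc:
    result = exc.value
```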
Anyway...)
I think the last major pieces are background tasks and task groups, both of which can be thought of as adding an awaitable to a gather while it's running/after it's been constructed — with `token1` and `token2` being added to the gather while `main()` is executing.

In async1–6, the "top level" run by the executor (`sync_await`) is a coroutine instance (in async5–6 it's wrapped in `CoroutineWrapper`):

https://github.com/nmlorg/nsync/blob/9c88731e19283789f01708afd2ea4445c4a69bf7/examples/async6.py#L248-L257

I think the executor should instead wrap the raw coroutine instance in a `GatherToken`, then wrap that in a `GatherWrapper`, and "run" that.
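A runnable toy of that shape (the class bodies here are stand-in stubs I'm assuming, not nsync's actual implementations): `sync_await` wraps the coroutine in a `GatherToken`, wraps that in a `GatherWrapper`, and runs the wrapper, while a global `create_task()` appends new awaitables to it.

```python
# Toy of the proposed top level (stand-in stubs, not nsync's actual classes).

class GatherToken:
    def __init__(self, *awaitables):
        self.awaitables = list(awaitables)

class GatherWrapper:
    def __init__(self, parent, token):
        self._parent = parent
        self._awaitables = list(token.awaitables)
        self.value = []

    def add(self, awaitable):
        self._awaitables.append(awaitable)

    def run(self):
        # Toy loop: step each coroutine to completion; create_task() during
        # a step appends to self._awaitables, so new tasks get picked up.
        while self._awaitables:
            coro = self._awaitables.pop(0)
            try:
                coro.send(None)
            except StopIteration as exc:
                self.value.append(exc.value)

def create_task(awaitable):
    create_task.add(awaitable)

def sync_await(coroutine):
    top_level = GatherWrapper(None, GatherToken(coroutine))
    create_task.add = top_level.add
    top_level.run()
    return top_level.value

async def background():
    return 'background'

async def main():
    create_task(background())
    return 'main'

result = sync_await(main())
```

The toy loop only handles coroutines that finish in a single step, but it shows the shape: the top level is a gather, and background tasks are just late additions to it.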
That top-level `GatherWrapper` would be made available globally somehow, such that a global `create_task(awaitable)` would just append `awaitable` to `top_level.awaitables`.

Since you can (and often should!) await background tasks, this would mean awaitables might legitimately start showing up in multiple places during the second `await`.

So we either need to allow `sock1.recv()` to be called twice, with `process_awaitables` finalizing multiple awaitables for a single read operation, or have `_BaseWrapper.wrap(awaitable)` track and reuse wrappers and `_BaseWrapper.finalize` signal multiple parents.

In the first case, `process_awaitables` will have to figure out that two entries in `waiting_for.readers` point to the same socket, then call `reader.finalize(data)` for both of them; you'd need to dedupe the sockets before reading and fan the received data out to every waiting reader.
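A hedged sketch of that first approach (`process_readers` and `FakeReader` are my names; nsync's actual loop differs): dedupe sockets before `select.select`, do one `recv` per ready socket, and finalize every reader waiting on it.

```python
# Sketch of the first approach: dedupe sockets around select.select(), then
# fan the received data out to every reader waiting on the same socket.

import select
import socket

def process_readers(readers):
    by_sock = {}
    for reader in readers:
        by_sock.setdefault(reader.sock, []).append(reader)
    ready, _, _ = select.select(list(by_sock), [], [])
    for sock in ready:
        data = sock.recv(65536)        # one recv per ready socket...
        for reader in by_sock[sock]:
            reader.finalize(data)      # ...finalizing every waiting reader

class FakeReader:
    def __init__(self, sock):
        self.sock = sock
        self.data = None

    def finalize(self, data):
        self.data = data

left, right = socket.socketpair()
right.sendall(b'hello')
reader1, reader2 = FakeReader(left), FakeReader(left)  # same socket, twice
process_readers([reader1, reader2])
left.close()
right.close()
```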
― or ―
in which case all of the magic would be in `_BaseWrapper.wrap`, presumably crawling through `top_level` recursively looking for something that already points to `awaitable` (meaning, among other things, all of the `…Wrapper` subclasses need to actually hold on to references to their tokens!) and just adding the new parent to its `_parents` (a list, versus the current `_parent`). I think everything else would stay the same, but `finalize` would need to call `step` on each parent, which might complicate things (we'd need to make sure a step doesn't cause an awaitable to finalize before it gets stepped a second time; I'm not sure if that's actually possible yet...).

The second approach (`_BaseWrapper.wrap` deduping, with wrappers having multiple parents) feels more correct than the first (deduping sockets/calls to `recv` around `select.select`), but might end up being unmaintainably complicated.

(Note that `await read1_task1` could be made to work like `await read1_token` by having `_BaseWrapper.wrap` just explicitly clone a wrapper if it's given one.)
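A minimal sketch of that second approach, with my own names rather than nsync's actual classes: `wrap()` reuses an existing wrapper for the same token, wrappers keep a `_parents` list, and `finalize()` steps every parent.

```python
# Sketch of the second approach (stand-in names): wrap() dedupes by token,
# wrappers hold multiple parents, and finalize() signals all of them.

class Wrapper:
    def __init__(self, token):
        self.token = token     # hold a reference so wrap() can find it again
        self._parents = []     # a list, versus a single _parent
        self.value = None

    def finalize(self, value):
        self.value = value
        for parent in self._parents:
            parent.step(self, value)   # step every parent, not just one

_wrappers = []

def wrap(parent, token):
    # Crawl existing wrappers looking for one that already points at token:
    for wrapper in _wrappers:
        if wrapper.token is token:
            wrapper._parents.append(parent)
            return wrapper
    wrapper = Wrapper(token)
    wrapper._parents.append(parent)
    _wrappers.append(wrapper)
    return wrapper

class Parent:
    def __init__(self):
        self.stepped = []

    def step(self, child, value):
        self.stepped.append(value)

token = object()
parent1, parent2 = Parent(), Parent()
wrapper1 = wrap(parent1, token)
wrapper2 = wrap(parent2, token)   # reuses wrapper1 rather than re-wrapping
wrapper1.finalize('data')
```

This is the part that might get hairy: a real `wrap()` would have to crawl `top_level` recursively instead of a flat registry, and a real `finalize()` has to be careful about re-entrancy while stepping multiple parents.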