Zac-HD opened 2 months ago
I'm not sure how I feel about adding a keyword argument to `trio.Lock`. Does `trio.Semaphore` with a limit of 1 really have differences other than this? I like that `trio.Lock` is more easily debuggable and harder to shoot yourself in the foot with. (And if we make `acquire` error when the owner is dead, even better!)
Perhaps we should raise an error or at least emit a warning when a task which owns a lock finishes?
I think this could error when you run `lock.acquire_nowait()`, i.e. `acquire_nowait` could check whether `self._owner` is still running. This doesn't handle the following, though:

```python
lock.acquire
lock.acquire
```
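A rough sketch of that owner-liveness check, using toy stand-ins rather than trio's real `Task`/`Lock` internals (the `running` flag and the error message are hypothetical):

```python
class WouldBlock(Exception):
    """Stand-in for trio.WouldBlock: lock is held, caller should park."""

class Task:
    """Toy task; `running` models "has not exited yet"."""
    def __init__(self, name):
        self.name = name
        self.running = True

class Lock:
    """Toy lock whose acquire_nowait refuses to wait on a dead owner."""
    def __init__(self):
        self._owner = None

    def acquire_nowait(self, task):
        if self._owner is None:
            self._owner = task  # uncontended: take ownership
        elif not self._owner.running:
            # Owner exited without releasing, so the lock can never be
            # released: fail loudly instead of deadlocking.
            raise RuntimeError(
                f"lock owner {self._owner.name!r} exited without releasing"
            )
        else:
            raise WouldBlock  # owner alive: normal contention
```

As noted above, this only catches the problem at acquire time; a task already parked when the owner exits never calls `acquire_nowait` again.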
Let's set `Semaphore` aside then; I'm more concerned that you can easily get a deadlock from `Lock`, e.g.
```python
lock = trio.Lock()
async with trio.open_nursery() as nursery:
    nursery.start_soon(lock.acquire)
    nursery.start_soon(lock.acquire)
```
IMO instead of hanging forever, we should raise `BrokenResourceError` in the task(s) waiting on a lock which can never be released, and perhaps also when exiting the task which holds that lock.
I agree; `BrokenResourceError` (or `RuntimeError` if we want) is better than a deadlock. Though I think that if a lock gets acquired and never released, but nobody else wants it, it's OK. (Though it's definitely a sign of a bug? Should that warn?)
I think so long as we get a `BrokenResourceError` instead of hanging, and can identify the offending lock + task if that happens, the other decisions are less crucial - e.g. I'd prefer to warn-or-error on task exit, but only if that's not going to cost us performance in the happy case.
Erroring on

```python
lock = trio.Lock()
async with trio.open_nursery() as nursery:
    nursery.start_soon(lock.acquire)
    await lock.acquire()
```

is easy & straightforward: add a check in `acquire_nowait` for `self._owner not in self._owner._parent_nursery._children`.
But the other cases are definitely more complicated, since the other task(s) might start waiting before the owning task exits. I think the way to go is a check in `Runner.task_exited` for whether the exiting task is holding any lock, which then errors the other holders (an `error_other_holders` step). I don't think this requires a big performance hit, but we'd likely need to add some overhead whenever acquiring/releasing a lock, which could perhaps be partially traded off with more logic in `task_exited`.
I think checking whether the task is still running is better than a parent-nursery check. But yeah, I'm not sure how to handle the other cases. Should there be a way to "poison" a parking lot?
Another way of implementing it (inspired by @mikenerone) is having an `Event` that fires when a `Task` exits, and that event would wake up any other tasks waiting inside `acquire`. This initially sounded conceptually simple, but when I started thinking about how to actually implement it (`acquire` waiting on either that `Event` or its `ParkingLot.park()`, and the lock's event changing each time the lock owner changes), I'm not sure how clean it would be.
Here's my latest idea: we can have a global dictionary of task -> a list (not a set, because we need to allow multiple of the same element) of parking lots, with an interface like `trio.lowlevel.add_parking_lot_breaker(task, parking_lot)` and `trio.lowlevel.remove_parking_lot_breaker(task, parking_lot)`.

Then the lock uses that to have its parking lot broken if the task gets killed without releasing the lock. We can document this for use by anything that expects a task to release something but might not.
> Then lock uses that to have its parking lot broken if the task gets killed without releasing the lock

The lock initiates it?

Trying to flesh this out:

1. `_LockImpl.acquire_nowait` calls `trio.lowlevel.add_parking_lot_breaker(self._owner, self._lot)`
2. `_LockImpl.release` calls `trio.lowlevel.remove_parking_lot_breaker(self._owner, self._lot)`
3. `Runner.task_exited` calls `trio.lowlevel.break_parking_lots(task)`
   1. This calls `ParkingLot.break()` on all associated lots
   2. `ParkingLot.break` calls `Runner.reschedule(..., next_send=Outcome.Error(BrokenResourceError(...)))` for each parked entity (if that's how you inject errors into other `Task`s).

> The lock initiates it?

Yeah, if I understand what you're asking.

> 3. `Runner.task_exited` calls `trio.lowlevel.break_parking_lots(task)`

We don't have to expose `trio.lowlevel.break_parking_lots`; it's fine if the runner just uses the global dict.
`ParkingLot.break` also:

1) needs a different name, because `break` is a keyword
2) should make any future `ParkingLot.park` calls error.

The error should probably contain a reference to the task that caused things to break. Maybe a subclass of `BrokenResourceError`.
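A toy sketch of those semantics; the `break_lot` name, the `LotBrokenError` subclass, and the `broken_by` attribute are all illustrative stand-ins, not trio's actual API:

```python
class BrokenResourceError(Exception):
    """Stand-in for trio.BrokenResourceError."""

class LotBrokenError(BrokenResourceError):
    """Hypothetical subclass carrying the task that broke the lot."""
    def __init__(self, task):
        super().__init__(f"parking lot broken by {task!r}")
        self.task = task

class ParkingLot:
    def __init__(self):
        self.broken_by = None  # task that broke this lot, if any
        self._parked = []

    def park(self, task):
        # Once broken, any future park() call errors immediately.
        if self.broken_by is not None:
            raise LotBrokenError(self.broken_by)
        self._parked.append(task)

    def break_lot(self, task):
        # Not named `break` because that's a Python keyword.
        self.broken_by = task
        woken, self._parked = self._parked, []
        # trio's runner would reschedule each of these with an error
        # outcome; here we just hand them back to the caller.
        return woken
```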
How do we handle the planned future improvement where multiple guest runs can run at once?

The `Task`s shouldn't compare equal, though if they do then ouch, that's something we should fix (given there's no way to start a task on two loops). So it's fine to use the task as a key.
> we can have a global dictionary of task -> a list (not a set because need to allow multiple of the same element) of parking lots

Why is it necessary to allow multiple of the same element? It's not required for lock, so I'm guessing you're thinking of some external scenario?
Opened #3081 with an initial implementation. It needs some polish but the overall approach looks solid
> we can have a global dictionary of task -> a list (not a set because need to allow multiple of the same element) of parking lots
>
> why is it necessary to allow multiple of the same element? It's not required for lock, so I'm guessing you're thinking of some external scenario?

Yeah, I was thinking of a semaphore (because it's still bad form to forget to release) and re-entering within the same task.
This made me look more into the other sync primitives, and `CapacityLimiter` might be a pretty good fit for an optional parameter that enables the same functionality (optional because of the `_on_behalf_of` methods). `Semaphore` would be a worse fit, but I suppose it could also have it. (At that point we should probably move the logic from `_LockImpl` to `AsyncContextManagerMixin`, or an intermediate class between those.)
Some time ago - never mind how long precisely - having a little coordination between tasks and nothing particular to interest me in semaphores, I saw our `RuntimeError("can't release a Lock you don't own")`.

I approve pretty strongly of keeping things local by default; it prevents all sorts of bugs, and `trio.Semaphore` is right there if you need something without this limitation. But two further observations prompted this issue: `trio.Semaphore()` is considerably more powerful and flexible in general. By the principle of least power, it might be nice to have `trio.Lock(*, allow_outside_release: bool = False)` or something along those lines.
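A sketch of what that opt-out might look like; the keyword name comes from the suggestion above, while the synchronous, single-owner bookkeeping (with the task passed in explicitly) is a simplification for illustration:

```python
class Lock:
    """Toy lock with the proposed allow_outside_release escape hatch."""
    def __init__(self, *, allow_outside_release: bool = False):
        self._allow_outside_release = allow_outside_release
        self._owner = None

    def acquire(self, task):
        # Toy model: no waiting/parking implemented.
        assert self._owner is None, "toy model supports one holder only"
        self._owner = task

    def release(self, task):
        # Default behaviour matches trio.Lock: only the owner may
        # release.  The flag opts into semaphore-like looseness.
        if task is not self._owner and not self._allow_outside_release:
            raise RuntimeError("can't release a Lock you don't own")
        self._owner = None
```

Keeping the strict behaviour as the default preserves the principle-of-least-power argument: you only get the foot-gun if you ask for it.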