Open zanieb opened 1 year ago
So don't do that? I fail to see any good design where creating a context manager and then not using it would make sense.
Often, we'll create a context manager ahead of time (during a configuration step, for example) then downstream enter the context. In my application, we are using fail_after
to enforce configurable timeouts. We set up the context but do not want to enter it until right before we call the (library user's) code that needs a timeout. At other times, when there is not a timeout, we'll use a null context to simplify downstream logic.
If we weren't ever using the context manager, this wouldn't be an issue. The issue here is that the default behavior is antithetical to typical context manager behavior. It's like having the following return the amount of time since initialization rather than within the context:
timer = timer()
with timer:
...
It's also not an atypical design to use things like an AsyncExitStack
— in this case developers may create the contexts beforehand then enter them in a loop.
Trio would have to change their semantics from what has been documented, and so far there haven't been any practical use cases that would warrant this.
@agronholm can you link to the documentation that covers this behavior? I did not see any explicit mention of this.
Here: https://trio.readthedocs.io/en/stable/reference-core.html#trio.fail_after
Creates a cancel scope with the given timeout, and raises an error if it is actually cancelled.
Implies that the cancel scope gets created with a set deadline. Says nothing about the deadline getting set when entering the scope.
Ah I think that can be interpreted either way based on your assumptions.. the example shows with fail_after() as cancel_scope
so it's pretty ambiguous when that scope is created — it could just as well be created on __enter__
and yielded. In fact, you cannot get the CancelScope
object itself until after entering the context (in anyio).
As there are 0 examples of separating the instantiation of this context manager from its use in a with
statement in the documentation of either trio or anyio, I think it would be safe to assume that it was not a supported use case.
If it was not a supported use-case then what is the objection to changing the behavior to be clearer?
To be clear: I can work around this and will need to do so for compatibility with older versions of anyio/trio no matter what is decided here. I will likely add my own utility with the behavior I need — I am only advocating for an API that is less likely to be misleading here.
If it was not a supported use-case then what is the objection to changing the behavior to be clearer?
If anybody anywhere is already using the context manager in such a disjointed manner correctly, their code will be broken. I wouldn't expect anybody to be doing that, but then again they wouldn't report that to us unless something wasn't working. I also didn't expect anybody to be using the cm like this at all, so who knows?
I think a change in semantics should be considered but only if someone brings forth a plausible use case where doing this is necessary or at least beneficial.
Yeah unfortunately timeouts are kind of fuzzy in the first place — we had it like this for a long time before we got a bug report from a user. From a cursory search on GitHub, I don't see any uses apart from ours where the initialization and entry are separated. I'll open a pull request to document this caveat. We can see if more people chime in on this issue, I see no reason to rush an API change.
I don't think there's any problem necessarily with calculating the deadline at the last moment, though I can imagine someone being confused the other way as well. ("I created the fail_after
object early because I wanted that to be when the deadline was relative to.")
BTW, re:
In my application, we are using fail_after to enforce configurable timeouts. We set up the context but do not want to enter it until right before we call the (library user's) code that needs a timeout. At other times, when there is not a timeout, we'll use a null context to simplify downstream logic.
Trio accepts math.inf
as a timeout everywhere, meaning "never timeout". So you could pass the bare timeout value around everywhere (either the configured timeout, or math.inf
if there's no timeout), and then unconditionally call fail_after
around the operation that has the timeout. Might be easier to understand, independently of whatever happens with this discussion?
I'd vote +0.25 for changing the behavior (but +1 for documenting things more stringently).
("I created the fail_after object early because I wanted that to be when the deadline was relative to."
You can easily calculate an absolute timeout at that time instead.
The way things currently are, you can't achieve the behavior @madkinsz wants as easily, thus this'd be a net win for that kind of flexibility in case anybody ever wants it.
Thanks for the additional responses!
Trio accepts math.inf as a timeout everywhere, meaning "never timeout". So you could pass the bare timeout value around everywhere (either the configured timeout, or math.inf if there's no timeout), and then unconditionally call fail_after around the operation that has the timeout. Might be easier to understand, independently of whatever happens with this discussion?
Definitely! I plan on doing this in the short term.
For more context: I'm also working on a refactor of how we're orchestrating user code to reduce the duplication of some complex portions of code and something I'm exploring is passing a list of contexts to a function that uses an exit stack to enter them all. This separation of configuration of the contexts that should be active during execution of the user code and management of actual execution of the code greatly simplifies our runtime logic. Of course, I can make this accept Callable[[], ContextManager]
too to get past this particular issue, but then for more complex context managers I'll be passing around a bunch of partial
objects 🤷♀️ tradeoffs.
I think it would make sense for trio.fail_after(x)
to be relative to entering the context manager; trio.fail_at(trio.current_time() + x)
seems a more natural way to express the relative-to-construction-time semantics. (and the same for trio.move_on_{at,after}
)
Either way, we should document the chosen semantics.
Concretely thinking about possible ways of going about implementing a change in behaviour (sorry if somewhat rambly/stream-of-thought, I think I personally prefer the new class that transforms into a cancelscope)
One option would be introducing an attribute, relative
, to CancelScope
- such that when CancelScope.__enter__
is called it does
if self.relative: self._deadline = trio.current_time() + self._deadline
CancelScope
s (before entering) and doing logic based on the value of deadline
, if they are passed CancelScope
s that have the relative
flag setdeadline
after entering the CancelScope
[depending on the value of relative]. Or if the value of relative
is modified.Could also add an attribute relative_deadline
... but that brings up problems with what should happen if you access the deadline
attribute.
CancelScope
We can introduce a new class (PendingTimeout
, PendingRelativeCancelScope
, RelativeCancelScope
...?) which when __enter__
ed returns a CancelScope
. It does need to be exposed for the sake of typing, and you probably want to be able to modify the timeout after creation, before entering. It probably makes sense to also expose shield
.
There's still some danger of deadline confusion if writing:
my_timeout = fail_after(5)
with my_timeout as timeout:
# oh no, we want 7 seconds, not 5
my_timeout.deadline = 7
# ...except that's not how cancelscopes work
although I guess that has always been the case.
If trying to access deadline
on this object you will get an error (since it's not defined as an attribute at all), so some of the code relying on current behavior will get caught this way, and still others will get caught by typing errors.
example usage
my_timeout: PendingTimeout = fail_after(5)
my_timeout.shield = True
my_timeout.seconds = 7
my_timeout.fail = False # this is now a move_on_after instead
with pytest.raises(
AttributeError,
match="'PendingTimeout' object has no attribute 'deadline'"
):
my_timeout.deadline += 2
cs: CancelScope
with my_timeout as cs:
with pytest.raises(
AttributeError,
match="'CancelScope' object has no attribute 'seconds'"
):
cs.seconds += 2
# this
with trio.PendingTimeout(5, shield=True, fail = False):
...
# would be equivalent to
with trio.move_on_after(5) as cs:
cs.shield = True
...
A third option would be introducing a subclass of CancelScope
, RelativeCancelScope
, which has an attribute seconds
(or timeout
? The parameter to fail_
& move_on_after
is currently called seconds
). Trying to get/set deadline
on the object before entering will raise an error (which means code relying on current functionality will get a helpful error mesage), but seconds
can be modified freely. Once entered though, if modifying seconds
the change will flow down to the underlying CancelScope
, and if modifying deadline
the seconds
parameter will also be updated.
Code using current functionality will not get any typing errors, since RelativeCancelScope
is a subclass of CancelScope
.
example usage
my_timeout: RelativeCancelScope = fail_after(9)
with pytest.raises(
AttributeError, # or some other error
match='cannot set deadline on RelativeCancelScope before entering.'
):
my_timeout.deadline += 3
my_timeout.seconds = 5
with my_timeout: # __enter__ returns Self, so don't even need to do `as`
# assume that trio.current_time() was 100 upon entering
assert my_timeout.deadline == 105
my_timeout.seconds = 20
assert my_timeout.deadline == 120
my_timeout.deadline = 150
assert my_timeout.seconds = 50
my_timeout.deadline = 97 # nothing illegal about setting deadlines in the past on cancelscopes
assert my_timeout.seconds = -3
I'm not sure how much this actually differs from adding a new attribute tbh.
could also go all the way to introducing a fully new RelativeCancelScope
that does not inherit from CancelScope
. But that seems overkill.
I think a new subclass would work well. On the seconds
side of things, in the past I've wished it were measured in smaller time increments, but I think it's probably fine to leave as seconds. The division operation to make things into seconds isn't really going to be that bad.
I favor adding a ._relative_deadline
attribute to the existing CancelScope
class.
fail_after()
then creates a CancelScope(deadline=inf)
and sets the _relative_deadline
CancelScope.__enter__
then sets deadline = min(deadline, _relative_deadline + current_time())
Modifying the deadline or relative deadline before entering seems fine, though I'm still using a leading underscore to discourage the latter. We could make the relative deadline a property so that attempting to modify it after entering checks if self._has_been_entered
and raises an error.
After being entered, modifying either the absolute or relative deadline seems reasonable to me - it'd just implicitly fetch the current time then set the other to match.
Finishing up the PR, when I realized that the behaviour of relative_deadline
after entering is not obvious.
with CancelScope(relative_deadline = 5) as cs:
await trio.sleep(1)
print(cs.relative_deadline) # should this print 5, or 4?
cs.relative_deadline = 3 # does this mean "deadline in 3 seconds", or "3 seconds after entering"?
My first intuition (and implementation) was that it would print 5, and then "deadline in 3 seconds" - but that means that the value of cs.relative_deadline
becomes ~worthless:
with CancelScope(relative_deadline = 5) as cs:
await trio.sleep(random.randint(3))
cs.relative_deadline = 1 + random.randint(3)
await trio.sleep(random.randint(3))
print(cs.relative_deadline) # this value now has zero relation to when the scope will time out.
Possible alternatives:
relative_deadline
property after entering will return deadline - current_time()
. The answers to the first questions are "4 seconds" and "deadline in 3 seconds"._time_of_entering
to the cs, set to current_time()
upon entering. All relative times are now relative to entering the scope. The answer to the first questions are "5 seconds" and "3 seconds after entering".relative_deadline
after entering to remove a potential footgun. The 1st option doesn't actually add any functionality after all, cs.relative_deadline = x
is equivalent to cs.deadline = trio.current_time() + x
and cs.relative_deadline
is equivalent to cs.deadline - trio.current_time()
. I think the first one might be more intuitive and useful? But can see a case for any of the options. I'll push to #3010 soonish with an implementation.
This is a duplicate of https://github.com/agronholm/anyio/issues/514 — they are following the convention established here in trio.
When using a
fail_after
context, the deadline is set at "initialization" time rather than__enter__
. This behavior can result in unexpected bugs if the context is declared before it is entered. This is not particularly intuitive for a Python context manager — I'd expect the timer to start when the context is entered. I do not think changing it would be complex, but it could be considered breaking.