dabeaz / curio

Good Curio!
Other
4.01k stars 240 forks source link

Tasks cannot be cancelled twice #329

Closed agronholm closed 3 years ago

agronholm commented 3 years ago

I noticed while working on AnyIO that certain tests were hanging with the Curio back-end. I started digging deeper and found that once a task has been cancelled, its cancelled flag is not cleared even when the cancellation is delivered – unlike on asyncio. This causes problems when implementing cancel scopes for the Curio back-end since a task can only receive one cancellation before hanging. I could also not find any other way to interrupt a task from another task.

Here's a quick script to demonstrate the issue:

import curio

async def main():
    async def cancel():
        await curio.sleep(0.1)
        await task.cancel(blocking=False)

    event = curio.Event()
    task = await curio.current_task()
    for _ in range(2):
        await curio.spawn(cancel, daemon=True)
        print('waiting on the event')
        try:
            await event.wait()
        except curio.TaskCancelled:
            print('task cancelled')

curio.run(main)

As a workaround, I could simply clear the flag after receiving a cancellation. I just wanted to know if this behavior was intentional or not.

dabeaz commented 3 years ago

When a task is cancelled, it is given a single TaskCancelled exception and that's it, regardless of how many times the cancel() method might be called afterwards. Tasks are not supposed to ignore cancellation and keep on running. This behavior is documented as part of the Task interface. https://curio.readthedocs.io/en/latest/reference.html#tasks

agronholm commented 3 years ago

Okay. Can you suggest an alternate method for interrupting a task from other task task? Some cleaner way to implement cancel scopes?

dabeaz commented 3 years ago

Why wouldn't you just cancel it? If a task wants to cancel another task (and it has a reference to the task), you just call await task.cancel() on it.

I don't know what you're trying to do with the above code, but it's ignoring the cancellation request and going back to blocking on an event. Cancellation is supposed to lead to actual termination of the task. If you block on something that never happens, then of course it's going to deadlock.

agronholm commented 3 years ago

With all due respect, I feel that is a rather narrow way of thinking about cancellation. The issue came up during the implementation of the Happy Eyeballs algorithm (which asyncio and trio have implemented but curio has not). This involves spawning several tasks that attempt the connection concurrently (with a configurable stagger delay). The first task will then cancel the rest of the tasks, including the host task which will be either waiting for the first task to complete or it may still be in the process of spawning new tasks.

This is just an example, and the deeper issue is that this is the way task groups work, and they were modeled after trio's nurseries. It is much too late in the game to even think about changing the fundamental design because one of the back-ends turns out not to support this. Yes, I should have read the fine print but what's done is done. I really want to keep curio on board – otherwise the project is not worthy of being called "AnyIO". Thus, I wonder: you may not have intended tasks to be used this way, but can you think of any adverse effects of clearing the cancelled flag on the task once an AnyIO cancel scope receives the cancellation?

Perhaps @njsmith can also chime in on this as the original designer of this pattern?

dabeaz commented 3 years ago

A Curio implementation of happy eyeballs is in the examples directory (`examples/happy') and has been there for quite some time now.

The behavior of cancellation in Curio is that task.cancel() delivers a TaskCancelled exception to a task exactly once. That is all it does--there are no other side effects, no modal behavior, or alteration of how other parts of Curio work. That is by design because the behavior can be reasoned about and described in one short sentence. Should a task decide to catch TaskCancelled and go on to block on some other operation, then it's going to block. Don't do that if it matters.

As for clearing the cancelled flag, it's probably harmless, but you should audit the Curio source to make sure.

imrn commented 3 years ago

That's why it is better for a framework not to make unnecessary assumptions. A task may receive any exception and may decide to do anything. Cancellation is no exception. If you try to regulate it, you'll bring complexity into the framework. That's trio way.

I can understand that you are trying to have some determinism for cancellation. BUT this idealism conficts with reality.

Is ignoring an exception is a malpractice? Most probably it is. But, this is not absolute.

So the moral of the story is that framework should not interfere to that.

Can any framework do anything about infinite loops, bad design causing deadlocks, or other nonsense?

Of course, you can not. And, you should not.

agronholm commented 3 years ago

@imrn I'm not sure which one of us your message was addressed to, but I didn't get it either way.

imrn commented 3 years ago

Some history: #149

After this issue and some experimentation, I've became opponent of any interference to cancellation or any exception by the framework. Framework should know nothing about them. Framework should only know whether the task is terminated or not. That's it. That also means cancellation (specific form of exception) is not needed. Only sending exceptions to tasks by anyone is enough.

agronholm commented 3 years ago

So...are you suggesting somebody here changes their design...? How? Or what was that about then?

imrn commented 3 years ago

Above comments are PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

On 7/22/20, Alex Grönholm notifications@github.com wrote:

So...are you suggesting somebody here changes their design...? How? Or what was that about then?

agronholm commented 3 years ago

Ok, I'm going to stop commenting because you're not making any sense.

dabeaz commented 3 years ago

Independent of the above comments, there is probably some friction due to the overall focus of Curio. I see it as something extremely minimal that can be used as the basis for implementing async programming. Yes, you could use it directly, but you could also use it to build something more advanced if you were so inclined. To support the latter, it's important (to me at least) that Curio be as free of side-effects as possible. So, for most things, given the choice between "doing something" and "doing nothing", I'd rather Curio "do nothing", but make it possible to "do something" if you want (although it might require some coding on your part).

imrn commented 3 years ago

@dabeaz: I know, this is not a well formed request here, but in the long run you perhaps may consider a generic exception sending mechanism to tasks which may replace cancellation.

As we've experienced in #149 how things get messy trying to handle everything, Now, I'd doubt it is useful. And in the end such attempts still do not cover everyday complexities and it still does not save us from implementing custom solutions. So even I had created the issue I'm now asking to myself: Why have we done it?

Anyway, I think current form of cancellation may be simplified and perhaps may be replaced with a generic exception sending mechanism. Curio may become neutral to the exceptions sent to the tasks so that, it is up to the tasks what to do. This way people may create tasks which terminate after receiving 10 TaskCancelled exceptions. ;)

agronholm commented 3 years ago

So what would happen if task A wanted to send an exception to task B but task C had already queued another exception to be sent to B?

imrn commented 3 years ago

So what would happen if task A wanted to send an exception to task B but task C had already queued another exception to be sent to B?

Exception queues are bad design and they should be consumed fresh.

149 {defer, allow}_cancellation experiment has important lessons.

If there is an absolute need for this you may develop your custom design.

I understand that you are trying to make curio handle that messy thing as a feature. I guess alternative frameworks may cover it. That's why curio should not.

agronholm commented 3 years ago

How would you force those exceptions to be "consumed fresh" then? The only way I can think of is forcing the sending task to block until the exception has been delivered to the target task, and I sure wouldn't want that.

dabeaz commented 3 years ago

The cancel() method already allows a custom exception to be supplied--so, in principle you could use it to deliver any exception that you want to a task.

On the whole, I must admit that I've grown lukewarm to most advanced ideas involving cancellation. There are a lot of things that can quickly go sideways with it (and asynchronous exceptions in general). I'd present Unix signal handling as an example.

Getting back to the original example, I wonder if doing something like this would cause the effect of propagating a cancellation to the next blocking call (as an alternative of messing around with internal cancelled flag).

try:
     ... whatever ...
except TaskCancelled as exc:
     await set_cancellation(exc)
...
await some_blocking_op()     # Immediately cancelled due to it being reset above
agronholm commented 3 years ago

This is not what I want, however. I already have in place a system where a would-be blocking call triggers any pending cancellation. What I want is the ability to undo a cancellation, so that such calls can take place after a cancel scope has been exited, and that those calls can again be cancelled. This is possible with both asyncio and trio. And since we won't know where and when the cancellation will be delivered to the task, I've opted to clear the cancelled flag right after calling cancel() for now. We'll see how that goes.

dabeaz commented 3 years ago

Frankly, it still feels weird to me that you'd structure a task in a way where it ignores a cancellation request and then ultimately goes on running in a way as to require a second cancellation. Task.cancel() means "cancel", not "maybe cancel." If cancelling a task somehow required additional, potentially cancellable, work to be performed afterwards, I'd probably be inclined to structure that additional work as a separate task as opposed to running it as a continuation of the first cancelled task.

agronholm commented 3 years ago

I don't see any other way to implement trio-like structured concurrency. If a task in a group raises an exception, the host task needs to be made to drop whatever it's doing, cancel all the other tasks in the group and wait until they've all finished and then raise an exception depending on what exceptions the tasks in the group raised while they were terminating. If none of the tasks raised any exceptions and were successfully cancelled, the task group's async context manager simply exits and the task continues from there.

oremanj commented 3 years ago

@dabeaz - I don't think @agronholm particularly cares about Task.cancel(), specifically, being able to happen multiple times. It's just that Task.cancel() is the only API that curio exposes for cancelling anything, and he wants to be able to implement cancellation for units of work smaller than a task (e.g. as in trio's cancel scopes).