Handling exceptions in curio.run()

agronholm commented 6 years ago

I am trying to set up a test runner for curio, but I am at a loss regarding the handling of exceptions raised during curio.run(). When curio.run() passes through the exception from the coroutine function, it seems like it won't properly shut down.

Given the following code:

import curio

async def main():
    raise Exception('foo')

curio.run(main)

Running it will give me the following output:

$ python curiotest.py 
Task Crash: Task(id=2, name='main', state='TERMINATED')
Traceback (most recent call last):
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/kernel.py", line 736, in _run_coro
    trap = current._send(current.next_value)
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/task.py", line 167, in _task_runner
    return await coro
  File "curiotest.py", line 5, in main
    raise Exception('foo')
Exception: foo
Traceback (most recent call last):
  File "curiotest.py", line 8, in <module>
    curio.run(main)
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/kernel.py", line 818, in run
    return kernel.run(corofunc, *args)
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/kernel.py", line 179, in run
    raise ret_exc
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/kernel.py", line 736, in _run_coro
    trap = current._send(current.next_value)
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/task.py", line 167, in _task_runner
    return await coro
  File "curiotest.py", line 5, in main
    raise Exception('foo')
Exception: foo
Exception ignored in: <bound method Kernel.__del__ of <curio.kernel.Kernel object at 0x7f6bbfc1eb00>>
Traceback (most recent call last):
  File "/home/alex/.local/share/virtualenvs/hyperio-VaFdKoYY/lib/python3.5/site-packages/curio/kernel.py", line 122, in __del__
RuntimeError: Curio kernel not properly terminated.  Please use Kernel.run(shutdown=True)

What I expect instead is for curio to just pass through the exception without anything else.

I also tried this:

with curio.Kernel() as kernel:
    try:
        kernel.run(main)
    except BaseException:
        kernel.run(shutdown=True)

But this won't work either: AttributeError: 'NoneType' object has no attribute 'append'. So my question is: what is the proper way to shut down the kernel after an exception?

agronholm commented 6 years ago

Further experimentation showed that the following approach would work:

with curio.Kernel() as kernel:
    kernel.run(main, shutdown=True)

The only remaining issue is that it reports the task crash and there seems to be no way to disable that with kernel.run(). Also, why doesn't curio.run() use the shutdown=True flag?

dabeaz commented 6 years ago

This is indeed interesting. Looking at the source, it only calls shutdown if run() returns with normally or with a SystemExit/KeyboardInterrupt exception. The only motivation I can think of for this would be to leave the kernel in it's crashed state so that you could run the Python debugger afterwards.

Regarding the reporting of crashes... Curio never ignores exceptions in tasks (meaning that they're never simply discarded). However, in the process of working on Curio, I found that the actual handling of a crashed task could be delayed, possibly indefinitely, by buggy code elsewhere. For example, suppose a child task gets launched, it unexpectedly crashes, but some parent task gets deadlocked waiting for some event to happen in the child. In that case, you might get a situation where nothing appears to be happening. Moreover, you don't get any kind of message or any other indication that something is broken somewhere. To deal with this, all curio tasks log uncaught exceptions that cause a task to crash. It can be suppressed by either configuring the logging module or spawning tasks with the report_crash=False option.

agronholm commented 6 years ago

I know about report_crash=False but how do I use it for the root task which is spawned by kernel.run()?

dabeaz commented 6 years ago

Hmmm. Good point. Let me take a look at this. I'm not wedded to the way that Curio does things now, but I am generally concerned about the issue of "errors should not pass silently."

In working on earlier versions Curio, I have experienced many frustrating hours of debugging where something wasn't working, yet no error messages were being generated--typically due to the delayed reporting of an exception combined with some other issue such as blocking/deadlock. To deal with that, I have chosen an approach that intentionally makes Curio noisy unless you take some explicit steps to silence it (i.e., "I know that this thing I'm doing is going to crash and I want it ignored").

For awhile, this was controlled by a global setting, but I found that it too suffered from my original debugging problem. It was too easy to just turn the reporting off and for code to seemingly work fine. Then, something weird happens, everything is frozen and no errors are being printed.

I'd also emphasize that Curio is NOT like threads where a crash merely logs and is otherwise ignored. Crashes DO propagate back to the parent (via task.join(), Task groups, etc.). It's just that Curio is both logging the crash and propagating it.

agronholm commented 6 years ago

Yes – I was just wondering if it's necessary to log the crash of the task currently being run via kernel.run() since it will be propagated anyway?

dabeaz commented 6 years ago

I am starting to reevaluate the choice of using the report_crash option. I definitely don't want crashes to be silent by default. However, modifying run() to take a report_crash option seems kind of hacky to me. I might go back to enabling it via debug options instead.

agronholm commented 6 years ago

I'd very much like the option of disabling crash reporting on a case by case basis. My use case is the hyperio library which I am building. I have my own task management there and would like to continue working with curio tasks without interference, or having to change a global setting.

dabeaz commented 6 years ago

Just a note that I recently pushed a change that refines this. Tasks submitted via run() no longer log crashes--since such crashes are reported back as a proper exception anyways. The main purpose of logging is to figure out what's happening with concurrently running tasks where termination can be more complicated.

agronholm commented 6 years ago

Thank you, this is just what I wanted!

dabeaz / curio

Handling exceptions in curio.run() #275