dabeaz / curio


How do I access crashed task call stacks from TaskGroupError? #227

Closed goldcode closed 4 years ago

goldcode commented 6 years ago

Referring to the code from http://curio.readthedocs.io/en/latest/reference.html

async def bad1():
    raise ValueError('bad value')

async def bad2():
    raise RuntimeError('bad run')

try:
    async with TaskGroup() as g:
        await g.spawn(bad1)
        await g.spawn(bad2)
        await sleep(1)
except TaskGroupError as e:
    print('Failed:', e.errors)   # Print set of exception types
    for task in e:
        print('Task', task, 'failed because of:', task.exception)

Ideally I would want access to the following (which curio prints, as shown below) in the TaskGroupError object.

Task 4 crashed
Traceback (most recent call last):
  File "C:\Users\DSARN\AppData\Local\Programs\Python\Python36\lib\site-packages\curio\kernel.py", line 826, in _run_coro
    trap = current._send(current.next_value)
  File "C:\Users\DSARN\AppData\Local\Programs\Python\Python36\lib\site-packages\curio\task.py", line 96, in _task_runner
    return await coro
  File "C:/Evobase2005/Main/EvoPro/dc/tests/sandbox.py", line 12, in bad2
    raise RuntimeError('bad run')

dabeaz commented 6 years ago

Each exception has a __traceback__ attribute that holds the traceback. You might have to format it using the traceback module.
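
A minimal sketch of that suggestion, using only the standard library: traceback.format_exception turns a stored exception into the same text the interpreter would print. The helper name format_exc_with_traceback is hypothetical, and the plain function bad2 here only stands in for a crashed task whose exception would be read from task.exception.

```python
import traceback

def format_exc_with_traceback(exc):
    """Return the full traceback text for an already-caught exception."""
    # Positional (type, value, traceback) arguments work across Python 3 versions.
    return ''.join(traceback.format_exception(type(exc), exc, exc.__traceback__))

def bad2():
    raise RuntimeError('bad run')

try:
    bad2()
except RuntimeError as e:
    saved = e  # keep a reference, much as a task keeps its exception

# The traceback travels with the exception object, so it can be
# formatted later, outside the except block.
text = format_exc_with_traceback(saved)
print(text)
```

In the TaskGroupError handler, the same helper could be applied to each task.exception in the loop.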

goldcode commented 6 years ago

Thanks for the tip. I keep forgetting that exceptions can be queried for their tracebacks.

dabeaz commented 6 years ago

I wonder if I should give curio some easier way of producing a traceback for this situation.

goldcode commented 6 years ago

I don't really know what to say here. I jumped into the deep end of the pool by learning curio and Python at the same time. Perhaps extending the example at the except site would help beginners like me. But it feels like Python already provides the programmer with exception semantics. All this per-task stack info seems to be getting more relevant with coroutines, though.

In the code below, I ideally only wanted to use the except Exception handler. But I had to extend my code to collect all crashed call stacks by using an except TaskGroupError handler.

This is a subtle difference for the end user between an exception thrown from a coroutine under the dominion of a TaskGroup and one that is not.

I guess if there is any way that curio could attach all these crashing tracebacks so that they are transparently accessible by logging.exception, that would help. I don't know if it's possible, though.

from curio import *
import traceback, logging

async def foo():
    raise ValueError('test')

async def bad1():
    await foo()

async def bad2():
    raise RuntimeError('bad run')

async def main():
    try:
        # await bad2() not under the dominion of TaskGroup, so traceback context is clear.
        async with TaskGroup() as g:
            await g.spawn(bad1)
            await g.spawn(bad2)

    except TaskGroupError as e:
        crashed_tracebacks = []
        print('Failed:', e.errors)   # Print set of exception types
        for task in e:
            crashed_tracebacks.append(''.join(traceback.format_tb(task.exception.__traceback__)))
        logging.exception(''.join(crashed_tracebacks))
    except Exception as e:
        logging.exception('shit happens')

run(main)
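
One possible alternative to joining the tracebacks by hand, sketched with the standard library only: the logging calls accept an exception instance as exc_info, so each failed exception can be logged with its own traceback. The names log_failures and capture are hypothetical, and the list of exceptions stands in for the task.exception values collected from a TaskGroupError.

```python
import io
import logging

def log_failures(logger, exceptions, message='task crashed'):
    """Log each exception separately, with its own traceback."""
    for exc in exceptions:
        # exc_info accepts an exception instance; logging formats
        # the traceback stored in exc.__traceback__.
        logger.error('%s: %r', message, exc, exc_info=exc)

def capture(func):
    """Run func and return the exception it raised (None if it did not raise)."""
    try:
        func()
    except Exception as e:
        return e

def foo():
    raise ValueError('test')

def bad2():
    raise RuntimeError('bad run')

# Write log records to an in-memory stream so the result can be inspected.
stream = io.StringIO()
demo_logger = logging.getLogger('taskgroup_demo')
demo_logger.addHandler(logging.StreamHandler(stream))
demo_logger.propagate = False

log_failures(demo_logger, [capture(foo), capture(bad2)])
output = stream.getvalue()
print(output)
```

This keeps one log record per crashed task, which may or may not suit a logger that is a wrapper around a non-Python backend.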

goldcode commented 6 years ago

Here is an instance of how it gets a bit more complex in quasi-production code, since I'm usually, but not always, at a task.join. Essentially, all I want to do is record all errors in my logger (which, by the way, is not a Python logger but a wrapper around a C++-based logger, for historical reasons).

try:
    async with measure.ClassicHCS(self._device, experiment_xml, meas_details) as self._measurement:
        measurement_task = await curio.spawn(self._measurement.run)
        measurement_complete = await measurement_task.join()

except curio.TaskCancelled as tc:
    await measurement_task.cancel()
except (curio.TaskError, curio.TaskGroupError, Exception) as e:
    task_group_error = None
    if isinstance(e.__cause__, curio.TaskGroupError):
        task_group_error = e.__cause__
    elif isinstance(e, curio.TaskGroupError):
        task_group_error = e

    if task_group_error:
        crashed_tracebacks = []
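        # Iterate the resolved group error; e.__cause__ may be None when e is the TaskGroupError itself.
        for task in task_group_error:
            exc = task.exception
            # Positional arguments keep this call valid on newer Python versions, where the etype keyword was removed.
            crashed_tracebacks.append(''.join(traceback.format_exception(type(exc), exc, exc.__traceback__)))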
        all_crashed_tasks = ''.join(crashed_tracebacks)
        logger.exception(
            f'measurement guid: {meas_details.meas_guid} was interrupted. measurement recovery may be initiated at next application start... Crashed Task(s): {all_crashed_tasks}')
    else:
        logger.exception(f'measurement guid: {meas_details.meas_guid} was interrupted. measurement recovery may be initiated at next application start...')
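
The "is it the error itself or its __cause__?" check in the handler above could be factored into a small helper that walks the __cause__ chain. This is only a sketch: find_in_chain, GroupFailure, and Wrapper are hypothetical names, with the two classes standing in for curio.TaskGroupError and curio.TaskError so that the example runs without curio installed.

```python
def find_in_chain(exc, exc_type):
    """Return the first exception of exc_type in exc's __cause__ chain, or None."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        if isinstance(exc, exc_type):
            return exc
        seen.add(id(exc))      # guard against a cyclic chain
        exc = exc.__cause__
    return None

class GroupFailure(Exception):
    """Hypothetical stand-in for curio.TaskGroupError."""

class Wrapper(Exception):
    """Hypothetical stand-in for curio.TaskError."""

# Build a wrapped error: Wrapper raised "from" a GroupFailure.
try:
    try:
        raise GroupFailure('two tasks failed')
    except GroupFailure as inner:
        raise Wrapper('task crashed') from inner
except Wrapper as outer:
    wrapped = outer

found = find_in_chain(wrapped, GroupFailure)                    # found via __cause__
direct = find_in_chain(GroupFailure('direct'), GroupFailure)    # the error itself
missing = find_in_chain(ValueError('unrelated'), GroupFailure)  # no match in chain
print(found, direct, missing)
```

With such a helper, the handler would need a single lookup before deciding which of the two log messages to emit.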