PrefectHQ / prefect-ray

Prefect integrations with Ray
https://prefecthq.github.io/prefect-ray/
Apache License 2.0

Fix `RayTaskRunner` exception handling in Prefect >= 2.6.0 #60

Closed rpeden closed 1 year ago

rpeden commented 1 year ago

In Prefect 2.6.0, `exception_to_crashed_state` changed from a sync function to an async function that must be awaited. `RayTaskRunner` does not currently await this call, so users see `'coroutine' object has no attribute 'type'` when an unhandled exception occurs while a Ray worker node is trying to unpickle and load the task.

Consequently, it is difficult to see and understand what went wrong. Looking at logs from the Ray worker node(s) that tried to execute the failed task(s) will usually show what went wrong, but not all users who can submit work to a Ray cluster will have access to individual worker node logs.
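The failure mode described above can be reproduced without Prefect or Ray. The following is a minimal sketch using only the standard library: `FakeState` and `fake_exception_to_crashed_state` are hypothetical stand-ins, not Prefect's actual implementation, and only illustrate what happens when an async helper is called without `await`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeState:
    """Hypothetical stand-in for a Prefect state object."""

    type: str
    message: str


async def fake_exception_to_crashed_state(exc: BaseException) -> FakeState:
    """Hypothetical async helper, mimicking a sync-to-async signature change."""
    return FakeState(type="CRASHED", message=repr(exc))


async def handle_without_await(exc: BaseException) -> str:
    # Bug: the call is not awaited, so `state` is a coroutine object,
    # not a FakeState.
    state = fake_exception_to_crashed_state(exc)
    try:
        return state.type
    except AttributeError as err:
        # The original exception is hidden behind this unrelated error.
        return str(err)
    finally:
        # Close the coroutine to avoid a "never awaited" RuntimeWarning.
        state.close()


async def handle_with_await(exc: BaseException) -> str:
    # Fix: awaiting the call yields the actual state object.
    state = await fake_exception_to_crashed_state(exc)
    return state.type


if __name__ == "__main__":
    exc = RuntimeError("boom")
    print(asyncio.run(handle_without_await(exc)))
    print(asyncio.run(handle_with_await(exc)))
```

In the un-awaited version, the `AttributeError` replaces whatever exception was originally being handled, which is why the real cause becomes hard to find from the flow's output alone.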

Several users have encountered this issue, as noted in this Discourse post and issue #58.

Changes in this PR:

Closes #58

### Example

Start by reproducing the error:

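```python
from prefect import flow, task
from prefect_ray.task_runners import RayTaskRunner

# Connect to an existing Ray cluster through the Ray client endpoint.
task_runner = RayTaskRunner(address="ray://localhost:10001")


@task()
def test_task():
    # Raise an unhandled exception inside the task.
    raise KeyboardInterrupt()


@flow(task_runner=task_runner)
def test_flow():
    future = test_task.submit()
    future.result(10)


if __name__ == "__main__":
    test_flow()
```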


* Notice the `AttributeError: 'coroutine' object has no attribute 'result'` error instead of the error you raised
* Pull in the changes from this PR
* Notice that you now see the actual error caused by the exception: `ray.exceptions.TaskCancelledError: Task: TaskID(...) was cancelled`

### Checklist
<!-- These boxes may be checked after opening the pull request. -->

- [x] This pull request references any related issue by including "Closes #<ISSUE_NUMBER>"
    - If no issue exists and your change is not a small fix, please [create an issue](https://github.com/PrefectHQ/prefect-ray/issues/new/choose) first.
- [x] This pull request includes tests or only affects documentation.
- [x] Summarized PR's changes in [CHANGELOG.md](https://github.com/PrefectHQ/prefect-ray/blob/main/CHANGELOG.md)