python-trio / trio-asyncio

a re-implementation of the asyncio mainloop on top of Trio

Intermittent deadlock in test_misc.test_cancel_sleep (macOS only?) #103

Closed shamrin closed 3 years ago

shamrin commented 3 years ago

Observation

I managed to replicate the deadlock on my Mac by running the following in 6 console windows simultaneously. It takes a few minutes to reach a deadlock:

while true; do pytest tests/test_misc.py; done

(By the way, Ctrl-C does nothing.)

We have seen the deadlock in GitHub Actions as well:

...
2021-01-07T13:36:00.2183290Z ../tests/test_misc.py::TestMisc::test_err2 PASSED                        [  2%]
2021-01-07T13:36:00.2328220Z ../tests/test_misc.py::TestMisc::test_run3 PASSED                        [  2%]
2021-01-07T13:45:34.4670010Z ##[error]The operation was canceled.
...

https://github.com/python-trio/trio-asyncio/runs/1662700976

Analysis

It deadlocks because of a combination of two things:

  1. in test_cancel_sleep the do_no_run callback sometimes runs before h.cancel() is called (timer inaccuracy?)
  2. raise Exception("should not run") triggers the wait_task_rescheduled abort_cb callback in run_aio_future, and the task never gets scheduled again.

https://github.com/python-trio/trio-asyncio/blob/b93c32037804298b43cc8c089313a2ef82ca0c22/tests/test_misc.py#L256-L271
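
The first point of the analysis (a timer callback firing before the cancel that was meant to stop it) can be illustrated with a minimal sketch. It uses plain asyncio rather than trio-asyncio, so it shows only the race and not the deadlock itself. The names cancel_sleep and should_not_run, the delays, and the time.sleep stall that stands in for a busy machine are all assumptions made for illustration; this is not the code of the actual test.

```python
import asyncio
import time


async def cancel_sleep(stall: float) -> bool:
    """Return True if the timer callback ran despite the later h.cancel()."""
    loop = asyncio.get_running_loop()
    fired = False

    def should_not_run() -> None:
        # Hypothetical callback: it only records that it ran.
        nonlocal fired
        fired = True

    # Schedule the callback, intending to cancel it well before it is due.
    h = loop.call_later(0.2, should_not_run)
    if stall:
        # Assumed stand-in for a loaded machine: block the loop thread
        # for longer than the callback's delay.
        loop.call_soon(time.sleep, stall)
    await asyncio.sleep(0.01)
    # If the loop stalled past the deadline, the callback has already run
    # by the time this task is resumed, and the cancel comes too late.
    h.cancel()
    await asyncio.sleep(0.05)
    return fired


if __name__ == "__main__":
    print("callback ran without stall:", asyncio.run(cancel_sleep(0)))
    print("callback ran with 0.3 s stall:", asyncio.run(cancel_sleep(0.3)))
```

With the stall in place, both timers are already due when the loop next checks them, so the callback runs in the same loop iteration that wakes the sleeping task, before the task reaches h.cancel(). That would be consistent with the "timer inaccuracy" guess above when several test runs compete for the CPU.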