Closed Zwork101 closed 5 years ago
😕 yeah that's not supposed to happen. Let's see if we can track it down.
Fortunately the traceback contains a lot of information. The relevant part is this:
  File "C:\Users\User\PycharmProjects\pycord\venv\lib\site-packages\trio\_core\_run.py", line 1471, in run_impl
    runner.task_exited(task, final_outcome)
  File "C:\Users\User\PycharmProjects\pycord\venv\lib\site-packages\trio\_core\_run.py", line 943, in task_exited
    task._cancel_stack[-1]._remove_task(task)
  File "C:\Users\User\PycharmProjects\pycord\venv\lib\site-packages\trio\_core\_run.py", line 202, in _remove_task
    self._tasks.remove(task)
  File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\contextlib.py", line 119, in __exit__
    next(self.gen)
  File "C:\Users\User\PycharmProjects\pycord\venv\lib\site-packages\trio\_core\_run.py", line 156, in _might_change_effective_deadline
    del runner.deadlines[old, id(self)]
  File "C:\Users\User\PycharmProjects\pycord\venv\lib\site-packages\sortedcontainers\sorteddict.py", line 259, in __delitem__
    self._dict_delitem(key)
KeyError: (-inf, 60341456)
I think what it's saying is:
- A task exited.
- The task was still inside a cancel scope when it exited. That is already weird: normally this only happens if the task inherited the scope from its parent (or if someone manually called __enter__ and never called __exit__).
- _remove_task thought that removing this task might change the scope's effective deadline. This is also weird, because the only way removing a task could change the effective deadline is if this was the last task associated with this cancel scope. But like we said above, the only way we should have been able to get here is if this task inherited the scope from its parent, and the parent should still be running, so this shouldn't be the last task in the scope.
- Deleting the scope's entry from the deadline table raised KeyError. This is also weird, because the invariant is supposed to be that any cancel scope that has at least one task, isn't cancelled, and whose deadline is not +inf, should have an entry in the table.
- The deadline in the missing key is -inf, which is odd and seems like it should be a clue, but I'm not immediately sure what to make of it. The built-in checkpoint function creates cancel scopes with a deadline of -inf, but those should exit immediately; I don't see how one could be attached to an exiting task. The current_effective_deadline function uses -inf to indicate that some enclosing cancel scope has already been cancelled, but that doesn't change the actual deadline of any specific cancel scope, unless you do something like with move_on_at(current_effective_deadline()).

So, the first questions that come to mind are:
- Does anything in your code set a deadline of -inf?
- Are you doing anything unusual with cancel scopes or with blocks? yielding out of a with block is one way to accidentally break the normal with block rules.

I downloaded your code and trio_websocket and looked for red flags. The only suspicious thing I found was that your code appears to create a websocket connection in one call to trio.run() and try to close it in another call:
async def run(self):
    async with trio_websocket.open_websocket_url(self.gateway_url()) as conn:
        self._conn = conn
        ...

async def _start(self):
    async with trio.open_nursery() as nursery:
        nursery.start_soon(self.run)
        nursery.start_soon(self.heartbeat)
        print("Started Nursery")

def start(self):
    trio.run(self._start)

async def _close(self):
    if self._conn is None:
        raise GatewayError("You tried to close the gateway connection before it was established.")
    await self._conn.aclose(1001)
    self._closed = True

def close(self):
    trio.run(self._close)
In general, different calls to trio.run() are in different universes and it won't work to share state between them. I wasn't able to reconstruct a story for how you could get this specific error by doing that, but it's a thing to look out for.
Also from your code it looked like maybe it was possible to call close() from inside the start() loop? i.e., a run() inside a run(). That won't work either, but trio.run() is supposed to have a check for that which provides a useful error. Unless there are threads (real OS threads, not trio tasks) somewhere that I'm missing -- each thread can have its own run(), and trying to use the same websocket connection from multiple threads simultaneously could explain the behavior you're seeing.
Thanks for the bug report, and sorry not to be able to provide a more definitive answer!
Thanks! I might be doing a lot of things wrong with trio I don't know about. To answer some questions:
I think I ran it using PyCharm's debugger. Could that have caused issues?
You've pinned a very old version of trio_websocket in your requirements. Update it and see if the issue stays?
Will do, didn't realize it was old.
Oh, nice catch. But I don't see any mention of inf or yield or current_effective_deadline in the old trio-websocket 0.2.0 either, so still no smoking gun...
Closing since I don't see how we could realistically make progress on this with the information we have, and the multiple calls to trio.run() create lots of opportunity for some unexpected and unsupported interaction. Feel free to reopen if it reoccurs.
(To be clear in case anyone stumbles across this from a search engine years from now: Doing multiple calls to trio.run is fine and totally supported. What's not supported is creating Trio objects like network connections, events, locks, etc. inside one call to trio.run, and then re-using the same objects in a different call to trio.run.)
What's the issue
Trio, after a long period of time, will stop working and raise an internal error.
Steps to reproduce
Traceback (It's a long one)
Expected Result
The program should continue running, displaying output such as the first 2 lines provided in the traceback.