Closed SomberNight closed 6 years ago
I've pinpointed the cause.
The CancelledError is raised here:
https://github.com/kyuupichan/aiorpcX/blob/d4c4cc39876b755c6ff39638604380bcb5b4a213/aiorpcx/socks.py#L301
which will result in the explicit close of the socket at
https://github.com/kyuupichan/aiorpcX/blob/d4c4cc39876b755c6ff39638604380bcb5b4a213/aiorpcx/socks.py#L322-L323
^ If I comment out sock.close(), the event loop won't die, and it seems to "work".
Thanks. That's obviously not the answer though....
I'm inclined to see this as a bug in asyncio, no?
My initial impression is that this is another problem caused by asyncio's internals using futures and callbacks (which run delayed) rather than pure awaiting. In particular, between the cancellation and asyncio's handling of its callback effects, aiorpcX gets control and closes the socket, which is entirely reasonable. asyncio then continues to handle the cancellation in a callback, but assumes the socket is still open.
I'll try and create a simple case and, if still a problem, submit a bug report in Python.
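For illustration, here is a minimal sketch of the delay described above. It only shows that a future's done callbacks are scheduled for a later loop iteration rather than run inside cancel(); it does not reproduce the socket problem itself. The function name show_callback_delay is hypothetical, and asyncio.run / get_running_loop assume Python 3.7+.

```python
import asyncio


async def show_callback_delay():
    """Record the order in which things happen around Future.cancel()."""
    loop = asyncio.get_running_loop()
    events = []

    fut = loop.create_future()
    fut.add_done_callback(lambda f: events.append("done callback ran"))

    # cancel() marks the future as cancelled, but its done callbacks are
    # only scheduled on the loop; they do not run synchronously here.
    fut.cancel()
    events.append("code after cancel() ran")

    # Yield to the event loop once so the scheduled callback can run.
    await asyncio.sleep(0)
    return events


if __name__ == "__main__":
    print(asyncio.run(show_callback_delay()))
```

Any code that runs between cancel() and the callback (such as an except block that closes a socket) executes in that window.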
I'm struggling to reproduce more simply. Why the thread in your code? Does it happen without it?
I tried this but it's fine:
import asyncio
import socket

class MyProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        transport.write(b'123')  # just in case a write is needed

port = 6666

async def connect_and_recv(loop, sock):
    try:
        await loop.sock_connect(sock, ('127.0.0.1', port))
        while True:
            await loop.sock_recv(sock, 20)
    except asyncio.CancelledError:
        print("Cancelled")
        sock.close()

async def main(loop):
    server = await loop.create_server(MyProtocol, '127.0.0.1', port)
    sock = socket.socket()
    sock.setblocking(False)
    task = loop.create_task(connect_and_recv(loop, sock))
    await asyncio.sleep(0.1)
    task.cancel()
    await asyncio.sleep(0.1)

loop = asyncio.get_event_loop()
loop.run_until_complete(main(loop))
https://github.com/python/cpython/blob/master/Lib/asyncio/selector_events.py#L378-L380 needs to happen to trigger it. I'm not sure what might cause that or how to cause it in the testcase. I suspect that if the testcase can trigger those lines, it will reproduce the problem.
Why the thread in your code? Does it happen without it?
Yes, actually it's redundant.
Your example reproduces it for me on Windows (10.0.16299.665). It enters the block you suspected.
I've added a traceback print there. Trace on py3.6.6 (same block is entered on 3.7.0):
>>> loop.run_until_complete(main(loop))
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\selector_events.py", line 381, in _sock_recv
    data = sock.recv(n)
BlockingIOError: [WinError 10035] A non-blocking socket operation could not be completed immediately
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\selector_events.py", line 381, in _sock_recv
    data = sock.recv(n)
BlockingIOError: [WinError 10035] A non-blocking socket operation could not be completed immediately
Cancelled
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 455, in run_until_complete
    self.run_forever()
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 422, in run_forever
    self._run_once()
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\asyncio\base_events.py", line 1398, in _run_once
    event_list = self._selector.select(timeout)
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\selectors.py", line 323, in select
    r, w, _ = self._select(self._readers, self._writers, [], timeout)
  File "C:\Users\User\AppData\Local\Programs\Python\Python36-32\lib\selectors.py", line 314, in _select
    r, w, x = select.select(r, w, w, timeout)
OSError: [WinError 10038] An operation was attempted on something that is not a socket
However, this example does not trigger it on Linux (just as the original one did not). I can try to come up with an example for Linux but, as I said, something different happens there (only the trace is printed; the loop does not die).
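As a side note on the BlockingIOError in the trace: that is what a non-blocking recv raises when no data is ready yet. A minimal sketch, using socket.socketpair() purely as an assumption to keep it self-contained (the thread uses a TCP connection to a local server), with the hypothetical helper name recv_without_data:

```python
import socket


def recv_without_data():
    """Call recv() on a non-blocking socket that has nothing to read."""
    a, b = socket.socketpair()
    a.setblocking(False)
    try:
        a.recv(20)  # the peer has sent nothing, so this cannot complete
    except BlockingIOError:
        # Same exception type as in the trace above; the message text
        # differs between platforms.
        return "would block"
    else:
        return "got data"
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(recv_without_data())
```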
Windows and Linux have different event loops because the underlying OS handles "select"ing differently. Anyway, it's good to know my code does it on Windows; that's enough to file a bug report. Looking at the asyncio code, it's clear what the issue is, just not how to trigger it. The remove_reader is delayed, and in the meantime the socket can be closed (in which case the original fd is invalid, and socket._fileno is set to -1).
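The fd invalidation can be seen in isolation with the standard selectors module. This is only a sketch of the stale-registration state, not of asyncio's internals; the helper name stale_registration_demo is hypothetical.

```python
import selectors
import socket


def stale_registration_demo():
    """Close a socket while a selector still has it registered."""
    sel = selectors.DefaultSelector()
    sock = socket.socket()
    sock.setblocking(False)

    fd_before = sock.fileno()
    sel.register(sock, selectors.EVENT_READ)

    # Closing the socket does not tell the selector anything.
    sock.close()
    fd_after = sock.fileno()  # -1 once the socket is closed

    # The selector's bookkeeping still lists the old descriptor.
    still_registered = fd_before in sel.get_map()

    sel.close()
    return fd_before, fd_after, still_registered


if __name__ == "__main__":
    print(stale_registration_demo())
```

What happens when the selector is next polled with such a stale descriptor depends on the platform's selector implementation, which would fit the different behaviour reported on Windows and Linux.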
Meanwhile I think the best solution is your solution - to not close the socket, and let the garbage collector do it, lame though that feels.
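A rough sketch of that workaround, for illustration only: the CancelledError handler does not touch the socket. The names recv_forever and demo are hypothetical, socket.socketpair() stands in for the real connection, and asyncio.run assumes Python 3.7+. The sockets are closed at the end of demo() only to keep the example tidy.

```python
import asyncio
import socket


async def recv_forever(loop, sock):
    try:
        while True:
            await loop.sock_recv(sock, 20)
    except asyncio.CancelledError:
        # Deliberately no sock.close() here: the socket stays open while
        # the event loop finishes processing the cancellation.
        print("Cancelled")
        raise


async def demo():
    loop = asyncio.get_running_loop()
    a, b = socket.socketpair()
    a.setblocking(False)

    task = loop.create_task(recv_forever(loop, a))
    await asyncio.sleep(0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

    still_open = a.fileno() != -1
    # Demo cleanup only, done after the cancelled task has finished.
    a.close()
    b.close()
    return still_open


if __name__ == "__main__":
    print(asyncio.run(demo()))
```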
This seems related: https://bugs.python.org/issue30064 which actually reproduces the OSError on Windows, and the ValueError on Linux.
@SomberNight Can you try with uvloop and setting EVENT_LOOP_POLICY to uvloop? In the issue I filed above they're asking; unfortunately I'm not in a state to reproduce your issue at all at present.
I can't reproduce the issue with uvloop on Linux. (It looks like uvloop does not work on Windows at the moment.)
awesome, thanks!
So my example is on Windows, as I am having a hard time making a minimal example on Linux... but there is also a problem on Linux, even if not exactly the same.
code:
result:
Now, this example is not enough to reproduce it on Linux; there the loop does NOT die, but the following trace appears:
Tested on both python 3.6.6 and 3.7.0