I'm investigating an issue where a thread gets permanently stuck waiting for a message on a socket established over a VPN, after the VPN connection goes down.
As far as I can tell from my debugging, IOLoop isn't detecting that the socket is broken: PollPoller carries on being used without raising an error.
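To illustrate the symptom, here is a minimal sketch, not taken from the library. It assumes a platform where select.poll is available, uses a local socketpair as a stand-in for the VPN socket, and wait_for_message is a hypothetical helper name. When the peer goes silent, poll reports no events at all rather than an error, so a silent peer and a dropped link look the same to the poller unless a timeout or heartbeat is layered on top.

```python
import select
import socket


def wait_for_message(sock, timeout_s):
    """Return True if sock reported any event within timeout_s seconds."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN | select.POLLERR | select.POLLHUP)
    # poll() takes its timeout in milliseconds.
    events = poller.poll(int(timeout_s * 1000))
    return bool(events)


# A connected pair stands in for the VPN socket; the peer stays silent at first.
local, peer = socket.socketpair()

# No data and no error: poll simply times out with an empty event list.
silent = wait_for_message(local, 0.1)

# Once the peer actually sends something, the socket becomes readable.
peer.sendall(b"ping")
ready = wait_for_message(local, 0.1)

local.close()
peer.close()
```

Without a timeout argument, the first call would block indefinitely, which matches the stuck thread described above.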
While investigating this, I've noticed that it is difficult to get IOLoop to stop cleanly. If I don't get it to stop, it prevents my application from terminating cleanly and being automatically restarted.
I can call IO.stop, but then PollPoller.update_poll complains that the file descriptor is -1. So I thought the better way forward would be to get IOLoop to terminate cleanly, and it has a stop method which should do that. However, I can't call it from my code, because it interacts with _data, which is a threadlocal. That means only the IO thread can clean up the loop via that method, except I can't see any invocation of that method in the codebase.
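The threadlocal problem can be reproduced in isolation. The sketch below is not the library's code: LoopSketch, running_as_seen_by_caller, and request_stop are hypothetical names, and the Event-based stop request is only an assumption about one way a cross-thread stop could be signalled. It shows that state stored in a threading.local by the IO thread is invisible to any other thread that calls a method on the same object.

```python
import threading


class LoopSketch:
    """Hypothetical stand-in for a loop that keeps its state in a thread-local."""

    def __init__(self):
        self._data = threading.local()
        self._started = threading.Event()
        self._stop_requested = threading.Event()

    def run(self):
        # This attribute exists only for the thread that executes run().
        self._data.running = True
        self._started.set()
        # Block until some thread asks the loop to stop.
        self._stop_requested.wait()
        self._data.running = False

    def running_as_seen_by_caller(self):
        # Any other thread gets its own empty view of _data.
        return getattr(self._data, "running", None)

    def request_stop(self):
        # An Event is shared between threads, unlike the thread-local.
        self._stop_requested.set()


loop = LoopSketch()
io_thread = threading.Thread(target=loop.run)
io_thread.start()
loop._started.wait()

# The main thread cannot see the IO thread's state at all.
seen_from_main = loop.running_as_seen_by_caller()

# A thread-safe signal does reach the IO thread and lets it exit.
loop.request_stop()
io_thread.join(timeout=5)
stopped = not io_thread.is_alive()
```

Here seen_from_main is None even though the IO thread has set running to True, which is why a stop method that relies on the threadlocal can only do its work when it is invoked on the IO thread itself.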
Rather than interacting directly with IOLoop, I'm sure I should just be closing the connection object, and that should trigger a clean (and exception-free!) shutdown. In my situation, though, it doesn't clean things up sufficiently.