We're having an issue in our CI environment where our container sporadically times out. We think it may be because the Chrome client is dying silently or disconnecting without Apparition noticing.
I'm able to reproduce the hang locally by starting some specs in one terminal window and then, from another terminal window, killing the spawned Chrome process with `killall -m Chrome` (careful if you are actually using Chrome at the time). Killing Chrome causes Apparition to hang permanently - the only way to get the specs to stop is Ctrl+C (twice).
After looking into why that happens, I thought this issue might hold the answer: https://github.com/faye/websocket-driver-ruby/issues/61. Apparently, it's possible for the WebSocket driver's connection to fail without firing the `:close` event. In addition to listening for that event, we should also check the return value of `#read` and ensure it's not `nil` or an empty string.
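As a rough illustration of that return-value check, here is a minimal sketch. `ConnectionLostError` and `read_or_raise` are hypothetical names made up for this example, not part of Apparition or websocket-driver, and a `UNIXSocket` pair stands in for the real Chrome connection.

```ruby
require "socket"

# Hypothetical error class, used only for this illustration.
class ConnectionLostError < StandardError; end

# Hypothetical helper: reads up to max_bytes and treats nil, an empty
# string, or EOFError as a sign that the peer has gone away.
def read_or_raise(io, max_bytes = 4096)
  data = io.readpartial(max_bytes)
  raise ConnectionLostError, "peer closed the connection" if data.nil? || data.empty?
  data
rescue EOFError
  # readpartial signals end-of-file by raising rather than returning nil.
  raise ConnectionLostError, "peer closed the connection"
end

# A socket pair stands in for the browser connection.
reader, writer = UNIXSocket.pair
writer.write("ping")
first = read_or_raise(reader)

# Closing the writer simulates the other side shutting down cleanly.
writer.close
lost = nil
begin
  read_or_raise(reader)
rescue ConnectionLostError => e
  lost = e.message
end
```

Note that this only catches a connection that is closed in an orderly way. It does nothing for a read that blocks forever because no data and no EOF ever arrive.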
Well, I tried just checking for `nil` and `""`, and it didn't seem to work. I was able to verify that the `socket.read` call was where Apparition was hanging, and it seems the `readpartial` call is blocking, so I replaced it with `read_nonblock`. According to the docs, that method behaves the same apart from the non-blocking flag. Once I used the non-blocking form of `read` and raised an error when the timeout occurs, I see the expected error output, but Apparition continues to hang 😞. This happens _even_ if I set the `abort_on_exception` flag to true on the `@listener` thread. 😕 I have no idea why.
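For reference, the non-blocking read with a timeout looks roughly like the sketch below. This is not the code in the PR: `ReadTimeoutError` and `read_with_timeout` are hypothetical names, the timeout values are arbitrary, and a `UNIXSocket` pair again stands in for the Chrome connection.

```ruby
require "socket"

# Hypothetical error class, used only for this illustration.
class ReadTimeoutError < StandardError; end

# Hypothetical helper: waits up to `timeout` seconds for data, using
# read_nonblock plus IO.select instead of a blocking readpartial.
def read_with_timeout(io, timeout, max_bytes = 4096)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = io.read_nonblock(max_bytes, exception: false)
    case result
    when :wait_readable
      # Nothing to read yet: wait for readability or until the deadline.
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      raise ReadTimeoutError, "no data within #{timeout}s" if remaining <= 0
      IO.select([io], nil, nil, remaining)
    when nil
      # With exception: false, nil means end-of-file.
      raise EOFError, "peer closed the connection"
    else
      return result
    end
  end
end

reader, writer = UNIXSocket.pair
writer.write("pong")
received = read_with_timeout(reader, 0.5)

# No further data is written, so the second read should time out.
timed_out = begin
  read_with_timeout(reader, 0.1)
  false
rescue ReadTimeoutError
  true
end
```

In isolation this raises as expected, which matches the error output I see. What it doesn't show is how that error gets from the `@listener` thread to whatever is still waiting on it, which is the part I haven't figured out.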
I'm hoping you can help me build on this PR to properly address this issue.