celluloid / celluloid-io

UNMAINTAINED: See celluloid/celluloid#779 - Evented sockets for Celluloid actors
https://celluloid.io
MIT License
879 stars 93 forks source link

Handle MRI's Errno::EALREADY in connect_nonblock. Fixes #138 #142

Closed ioquatix closed 9 years ago

ioquatix commented 9 years ago

In certain high-load situations in RubyDNS (thousands of connections), it has been observed that Socket#connect_nonblock sometimes fails with Errno::EALREADY: a connection is already in progress for the specified socket. In this case I believe it is prudent that we wait for the socket to become writable and then attempt to connect_nonblock again, at which point the connection should either fail or give Errno::EISCONN.

ioquatix commented 9 years ago

If anyone has any more experience with this please feel free to chime in. /cc @tarcieri @digitalextremist @e2

tarcieri commented 9 years ago

cc @headius

This looks ok to me @ioquatix

digitalextremist commented 9 years ago

This looks good to go @ioquatix, if you've been able to simulate thousands of connections and verified? Even if you haven't - the logic is good for what you said. I wouldn't be surprised if Errno::EALREADY will need to be added here soon:

https://github.com/celluloid/reel/blob/0.17.0-dependent/lib/reel/server.rb#L29

digitalextremist commented 9 years ago

One odd thing @ioquatix, I re-ran the ruby-2.0.0 tests on CI and ( having nothing to do with this I'm 99% sure ) something is flaky here. Depending on how fast you need this, I would doubly-pull this to master as you have, and 0.17.0-dependent

ioquatix commented 9 years ago

@digitalextremist This was the work around I put in place

https://github.com/ioquatix/rubydns/blob/5f27bf2e3d444772f80c512219b759891aa61c31/lib/rubydns/resolver.rb#L239-L245

Without rescue Errno::EALREADY it would blow up. In this case I decided simply to drop the connection as it gets retried anyway. Whether you can make a useful spec from this or not is anyones guess, but it might be a good idea to have a failing spec before merging this, otherwise you are just going on my word that it fails and that this is the correct fix..

headius commented 9 years ago

Seems ok to me.

digitalextremist commented 9 years ago

Thanks @headius I merged this @ioquatix ...and brought it into 0.17.0-dependent also, before deleting the branch.

ioquatix commented 9 years ago

Okay thanks everyone for your time eyeballing this code, I look forward to removing the workaround in RubyDNS and testing once 0.17.0rc is ready.