djmb closed 3 months ago
When using SSHKit with 100+ hosts, we sometimes get IOErrors like this:
[ERROR (IOError): Exception while executing on host foo: closed stream]
Debugging showed that a packet was being sent to the remote server twice, which was causing it to close the connection. This was caused by two threads using the connection concurrently - the eviction thread and one of the SSHKit parallel runner threads.
The duplicate packets came from here in net-ssh, where both threads call `send` before either calls `output.consume!`.
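The interleaving described above can be replayed deterministically. This is only a minimal sketch: `ToyBuffer` and `simulate_double_send` are hypothetical stand-ins, not net-ssh's real classes, and the two "threads" are run by hand in the problematic order.

```ruby
# Hypothetical stand-in for a pending-output buffer; not net-ssh's class.
class ToyBuffer
  def initialize(data)
    @data = data.dup
  end

  def to_s
    @data.dup
  end

  def empty?
    @data.empty?
  end

  # Drop the first n bytes, i.e. the bytes that have been written out.
  def consume!(n)
    @data = @data[n..-1] || ""
  end
end

# Replays the bad ordering: both callers send before either consumes.
def simulate_double_send
  wire = []                            # what the remote end would receive
  output = ToyBuffer.new("PACKET")

  a_payload = output.to_s              # thread A reads the pending output
  b_payload = output.to_s              # thread B reads the same pending output
  wire << a_payload                    # thread A sends
  wire << b_payload                    # thread B sends the same bytes again
  output.consume!(a_payload.bytesize)  # only now does A consume
  output.consume!(b_payload.bytesize)  # B consumes from an already empty buffer

  [wire, output]
end
```

In this sketch the remote end receives the same payload twice, which matches the duplicate packet seen while debugging.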
Looking at the eviction code, the call to `closed?(first_conn)` outside the synchronize block looked suspicious, as it calls `conn.process(0)`, which will send a test packet.
The check looks like a performance optimisation, so I think it should be safe to remove. We've not seen the errors again with this fix running.
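The shape of the fix can be sketched with a toy pool. This is not SSHKit's actual code: `ToyPool`, `ToyConnection`, `probe_closed?` and `evict_closed` are hypothetical names, and the liveness probe is assumed to write to the connection, as `process(0)` does. The point is only that the probe runs while the pool lock is held, with no unsynchronised fast path.

```ruby
# Hypothetical stand-ins for illustration; not SSHKit's or net-ssh's classes.
class ToyConnection
  def initialize(closed)
    @closed = closed
  end

  # Stands in for a liveness probe that writes to the socket.
  def probe_closed?
    @closed
  end
end

class ToyPool
  def initialize
    @lock = Mutex.new
    @idle = []
  end

  def checkin(conn)
    @lock.synchronize { @idle.push(conn) }
  end

  def checkout
    @lock.synchronize { @idle.shift }
  end

  def size
    @lock.synchronize { @idle.size }
  end

  # Probe connections only while holding the lock, so an idle connection
  # cannot be checked out by another thread while it is being probed.
  def evict_closed
    @lock.synchronize do
      @idle.reject!(&:probe_closed?)
    end
  end
end
```

The trade-off is that every eviction pass takes the lock, which is the cost the removed check appears to have been avoiding.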
Thanks for the review @mattbrictson - I've committed your suggestion.
`closed?` calls `process` on the connection, which is not safe because we have not synchronised the connection pool. Another thread might concurrently check out the connection and start sending commands as well.