capistrano / sshkit

A toolkit for deploying code and assets to servers in a repeatable, testable, reliable way.
MIT License
1.13k stars 253 forks source link

Avoid calling closed? outside of synchronize block #534

Closed djmb closed 3 months ago

djmb commented 3 months ago

closed? calls process on the connection which is not safe because we have not synchronised the connection pool. Another thread might concurrently checkout the connection and start sending commands as well.

djmb commented 3 months ago

When using SSHKit with a 100+ hosts, we sometimes get IOErrors like this:

[ERROR (IOError): Exception while executing on host foo: closed stream]

Debugging showed that a packet was being sent to the remote server twice, which was causing it to close the connection. This was caused by two threads using the connection concurrently - the eviction thread and one of the SSHKit parallel runner threads.

The duplicate packets came from here in net-ssh, where both threads call send before either calls output.consume!.

Looking at the eviction code, the call to closed?(first_conn) outside the synchronize block looked suspicious as it calls conn.process(0) which will send a test packet.

The check looks like a performance optimisation, so I think it should be safe to remove. We've not seen the errors again with this fix running.

djmb commented 3 months ago

Thanks for the review @mattbrictson - I've committed your suggestion.