liarcn / parallel-ssh

Automatically exported from code.google.com/p/parallel-ssh

ControlPersist breaks pssh and pscp #67

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
pssh and pscp work pretty well until I start using ControlPersist in 
~/.ssh/config:

Host *
   ControlMaster auto
   ControlPersist 30
   ControlPath ~/.ssh/master-%r@%h:%p

and then they always take more than 30 seconds to complete.

$ time pssh -H "$hosts" date
[1] 21:33:34 [SUCCESS] s1n1
[2] 21:33:34 [SUCCESS] s1n2
[3] 21:33:34 [SUCCESS] s2n1
[4] 21:33:34 [SUCCESS] s2n2
pssh -H "$hosts" date  0.07s user 0.05s system 0% cpu 30.671 total

$ time (for host in $hosts; do ssh $host 'echo `hostname`: `date`'; done)
s1n1: Fri Mar 2 21:34:10 GMT 2012
s1n2: Fri Mar 2 21:34:13 GMT 2012
s2n1: Fri Mar 2 21:34:11 UTC 2012
s2n2: Fri Mar 2 21:34:11 UTC 2012
( for host in $hosts; do; ssh $host 'echo `hostname`: `date`'; done; )  0.05s user 0.05s system 8% cpu 1.120 total

After a bit of experimentation, it is pretty clear that pssh does not complete 
until the newly created backgrounded ssh connections terminate.  A valid 
workaround is:

  export PSSH_OPTIONS='ControlPersist=no'

but I don't see why it shouldn't be able to work properly with ControlPersist 
enabled.

Original issue reported on code.google.com by adam.spi...@gmail.com on 2 Mar 2012 at 9:45

GoogleCodeExporter commented 9 years ago
Hmm. Without ControlPersist, it definitely makes sense that ControlMaster would 
make pssh hang. However, I agree that pssh should be able to work with 
ControlPersist. Something subtle may be going on in how ssh handles its 
processes that confuses pssh; I'll have to look into it to find out.

Original comment by amcna...@gmail.com on 2 Mar 2012 at 11:13

GoogleCodeExporter commented 9 years ago
Okay, it looks like this might be a bug in ssh. With the ControlPersist option, 
ssh forks into two processes. Standard output closes when the main process 
terminates, but the background process hangs on to standard error until it 
terminates. Unfortunately, pssh has no way of knowing that there will not be 
any further data on standard error, so it has to keep reading until the pipe is 
closed.

If the background ssh process were to close standard error, then pssh would be 
able to quit right when the main process completes. Other than fixing the 
behavior of ssh, I'm not sure what can be done about this. I can think of a 
dirty hack, but I'm not really comfortable with it.
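
The pipe behavior described above can be reproduced without ssh. The following is a minimal Python sketch, not pssh's actual code: the `CHILD` script and the `seconds_until_eof` helper are hypothetical names invented for illustration, and the script only imitates ControlPersist by forking a background process that either keeps or closes its inherited standard error. It assumes a POSIX system where `os.fork` is available.

```python
import subprocess
import sys
import time

# Stand-in for "ssh with ControlPersist": the foreground process prints and
# exits at once, while a forked background process lingers for a while.
CHILD = r"""
import os, sys, time
close_stderr = sys.argv[1] == "close"
linger = float(sys.argv[2])
if os.fork() == 0:
    # Background process (plays the role of the persistent ssh master).
    os.close(1)              # standard output is released
    if close_stderr:
        os.close(2)          # the behavior the thread wishes ssh had
    time.sleep(linger)
    os._exit(0)
print("done")                # foreground process finishes immediately
"""


def seconds_until_eof(close_stderr, linger=2.0):
    """Run the stand-in and time how long reading both pipes to EOF takes."""
    mode = "close" if close_stderr else "keep"
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, mode, str(linger)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() reads stdout and stderr until both reach EOF. A pipe only
    # reaches EOF once every process holding its write end has closed it.
    proc.communicate()
    return time.monotonic() - start
```

Calling `seconds_until_eof(False)` should block for roughly the linger time, because the background process still holds the write end of the standard error pipe. Calling `seconds_until_eof(True)` should return almost as soon as the foreground process exits, which is the outcome the reporter expected from pssh.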

Original comment by amcna...@gmail.com on 2 Mar 2012 at 11:39

GoogleCodeExporter commented 9 years ago
This problem in ssh has been noted by several other people:

https://bugzilla.mindrot.org/show_bug.cgi?id=1330#c1
https://lwn.net/Articles/401651/

I have opened an OpenSSH bug about this issue:

https://bugzilla.mindrot.org/show_bug.cgi?id=1988

Original comment by amcna...@gmail.com on 2 Mar 2012 at 11:53

GoogleCodeExporter commented 9 years ago
Hmm, you did what I was about to suggest quicker than I could type the 
suggestion :-)  Thanks!

Original comment by adam.spi...@gmail.com on 2 Mar 2012 at 11:55

GoogleCodeExporter commented 9 years ago
And thank you for letting me know about the problem.

Original comment by amcna...@gmail.com on 3 Mar 2012 at 2:10

GoogleCodeExporter commented 9 years ago
Is this problem fixed in current pssh/ssh versions?

Original comment by mkosm...@gmail.com on 6 May 2014 at 6:44

GoogleCodeExporter commented 9 years ago
I need to make a new release, but I've been a little swamped. I'll try to get 
to it soon. Sorry for the delays.

Original comment by amcna...@gmail.com on 6 May 2014 at 2:45

GoogleCodeExporter commented 9 years ago
What's the status of this?

Original comment by bluew...@server-speed.net on 25 Oct 2014 at 9:43