thefuyang / parallel-ssh

Automatically exported from code.google.com/p/parallel-ssh

Killed by signal 9 #23

Closed. GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. parallel-ssh -A -h ips.txt -l username uptime

What is the expected output? What do you see instead?
It succeeded once, but after that I get "Timed out, Killed by signal 9",
and my CPU usage maxes out while it is trying.

What version of the product are you using? On what operating system?
pssh 2.1

Please provide any additional information below.
Operating system Debian Squeeze

OUTPUT:
parallel-ssh -A  -v -h ips.txt -l username -o /tmp/foo -e /tmp/error ps
Warning: do not enter your password if anyone else has superuser
privileges or access to your account.
Password: 
[1] 09:14:34 [FAILURE] 10.211.10.10 Timed out, Killed by signal 9
[2] 09:14:34 [FAILURE] 10.211.10.72 Timed out, Killed by signal 9

Original issue reported on code.google.com by chutygo...@gmail.com on 15 Sep 2010 at 3:51

GoogleCodeExporter commented 8 years ago
Please can you tell me what's wrong? It worked once, but it no longer does.
Also note that it works fine with an SSH key.

Original comment by chutygo...@gmail.com on 15 Sep 2010 at 8:15

GoogleCodeExporter commented 8 years ago
I've been trying to think what could cause this.  You said it only happens with 
the "-A" option, right?  Are you typing the password right away?  How long does 
it take before you see the "Timed out, Killed by signal 9" message?  Is it 
pretty close to instant, or is there a delay?

Original comment by amcna...@gmail.com on 15 Sep 2010 at 3:56

GoogleCodeExporter commented 8 years ago
By the way, did you kill it, or did that happen all on its own?  Does anything 
relevant show up in the system logs (e.g., /var/log/messages)?

Original comment by amcna...@gmail.com on 15 Sep 2010 at 3:58

GoogleCodeExporter commented 8 years ago
I did not kill it.
Interestingly, I got success only once.
Yes, there is a delay.
It takes a while and then says "Timed out" and "Killed by signal 9".
Also, while it is trying to connect, it uses a whole lot of my CPU.
I am using Debian squeeze. I also tried on Debian lenny; unfortunately, it's the
same thing. You could try it yourself and send me a log of a successful run.
Thanks for the reply. I'm waiting for more comments.

Original comment by chutygo...@gmail.com on 18 Sep 2010 at 5:48

GoogleCodeExporter commented 8 years ago
Hmm.  Which process is using all of the CPU?  Is it pssh or ssh or pssh-askpass 
(running the "top" program can help if you aren't already familiar with this)?  
It sounds like ssh is being killed, not pssh or pssh-askpass, so I'm suspecting 
that the ssh process is the one using a lot of CPU.
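
For anyone wondering how a child process ends up reported as both timed out and killed by signal 9, here is a minimal sketch. It is not pssh's actual code; the function name `run_with_timeout` is hypothetical, and it only assumes a POSIX system where a parent sends SIGKILL to a child that outlives its deadline.

```python
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and return (returncode, timed_out).

    Hypothetical helper for illustration only. If the child outlives the
    timeout, it is killed with SIGKILL (signal 9 on POSIX).
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout)
        return proc.returncode, False
    except subprocess.TimeoutExpired:
        proc.kill()   # sends SIGKILL on POSIX
        proc.wait()   # reap the child so returncode is set
        return proc.returncode, True


if __name__ == "__main__":
    # A child that sleeps far longer than the timeout stands in for a hung ssh.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    code, timed_out = run_with_timeout(sleeper, 0.5)
    if timed_out and code == -signal.SIGKILL:
        print("Timed out, Killed by signal %d" % -code)
```

On POSIX, `subprocess` reports a child killed by a signal as a negative return code, which is why the sketch compares against `-signal.SIGKILL`.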

I haven't been able to reproduce this yet, and I'm still thinking about what 
could be causing it; let me know if there's anything else that you think might 
be relevant.  Thanks.

Original comment by amcna...@gmail.com on 20 Sep 2010 at 4:21

GoogleCodeExporter commented 8 years ago
I'm having a similar issue. Using the -A option, even with only a single entry 
in the host file results in a delay of about 60 seconds and an error: " Timed 
out, Killed by signal 9". top shows little CPU usage but pssh-askpass does seem 
to be consuming the most, albeit only ~1.3%. I'm running pssh version 2.1.1 on 
Ubuntu 10.04

The command is:  pssh -l root -A -i -h test.txt uptime

test.txt contains only 1 line/entry: 127.0.0.1

The error and result is no different if I include a username in the hosts file 
or not.

TIA,
Terry

Original comment by rouss...@gmail.com on 30 Nov 2010 at 3:50

GoogleCodeExporter commented 8 years ago
Unfortunately, I have tried a few times to reproduce this on Fedora without any 
luck.  I will try to find an Ubuntu machine in a few minutes and see if I have 
any more success there.  Do you have any other suggestions that might help me 
reproduce it?  Thanks.

Original comment by amcna...@gmail.com on 2 Dec 2010 at 7:00

GoogleCodeExporter commented 8 years ago
Okay, I found an Ubuntu 10.04 machine, added a hosts file, and ran:

parallel-ssh -l amcnabb -A -i -h test.txt whoami

but this did not time out or hang.  I'm trying to think of anything that might 
be different in the environment.

Do you have the same problem if you ssh into your machine (without X 
forwarding) and run pssh there?  Is there anything of interest in your 
.ssh/config or in your system's ssh config?

Original comment by amcna...@gmail.com on 2 Dec 2010 at 7:19

GoogleCodeExporter commented 8 years ago
I have the same issue here.  pssh 2.1.1 on Debian squeeze.  (I'm using public 
key authentication for SSH, so it's not having to wait for a password.)  Slow 
hosts consistently cause a timeout, Killed by signal 9.  

Original comment by j.paul.l...@gmail.com on 18 Dec 2010 at 2:27

GoogleCodeExporter commented 8 years ago
For those who are having this problem, if you try to ssh to one of the hosts 
manually, do you get a "The authenticity of host 'xyz' can't be established" 
message?  I'm still trying to find information about how to reproduce the 
problem.  Thanks.

Original comment by amcna...@gmail.com on 8 Jan 2011 at 9:38

GoogleCodeExporter commented 8 years ago
Also, if you specify "-x '-v'" along with "-i" do you see the following error 
message over and over?

debug1: read_passphrase: can't open /dev/tty: No such device or address

If this is it, then I think I may have just reproduced it.
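
If the verbose output is long, counting the repeats may be easier than eyeballing them. The following is a small sketch, assuming you have captured the ssh debug output into a string; the function name `count_tty_errors` and the sample text are made up for illustration.

```python
# Substring of the debug line quoted above.
TTY_ERROR = "read_passphrase: can't open /dev/tty"


def count_tty_errors(ssh_debug_output):
    """Count how many lines of captured ssh -v output contain the /dev/tty error."""
    return sum(1 for line in ssh_debug_output.splitlines() if TTY_ERROR in line)


if __name__ == "__main__":
    # Invented sample output, not a real log.
    sample = "\n".join([
        "debug1: Next authentication method: password",
        "debug1: read_passphrase: can't open /dev/tty: No such device or address",
        "debug1: read_passphrase: can't open /dev/tty: No such device or address",
    ])
    print(count_tty_errors(sample))
```

A count well above one for a single host would suggest the repeated-prompt behavior described here.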

Original comment by amcna...@gmail.com on 8 Jan 2011 at 9:50

GoogleCodeExporter commented 8 years ago
By the way, if it's the /dev/tty thing that I described in comment #11, then I 
think that this can only happen if pssh is being run as root.  Can anyone 
confirm this as well?  Thanks.

Original comment by amcna...@gmail.com on 8 Jan 2011 at 10:21

GoogleCodeExporter commented 8 years ago
Okay, I think I have a solution to this in commit 8b9fb2c.  Give it a try if 
you have a chance; if it works I would like to cut a new release just for this 
problem.

For those who are interested in details, it looks like the problem is that pssh 
didn't know what to do when the -A option was set and ssh asked the question:

"""
The authenticity of host 'xyz (192.168.1.1)' can't be established.
RSA key fingerprint is 33:4d:a6:09:ea:45:09:39:86:a9:cb:33:93:91:93:fe.
Are you sure you want to continue connecting (yes/no)?
"""
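
To illustrate the general idea (this is a rough sketch, not the code in the commit): an askpass helper receives the prompt text from ssh as its first command-line argument, so it can check what is being asked before answering. The function names and the matching rules below are assumptions made for the example.

```python
def classify_prompt(prompt):
    """Classify an askpass prompt as 'password', 'hostkey', or 'unknown'.

    The matching rules are illustrative assumptions, not an exhaustive list
    of the prompts ssh can produce.
    """
    text = prompt.strip().lower()
    if text.endswith("password:") or "enter passphrase for key" in text:
        return "password"
    if "are you sure you want to continue connecting" in text:
        return "hostkey"
    return "unknown"


def answer_prompt(prompt, password):
    """Return (exit_code, output) that a helper could use for this prompt.

    Only password prompts get the password. Anything else, including the
    yes/no host key question, is refused with a nonzero exit code so the
    password is never sent as the answer to a different question.
    """
    if classify_prompt(prompt) == "password":
        return 0, password
    return 1, ""


if __name__ == "__main__":
    hostkey_prompt = (
        "The authenticity of host 'xyz (192.168.1.1)' can't be established.\n"
        "Are you sure you want to continue connecting (yes/no)? "
    )
    print(answer_prompt("root@127.0.0.1's password: ", "secret"))
    print(answer_prompt(hostkey_prompt, "secret"))
```

Refusing unknown prompts means the connection fails quickly with a visible reason instead of looping until the timeout.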

Thanks, everyone, for helping me track this down, and please let me know 
whether or not commit 8b9fb2c solves the problem for you.

Original comment by amcna...@gmail.com on 9 Jan 2011 at 5:52

GoogleCodeExporter commented 8 years ago
Sorry, commit 566a881 should actually work.

Original comment by amcna...@gmail.com on 9 Jan 2011 at 5:59

GoogleCodeExporter commented 8 years ago
After some further testing, I'm going to mark this as fixed.  However, if you 
run into any further problems despite the fix, please reopen the issue.  Thanks 
for your patience with this problem.

Original comment by amcna...@gmail.com on 10 Jan 2011 at 2:38