Closed fsaintma closed 7 years ago
@fsaintma and dear Maintainer
I tried to reproduce a host that could have this behavior. I think it is trying to connect to a host that is broken and where the TCP connection does not send TCP SYN ACK.
This can be achieved by setting an iptable that drops tcp packets on a port: http://unix.stackexchange.com/a/330537
You can try with our 'grenoble.iot-lab.info' server on port 22222 for example. (Or any port that is not open, == not 22 and not 2222)
time ssh -oConnectTimeout=5 -oProxyCommand='ssh localhost -W %h:%p' grenoble.iot-lab.info -p 22222
Connection timed out during banner exchange
real 0m5.012s
user 0m0.000s
sys 0m0.007s
time ssh -oConnectTimeout=5 grenoble.iot-lab.info -p 22222
ssh: connect to host grenoble.iot-lab.info port 22222: Connection timed out
real 0m5.012s
user 0m0.007s
sys 0m0.000s
I did not run ParallelSSH code, I just tried to reproduce the output with openssh-client to make it easily testable.
Hi, thanks for raising. Yes, looks like timeout parameter is not passed onto the proxy client. Simple fix fortunately.
Will also update the docstring regarding default timeout setting.
For the other comment, .10
to get 10 seconds, I think it is the retry mechanism that is adding the additional 30sec delay (default for num_retries
is 3). If the client is run with retries set to 1 it should give expected timeout for the one connection attempt.
Eg:
from __future__ import print_function
from datetime import datetime
client = ParallelSSHClient(['11.234.34.34'], num_retries=1, timeout=5)
start = datetime.now()
client.run_command('asdf', stop_on_errors=False)
end = datetime.now()
print(end-start)
Output:
0:00:05.004444
There is also a 5 second delay between connection attempts, so for retries=2 with timeout=5 total time before timeout would be 15sec.
This was not made clear in documentation though so will update accordingly.
Hmm, forget what I said, looks like timeout is in fact passed to proxy and works as expected.
With unreachable proxy and timeout set:
from __future__ import print_function
from datetime import datetime
from pssh import ParallelSSHClient
client = ParallelSSHClient(['localhost'], proxy_host='13.123.132.2',
timeout=1, num_retries=1)
now = datetime.now()
print(client.run_command('uname', stop_on_errors=False))
end = datetime.now()
print(end-now)
Output:
<..>
exception=("Error connecting to host '%s:%s' - %s - retry %s/%s",
'localhost', 22, 'timed out', 1, 1)
<..>
0:00:01.002312
Without unreachable proxy the above completes without errors.
So the additional delay you are seeing is coming from the retry attempts, not proxy timeout not being set.
Documentation update for the retry behaviour has been made.
Thanks @cladmi, there is a test for proxy failures already :)
Hi @pkittenis and @cladmi
I'm sorry but if I use your script with my unreachable host node-a8-111.grenoble.iot-lab.info I obtain the same result. Indeed in my case the proxy host is reachable but not the host behind the proxy host (eg. blocked TCP connection)
from __future__ import print_function
from datetime import datetime
from pssh import ParallelSSHClient
client = ParallelSSHClient(['node-a8-111.grenoble.iot-lab.info'],
proxy_host='grenoble.iot-lab.info',
proxy_user='my_login',
timeout=1,
user='root',
num_retries=1)
now = datetime.now()
print(client.run_command('uname', stop_on_errors=False))
end = datetime.now()
print(end-now)
Output
{'node-a8-111.grenoble.iot-lab.info':
host=node-a8-111.grenoble.iot-lab.info
cmd=<Greenlet at 0x7fd0683d39b0>
channel=None
stdout=None
stderr=None
stdin=None
exception=("Error connecting to host '%s:%s' - %s", 'node-a8-111.grenoble.iot-lab.info', 22, 'Connect failed')
}
0:02:08.040653
So it's not a proxy failure but a host failure behind a proxy.
Thanks in advance.
Ah, right so the behaviour is when proxy is reachable but remote host is not.
In that case, the timeout setting for proxy -> host connection is not configurable on our client side as that is a proxy server side connection made by the SSH server on proxy host. Client side retries are not applicable either, proxy host may be doing that also.
ParallelSSH, and any client for that matter, can only affect its own connections - the proxy initiated connection is not one of them.
That timeout configuration can only be changed on the proxy host itself.
Eg:
In [2]: client = ParallelSSHClient(['11.213.132.3'], proxy_host='localhost', timeout=1, num_retries=1)
In [3]: now = datetime.now()
...: print(client.run_command('uname', stop_on_errors=False))
...: end = datetime.now()
...: print(end-now)
...:
0:02:07.415910
Proxy host is localhost
and reachable. Target host does not exist.
Changing localhost SSH settings to lower timeout changes this:
{'11.213.132.3':
host=11.213.132.3
cmd=<Greenlet at 0x7f7423d8e730>
channel=None
stdout=None
stderr=None
stdin=None
exception=("Error connecting to host '%s:%s' - %s", '11.213.132.3', 22, 'Connect failed')
}
0:00:01.211800
In short, will need to set an appropriate ssh config on the proxy server for the desired timeout.
Thank you for your quick answers.
From what I understand, I don't think its a configuration that should be done on the server.
For me what happens in practice is
Tell me if I'm wrong, or if it is different with paramiko
as I know more openssh-client
.
So, I would think that the timeout configuration should be on this intermediate tunnel.
Even before -W
option was added to openssh, ProxyCommand was using netcat HOST PORT
to create the connection.
So maybe when creating the proxy_channel here:
https://github.com/ParallelSSH/parallel-ssh/blob/master/pssh/ssh_client.py#L147
I may be missing what the intermediate connection is really made of in paramiko
, and when it's actually trying to connect to the remote host. Is it a creation or when the proxy connects ? How do you get feedback on this connection ?
I will try tweaking this.
If there is no way to control this using tcp configuration, it would also be possible, but more complex and heavier, to have a local timer/thread/greenlet/process/signal/…, that closes the whole connection if it is not established after timeout
. But I would rather find a solution without this.
Indeed, adding "timeout=self.timeout" to open_channel does raise an exception in SSHClient:
You need a "recent" enough version of paramiko that supports this option. For example 1.15.2 did not and I had to upgrade.
Traceback (most recent call last):
File "./test_client.py", line 20, in <module>
num_retries=1)
File "/home/harter/tmp/test_parallel_ssh/pssh/ssh_client.py", line 127, in __init__
self._connect_tunnel()
File "/home/harter/tmp/test_parallel_ssh/pssh/ssh_client.py", line 149, in _connect_tunnel
timeout=self.timeout)
File "/home/harter/.local/lib/python2.7/site-packages/paramiko/transport.py", line 841, in open_channel
raise SSHException('Timeout openning channel.')
paramiko.ssh_exception.SSHException: Timeout openning channel.
However it is not currently handled in ParallelSSHClient:
./test_client.py
No handlers could be found for logger "pssh"
{}
0:00:01.594286
With this however, the connection time could be 2 x timeout
in the end as there are two connections.
So maybe the second timeout for connecting to the host should be (timeout - elapsed_time)
.
If you want we can do a PR with propositions to fix this. Just tell us if there would be things to respect.
Like if you want more (timeout - elapsed_time), or a new argument like proxy_timeout, or just ignore it as it's already the case when creating the proxy connection. Or constraints on how to present the error.
That's not quite what happens but yes, looks like there is now a timeout setting on the channel which can be set by client. It's a relatively small change, will need a bump in minimum paramiko version though.
What happens in practice on SSH protocol tunnel connections is
This is the 'direct-tcpip tunneling' feature of the SSH(v2) protocol
In the case of using the tunnel as a proxy to another SSH host, the tunnel connection is then used by client to initiate a new SSH channel to the remote host via the intermediate proxy.
ProxyCommand is an openssh client feature to use an external program to proxy through, not related here as ParallelSSHClient uses the SSH protocol's native tunneling feature instead.
The tunnel connection is initiated and managed by the SSH server. See for example connections created by:
client = ParallelSSHClient(['11.213.132.3'], proxy_host='localhost', timeout=1, num_retries=1)
client.run_command('blah')
$ netstat -anp | grep 22
Client -> proxy server
tcp 0 0 127.0.0.1:42522 127.0.0.1:22 ESTABLISHED 11115/python
Proxy server -> remote host
tcp 0 1 10.14.33.xx:55689 11.213.132.3:22 SYN_SENT -
As can be seen above, the remote host connection is coming from the SSH server's IP rather than client. It is timing out waiting for a syn/ack reply from remote host which does not exist, as expected. Changing timeout settings on the SSH server does affect this connection's timeout.
However, it is now possible for timeout to be passed on to server when initiating the tunnel connection which is easier and good to have.
Eg with timeout=self.timeout
on the tunnel connection like you showed:
now = datetime.now()
print(client.run_command('uname', stop_on_errors=False))
end = datetime.now()
print(end-now)
{}
0:00:01.098822
This will need some additional error handling though as previously any errors would happen when initiating the SSH connection via the tunnel, not when creating tunnel itself. That breaks output, note it is empty above.
Thanks for the info @cladmi, was not aware of that new setting :)
If you'd like to make changes needed would be happy to review, otherwise can wait for next release - it's not a big change so no problem either way.
What is needed is I think:
self.timeout
in tunnel in SSHClient like aboveThe first three are in tunnel_timeout branch already, can use that to test if it resolves your issue and/or branch off of.
Existing timeout setting is fine, don't need new one. By handling tunnel timeout on tunnel channel creation the code doesn't try to create a new ssh connection with its own timeout so no 2x timeouts.
@pkittenis and @cladmi
Congratulations !!!!!
I have just tested the tunnel_timeout branch and it's OK for me, it resolves my issue.
Connecting via SSH proxy grenoble.iot-lab.info:22 -> node-a8-111.grenoble.iot-lab.info:22 {'node-a8-111.grenoble.iot-lab.info': host=node-a8-111.grenoble.iot-lab.info exit_code=None cmd=<Greenlet at 0x7f7195602cd0> channel=None stdout=None stderr=None stdin=None exception=("Error connecting to host '%s:%s' - %s", 'node-a8-111.grenoble.iot-lab.info', 22, 'Timeout openning channel.') } 15.2399671078
Thanks very much.
Thank you
Thank you also for the info :)
Thanks very much @pkittenis for your work and reactivity !!! Just another service :) I have just seen that you have published a release candidate version (1.0.1) three days ago with the fix issue but it's not the case on PyPi (eg. 1.0.0). So should you update the release candidate on PyPi ? Thanks in advance.
Hi @pkittenis It's ok for Pypi and long life to parallel-ssh !!!
No worries, thanks for letting me know. The pypi release from travis failed for some reason.
Hi,
I have just tried to test with a Python script ParallelSSHClient and run_command with approximately 50 hosts behind a proxy host.
I've realized that if you had one host in your host list which didn't boot correctly but you still establish a blocking TCP connexion the run_command return the output after bit more two minutes (eg. raise exception Connect Failed). So I decided to pass timeout option to the client with 10 seconds but the behaviour is the same and I must wait two minutes. So I pushed my script on the proxy host and launch ParallelSSHClient without proxy_host option and the output is returned after 10 seconds with an exception as expected (eg. timed out) .
Can you confirm me this behaviour and how I should apply a timeout for my hosts list SSH connection when I specify a proxy_host ? Indeed I don't want to wait two minutes in the case of all hosts boot correctly except one.
You can see below my script and the result of the tests with one host.
Thanks in advance for your help
NOTE1 : If I want a timeout of 10 seconds I must specify .10 (with 10 I must wait 40 seconds). NOTE2 : I notice in your documentation that you have a timeout=120 in ParallelSSHClient method and a typo in docstring with "Defaults to 10".
script
with_proxy : script launch on my computer
without_proxy: script launch on the proxy host