liukd / ganymed-ssh-2

Automatically exported from code.google.com/p/ganymed-ssh-2
Other
0 stars 0 forks source link

connect with timeout bug (?) #49

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
SSH Server version: sshd version OpenSSH_6.6.1, OpenSSL 1.0.1i 6 Aug 2014
Ganymed release: ganymed-ssh2-262.jar
Client platform: Slackware 14.1

Hi all,

I'am using the Connection.connect method with timeouts (connectTimeout and
kexTimeout == DEFAULT_CONNECT_TIMEOUT == 10000).

public void myConnect(String dst, int port, String user, String passwd) {
    try {
        this.connection = new Connection(dst, port);
        ConnectionMonitor cmon = new ConnectionMonitor() {
            @Override
            public void connectionLost(Throwable cause) {
                System.out.println("connection lost: " + cause.toString());
            }
        };
        this.connection.addConnectionMonitor(cmon);
        System.out.println("connecting... " + user + "@" + dst + ":" + port);
        connection.connect(null, DEFAULT_CONNECT_TIMEOUT, DEFAULT_CONNECT_TIMEOUT);   // <---
        System.out.println("connected");
        System.out.println("authenticating...");
        boolean isAuthenticated = connection.authenticateWithPassword(user, passwd);
        if (isAuthenticated == false) {
            System.out.println("authentication failed");
        } else {        
            System.out.println("authenticated");
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

after connection establishment, I send a command

public void myExecCommand(String cmd) {
    try {
            if (connection == null) {
                    return;
            }
            Session session = connection.openSession();
            session.execCommand(cmd);
            InputStream stdout = new StreamGobbler(session.getStdout());
            InputStream stderr = new StreamGobbler(session.getStderr());
            BufferedReader stdoutReader = new BufferedReader(new InputStreamReader(stdout));
            BufferedReader stderrReader = new BufferedReader(new InputStreamReader(stderr));
            while (true) {
                    String line = stdoutReader.readLine();
                    if (line == null) {
                            stdoutReader.close();
                            break;
                    }
                    System.out.println(line);
            }
            while (true) {
                    String line = stderrReader.readLine();
                    if (line == null) {
                            stderrReader.close();
                            break;
                    }
                    System.out.println(line);
            }
            System.out.println("exit status: " + session.getExitStatus());
            session.close();
    } catch (IOException e) {
            e.printStackTrace();
    }
}

command get executed successfully, then I wait for 1 minute, connection stays
up for 9.something sec then connectionLost() is called, this is the output:

connection lost: java.net.SocketTimeoutException: Read timed out

SSHD debug log below and tcpdump show that is the client that closes the 
connection.

Looks like the connect timeout fired.  Am I missing something obvious here 
(which is
quite possible :-))? Bug?

thank you
ciao
FF

PS
if I do *not* send any command connection stays up

--- sshd debug output ---

debug1: sshd version OpenSSH_6.6.1, OpenSSL 1.0.1i 6 Aug 2014
debug1: key_parse_private2: missing begin marker
debug1: read PEM private key done: type RSA
debug1: private host key: #0 type 1 RSA
debug1: key_parse_private2: missing begin marker
debug1: read PEM private key done: type DSA
debug1: private host key: #1 type 2 DSA
debug1: key_parse_private2: missing begin marker
debug1: read PEM private key done: type ECDSA
debug1: private host key: #2 type 3 ECDSA
debug1: could not open key file '/etc/ssh/ssh_host_ed25519_key': No such file 
or directory
Could not load host key: /etc/ssh/ssh_host_ed25519_key
debug1: rexec_argv[0]='/usr/sbin/sshd'
debug1: rexec_argv[1]='-d'
debug1: rexec_argv[2]='-p'
debug1: rexec_argv[3]='8888'
Set /proc/self/oom_score_adj from 0 to -1000
debug1: Bind to port 8888 on 0.0.0.0.
Server listening on 0.0.0.0 port 8888.
debug1: Bind to port 8888 on ::.
Server listening on :: port 8888.

debug1: Server will not fork when running in debugging mode.
debug1: rexec start in 5 out 5 newsock 5 pipe -1 sock 8
debug1: inetd sockets after dupping: 3, 3
Connection from 127.0.0.1 port 44864 on 127.0.0.1 port 8888
debug1: Client protocol version 2.0; client software version Ganymed_262
debug1: no match: Ganymed_262
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1
debug1: permanently_set_uid: 33/33 [preauth]
debug1: list_hostkey_types: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256 [preauth]
debug1: SSH2_MSG_KEXINIT sent [preauth]
debug1: SSH2_MSG_KEXINIT received [preauth]
debug1: kex: client->server aes128-ctr hmac-sha1-96 none [preauth]
debug1: kex: server->client aes128-ctr hmac-sha1-96 none [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST received [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_GROUP sent [preauth]
debug1: expecting SSH2_MSG_KEX_DH_GEX_INIT [preauth]
debug1: SSH2_MSG_KEX_DH_GEX_REPLY sent [preauth]
debug1: SSH2_MSG_NEWKEYS sent [preauth]
debug1: expecting SSH2_MSG_NEWKEYS [preauth]
debug1: SSH2_MSG_NEWKEYS received [preauth]
debug1: KEX done [preauth]
debug1: userauth-request for user tnc service ssh-connection method none 
[preauth]
debug1: attempt 0 failures 0 [preauth]
debug1: userauth-request for user tnc service ssh-connection method password 
[preauth]
debug1: attempt 1 failures 0 [preauth]
Accepted password for tnc from 127.0.0.1 port 44864 ssh2
debug1: monitor_child_preauth: tnc has been authenticated by privileged process
debug1: monitor_read_log: child log fd closed
User child is on pid 2645
debug1: permanently_set_uid: 1003/100
debug1: Entering interactive session for SSH2.
debug1: server_init_dispatch_20
debug1: server_input_channel_open: ctype session rchan 100 win 196608 max 32768
debug1: input_session_request
debug1: channel 0: new [server-session]
debug1: session_new: session 0
debug1: session_open: channel 0
debug1: session_open: session 0: link with channel 0
debug1: server_input_channel_open: confirm session
debug1: server_input_channel_req: channel 0 request exec reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exec
Starting session: command for tnc from 127.0.0.1 port 44864
debug1: Received SIGCHLD.
debug1: session_by_pid: pid 2646
debug1: session_exit_message: session 0 channel 0 pid 2646
debug1: session_exit_message: release channel 0
debug1: session_by_channel: session 0 channel 0
debug1: session_close_by_channel: channel 0 child 0
debug1: session_close: session 0 pid 0
debug1: channel 0: free: server-session, nchannels 1

Connection closed by 127.0.0.1                            // <--------
debug1: do_cleanup
Transferred: sent 3144, received 928 bytes
Closing connection to 127.0.0.1 port 44864

Original issue reported on code.google.com by francesc...@gmail.com on 23 Aug 2014 at 5:56

GoogleCodeExporter commented 9 years ago
Is this likely to be fixed by the r=157 change which removes the idle flag from 
TransportManager? [Before this revision, this flag was inspected following a 
socket read timeout and the connection closed if not idle (e.g. a command has 
been executed in the mean time).]

If so when is the next release build expected?

Original comment by jeremy.s...@cyclone-technology.com on 2 Sep 2014 at 2:12

GoogleCodeExporter commented 9 years ago
Just grabbed the latest from trunk to give it a go. It looks like it now closes 
the connection for every read timeout. It was worth a try.

Original comment by jeremy.s...@cyclone-technology.com on 2 Sep 2014 at 2:29

GoogleCodeExporter commented 9 years ago

To cut a long story short I'm experiencing the same thing too in svn rev 161.

I've got an older jar called "ganymed-ssh2-build251beta1.jar" which does not 
exhibit this problem. (I don't have the source for this)

Quite a lot of refactoring has gone on, but I think snv rev 10 
"Patchset from Cyberduck.  * Option in constructor to pass custom 
identification  * No infite timeout for socket connect  * Idle detection for 
read timeouts" provides a clue.

This change added setting the timeout on the underlying socket during the 
connect -  tm.setSoTimeout(connectTimeout);
but does not reset it back to 0 (infinite timeout) so this connection timeout 
will be used on all subseqent reads.

I can't see any public access to the 'tm' member, so there doesn't appear to be 
a way to access the Transport manager directly or the socket to set the options 
back in application code - so if this is a problem for anyone then they'll 
probably have to build the source themselves.

Original comment by goo...@oweno.info on 19 Nov 2014 at 1:23

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
The attached patch fixed my issue.

Since I didn't know how to test the original problem I left the setting of the 
timeout on the underlying socket during connect() in place and just re-set it 
to zero at the end of connect()

It might be possible to remove both calls to setSoTimeout since all the calls 
to Socket.Connect have a timeout on - I guess it does some reading establishing 
the SSH connection onto of the TCP connection and the timeout on the read might 
be to allow for the rare case where a TCP connect happens but then there isn't 
anything on the other end of the socket to talk to.

Original comment by goo...@oweno.info on 19 Nov 2014 at 3:04

Attachments:

GoogleCodeExporter commented 9 years ago
Socket#setSoTimeout and Socket#connect should have different timeout values

I am having the same problem with rev161:

In ch.ethz.ssh2.Connection the connectTimeout is used for Socket#connect and 
for Socket#setSoTimeout:
    public synchronized ConnectionInfo connect(ServerHostKeyVerifier verifier, int connectTimeout, int kexTimeout)
            throws IOException {
        [...]
        tm.setSoTimeout(connectTimeout);
        [...]
        tm.connect(hostname, port, softwareversion, cryptoWishList, verifier, dhgexpara, connectTimeout,
                    getOrCreateSecureRND());
        [...]
    }

In ch.ethz.ssh2.transport.ClientTransportManager connect is done via:

    protected void connect(String hostname, int port, int connectTimeout)
            throws IOException {
        sock.connect(new InetSocketAddress(hostname, port), connectTimeout);
    }

I have following use case, we are using ganymed to also connect to windows MKS 
SSHD. The MKS SSHD tends to hang in High-Load Scenarios on our ESX-Servers. 
Therefore, the initial connect Socket#connect should timeout after 60 sec (in 
order to retry). However after a connection has been established it is normal 
and no failure that read-calls block for more than 60 sec, since there is 
nothing to read for a longer time. The problem is that due to 
Socket#setSoTimeout gets the very same value as Socket#connect I can't handle 
this situation. A third timeout parameter would solve the problem. Something 
like:

    public synchronized ConnectionInfo connect(ServerHostKeyVerifier verifier, int connectTimeout, int kexTimeout, int soTimeout)
            throws IOException {
        [...]
        tm.setSoTimeout(connectTimeout);
        [...]
        tm.connect(hostname, port, softwareversion, cryptoWishList, verifier, dhgexpara, connectTimeout,
                    getOrCreateSecureRND());
        [...]
        tm.setSoTimeout(soTimeout)
    }

Thanks & Best regards
Stefan

Original comment by stefan...@gmail.com on 20 Jan 2015 at 5:53

GoogleCodeExporter commented 9 years ago
The solution in comment #5 is better than introducing an extra parameter 
soTimeout. Since there is the very useful Session#waitForCondition Method a 
soTimeout would be no addition for functionality but would have the potential 
to confuse things.

Original comment by stefan...@gmail.com on 20 Jan 2015 at 6:25