mscdex / ssh2

SSH2 client and server modules written in pure JavaScript for node.js
MIT License

Timed out while waiting for handshake stuck on Outgoing: Writing KEXINIT #927

Open RunsetTech opened 3 years ago

RunsetTech commented 3 years ago

Hi, I've written a Node.js script that uses the AWS CLI to create instances, waits for a timeout (120 seconds), attaches static IPs, connects to the instances, and executes some commands on the ready event. This process usually goes well, but sometimes it gets stuck on one instance:

Error: Timed out while waiting for handshake
    at Timeout._onTimeout (C:\nodejsprojs\ssh-test2\node_modules\ssh2\lib\client.js:695:19)
    at listOnTimeout (internal/timers.js:549:17)
    at processTimers (internal/timers.js:492:7) {
  level: 'client-timeout'
}

After that, the script detaches the IP, attaches another one, and after some timeout tries to connect again, but the same error keeps happening indefinitely. I can connect to the same server with a normal SSH client, but not with ssh2. Here are the debug logs:

C:\nodejsprojs\ssh-test2>node test.js
DEBUG: Local ident: 'SSH-2.0-ssh2js0.4.10'
DEBUG: Client: Trying 15.188.191.179 on port 22 ...
DEBUG: Client: Connected
DEBUG: Parser: IN_INIT
DEBUG: Parser: IN_GREETING
DEBUG: Parser: IN_HEADER
DEBUG: Remote ident: 'SSH-2.0-OpenSSH_7.6p1 Ubuntu-4'
DEBUG: Outgoing: Writing KEXINIT
DEBUG: Parser: IN_PACKETBEFORE (expecting 8)
DEBUG: Parser: IN_PACKET
DEBUG: Parser: pktLen:1076,padLen:6,remainLen:1072
DEBUG: Parser: IN_PACKETDATA
DEBUG: Parser: IN_PACKETDATAAFTER, packet: KEXINIT
DEBUG: Comparing KEXINITs ...
DEBUG: (local) KEX algorithms: curve25519-sha256@libssh.org,curve25519-sha256,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha1
DEBUG: (remote) KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1
DEBUG: KEX algorithm: curve25519-sha256@libssh.org
DEBUG: (local) Host key formats: ssh-ed25519,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-rsa
DEBUG: (remote) Host key formats: ssh-rsa,rsa-sha2-512,rsa-sha2-256,ecdsa-sha2-nistp256,ssh-ed25519
DEBUG: Host key format: ssh-ed25519
DEBUG: (local) Client->Server ciphers: aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm,aes128-gcm@openssh.com,aes256-gcm,aes256-gcm@openssh.com
DEBUG: (remote) Client->Server ciphers: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
DEBUG: Client->Server Cipher: aes128-ctr
DEBUG: (local) Server->Client ciphers: aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm,aes128-gcm@openssh.com,aes256-gcm,aes256-gcm@openssh.com
DEBUG: (remote) Server->Client ciphers: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
DEBUG: Server->Client Cipher: aes128-ctr
DEBUG: (local) Client->Server HMAC algorithms: hmac-sha2-256,hmac-sha2-512,hmac-sha1
DEBUG: (remote) Client->Server HMAC algorithms: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
DEBUG: Client->Server HMAC algorithm: hmac-sha2-256
DEBUG: (local) Server->Client HMAC algorithms: hmac-sha2-256,hmac-sha2-512,hmac-sha1
DEBUG: (remote) Server->Client HMAC algorithms: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
DEBUG: Server->Client HMAC algorithm: hmac-sha2-256
DEBUG: (local) Client->Server compression algorithms: none,zlib@openssh.com,zlib
DEBUG: (remote) Client->Server compression algorithms: none,zlib@openssh.com
DEBUG: Client->Server compression algorithm: none
DEBUG: (local) Server->Client compression algorithms: none,zlib@openssh.com,zlib
DEBUG: (remote) Server->Client compression algorithms: none,zlib@openssh.com
DEBUG: Server->Client compression algorithm: none
DEBUG: Outgoing: Writing KEXECDH_INIT
DEBUG: Parser: IN_PACKETBEFORE (expecting 8)

and again

C:\nodejsprojs\ssh-test2>node test.js
DEBUG: Local ident: 'SSH-2.0-ssh2js0.4.10'
DEBUG: Client: Trying 15.188.191.179 on port 22 ...
DEBUG: Client: Connected
DEBUG: Parser: IN_INIT
DEBUG: Parser: IN_GREETING
DEBUG: Parser: IN_HEADER
DEBUG: Remote ident: 'SSH-2.0-OpenSSH_7.6p1 Ubuntu-4'
DEBUG: Outgoing: Writing KEXINIT

mscdex commented 3 years ago

I'm not sure I understand the situation. Are you changing IPs (on the client?) while connecting to an SSH server, or is it something else?

RunsetTech commented 3 years ago

No, I just create the instance and then attach the static IP. After 120 seconds (to make sure the server has started), I start connecting to the instances one by one.

GeniusLuo commented 3 years ago

I have the same problem after updating from v0.8.9 to v1.0.0. Here is my code:

const { Client } = require('ssh2');
const conn = new Client();
conn.on('ready', () => {
  console.log('Client :: ready');
}).connect({
  host: '10.10.xxx.xx',
  port: 22,
  username: 'xxx',
  password: 'xxxx',
});

the error is:

events.js:291
      throw er; // Unhandled 'error' event
      ^
Error: Timed out while waiting for handshake
    at Timeout._onTimeout (C:\Users\sz_syit249\Desktop\node\node_modules\ssh2\lib\client.js:993:23)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7)
Emitted 'error' event on Client instance at:
    at Timeout._onTimeout (C:\Users\sz_syit249\Desktop\node\node_modules\ssh2\lib\client.js:995:16)
    at listOnTimeout (internal/timers.js:554:17)
    at processTimers (internal/timers.js:497:7) {
  level: 'client-timeout'
}

mscdex commented 3 years ago

@GeniusLuo what does the debug output show?
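
For reference, debug output like the logs earlier in this thread comes from the `debug` option in the connection config, which ssh2 calls with one string per message. A minimal sketch based on the reproduction above: `withDebug` is a hypothetical helper (not part of ssh2), and the 60000 ms `readyTimeout` is an arbitrary illustrative value.

```javascript
// Hypothetical helper (not part of ssh2): returns a copy of a connection
// config with debug logging enabled and a longer handshake timeout.
function withDebug(config, sink = console.log) {
  return {
    ...config,
    // How long (in milliseconds) ssh2 waits for the handshake to complete.
    // 60000 is an arbitrary value chosen for illustration.
    readyTimeout: 60000,
    // ssh2 calls this function with one string per debug message.
    debug: (msg) => sink(msg),
  };
}

// Usage sketch (needs the ssh2 package and a reachable server):
//   const { Client } = require('ssh2');
//   const conn = new Client();
//   conn.on('ready', () => console.log('Client :: ready'))
//       .on('error', (err) => console.error('Client :: error:', err.message))
//       .connect(withDebug({
//         host: '10.10.xxx.xx', port: 22, username: 'xxx', password: 'xxxx',
//       }));
```

Attaching an 'error' listener, as in the usage sketch, also keeps the timeout from surfacing as an unhandled 'error' event.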

mscdex commented 2 years ago

@GeniusLuo can you try the master branch and see if that improves the situation?

Humphaz commented 2 years ago

This is still an issue. I've raised it again here: https://github.com/theophilusx/ssh2-sftp-client/issues/351

But it looks like I should have been raising it here.

This seems to be the prime calculation; it's locking the server for up to 20 seconds while it calculates.

Can this not be pre-calculated somehow, either as part of the install or the initialisation process?

I haven't gone into the SSH code yet to see where it's going slow, but I'm happy to be a tester.

mscdex commented 2 years ago

@Humphaz That is unrelated to the original issue here. The slowdown with diffie-hellman in general is in OpenSSL since many OpenSSL versions ago when they started performing additional checks on user-supplied primes for security reasons. There is no "pre-calculation" that can be done for this (in fact, that'd be more or less like the fixed group diffie-hellman methods).

Your best bet to avoid slowdowns is to use curve25519 where available (which this module currently prioritizes by default) as that doesn't trigger the same kind of checks within OpenSSL.
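
One way to act on this is to pass an explicit key exchange list through the `algorithms.kex` connection option. The following is a minimal sketch, not the module's defaults: `preferFastKex` is a hypothetical helper, and the method names are copied from the debug logs earlier in this thread. It assumes the server offers at least one of these methods; if it offers none, the handshake will fail rather than fall back to diffie-hellman.

```javascript
// Curve25519 and ECDH key exchange methods in preference order. The names
// are taken from the debug logs earlier in this thread.
const FAST_KEX = [
  'curve25519-sha256',
  'curve25519-sha256@libssh.org',
  'ecdh-sha2-nistp256',
  'ecdh-sha2-nistp384',
  'ecdh-sha2-nistp521',
];

// Hypothetical helper (not part of ssh2): returns a copy of a connection
// config whose key exchange list contains only the methods above. Other
// entries under `algorithms` (cipher, hmac, ...) are left untouched.
function preferFastKex(config) {
  return {
    ...config,
    algorithms: { ...config.algorithms, kex: [...FAST_KEX] },
  };
}

// Usage sketch (needs the ssh2 package and a reachable server):
//   conn.connect(preferFastKex({ host: '...', port: 22, username: '...' }));
```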

Humphaz commented 2 years ago

Thanks mscdex, this isn't my comfort zone. 😃

My log looks like this for the handshake:

Handshake: (local) KEX method: ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1
Handshake: (remote) KEX method: diffie-hellman-group16-sha512,diffie-hellman-group14-sha256,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1,diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1
Handshake: KEX algorithm: diffie-hellman-group-exchange-sha256
Handshake: (local) Host key format: ssh-rsa,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521
Handshake: (remote) Host key format: ssh-rsa
Handshake: Host key format: ssh-rsa
Handshake: (local) C->S cipher: aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm,aes128-gcm@openssh.com,aes256-gcm,aes256-gcm@openssh.com
Handshake: (remote) C->S cipher: aes256-cbc,aes256-ctr,3des-cbc,aes128-cbc,aes128-ctr
Handshake: C->S Cipher: aes128-ctr
Handshake: (local) S->C cipher: aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm,aes128-gcm@openssh.com,aes256-gcm,aes256-gcm@openssh.com
Handshake: (remote) S->C cipher: aes256-cbc,aes256-ctr,3des-cbc,aes128-cbc,aes128-ctr
Handshake: S->C cipher: aes128-ctr
Handshake: (local) C->S MAC: hmac-sha2-256,hmac-sha2-512,hmac-sha1
Handshake: (remote) C->S MAC: hmac-sha2-512,hmac-sha2-256,hmac-sha1,hmac-md5,hmac-sha1-96,hmac-md5-96
Handshake: C->S MAC: hmac-sha2-256
Handshake: (local) S->C MAC: hmac-sha2-256,hmac-sha2-512,hmac-sha1
Handshake: (remote) S->C MAC: hmac-sha2-512,hmac-sha2-256,hmac-sha1,hmac-md5,hmac-sha1-96,hmac-md5-96
Handshake: S->C MAC: hmac-sha2-256
Handshake: (local) C->S compression: none,zlib@openssh.com,zlib
Handshake: (remote) C->S compression: zlib,none
Handshake: C->S compression: none
Handshake: (local) S->C compression: none,zlib@openssh.com,zlib
Handshake: (remote) S->C compression: zlib,none
Handshake: S->C compression: none

So I guess it is choosing:

Handshake: KEX algorithm: diffie-hellman-group-exchange-sha256
Handshake: Host key format: ssh-rsa
Handshake: C->S Cipher: aes128-ctr
Handshake: S->C cipher: aes128-ctr
Handshake: C->S MAC: hmac-sha2-256
Handshake: S->C MAC: hmac-sha2-256
Handshake: C->S compression: none
Handshake: S->C compression: none

So which one is it in the remote set?

Handshake: (remote) KEX method: diffie-hellman-group16-sha512,diffie-hellman-group14-sha256,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1,diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1

Is this actually already choosing the one you recommended?

Humphaz commented 2 years ago

Oh. I see, it is the ecdh-sha2-nistp256 and it isn't supported by the remote server.
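
The choice in the log above follows the basic SSH negotiation rule: the first method in the client's (local) list that the server (remote) also offers wins. A minimal sketch of that rule, using the two KEX lists from the log; `negotiate` is a hypothetical function for illustration, not ssh2's implementation, and it ignores the extra conditions the SSH specification attaches to key exchange and host key selection.

```javascript
// Returns the first entry in the client's (local) list that the server
// (remote) also offers, or undefined when the lists share nothing.
function negotiate(local, remote) {
  const offered = new Set(remote);
  return local.find((name) => offered.has(name));
}

// KEX lists copied from the handshake log above.
const localKex = [
  'ecdh-sha2-nistp256',
  'ecdh-sha2-nistp384',
  'ecdh-sha2-nistp521',
  'diffie-hellman-group-exchange-sha256',
  'diffie-hellman-group14-sha1',
];
const remoteKex = [
  'diffie-hellman-group16-sha512',
  'diffie-hellman-group14-sha256',
  'diffie-hellman-group-exchange-sha256',
  'diffie-hellman-group14-sha1',
  'diffie-hellman-group-exchange-sha1',
  'diffie-hellman-group1-sha1',
];

console.log(negotiate(localKex, remoteKex));
// prints "diffie-hellman-group-exchange-sha256"
```

The three ECDH entries come first in the local list but are absent from the remote list, so they are skipped and the group-exchange method is the first match.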

😢

Humphaz commented 2 years ago

OK, so I've used SFTP from the Linux command line. It connects immediately, and it's using all of the same algorithms?

How come that is instantaneous, yet it takes 20 seconds through code? Obviously I know JS is a lot slower than C++, but this seems a bit excessive.

Sorry if I'm making a total noob error here, but surely there is something that can be done?

mscdex commented 2 years ago

@Humphaz Two possibilities:

  1. OpenSSH can/does use their own (non-OpenSSL) crypto-related code
  2. They may only be using the basics of OpenSSL (e.g. bignum support) or using different OpenSSL APIs to perform the various calculations

If you really want to know the answer you'd have to dig into the OpenSSH code and compare it with node's crypto code.

mscdex commented 2 years ago

@Humphaz If you're interested in the OpenSSL commit that changed things, it's here, so it's most likely because of node's use of DH_check() whereas OpenSSH does not appear to use it.

Humphaz commented 2 years ago

I'm beginning to wonder if it's because multiple options are being offered. I'm going to try offering only the one that is used and see if that helps.

Thank you for taking the time out to look at this.

I will let you know how I get on later today.

Does that mean that if I just wanted a hack to get around the issue for now and solve it properly later, I could just skip the check?

mscdex commented 2 years ago

@Humphaz The DH_check() happens in node core, so you would need to make the change there. However, it's not really advisable as it could open you up to security issues. The main problem is OpenSSL's DH_check() does not provide a way to specify which checks to perform. If they did and node publicly exposed this in some way, then it would be possible to bypass some of the arguably overly strong checks being performed.

dhensby commented 2 years ago

I've run into this problem too. Having read the bug report for node, the fix is essentially to use webcrypto in node >= 15 and suck it up for node < 15.

Having looked into some of the KEX code in this library it looks like moving to webcrypto (async based) is going to be a fairly big re-write of the KEX logic as a lot of it relies on sync code. This is also partly causing problems for us when we want to work at scale as the key generation steps are sync and blocking, especially the DH key generation which causes a 20 second block of the event loop and a massive CPU spike. This ends up locking all the other node "threads" while this 20 second key generation happens.
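
To see how long a synchronous crypto step holds the event loop, the call can be wrapped in a timer. This is a minimal sketch: `timeSync` is a hypothetical helper, and key generation for Node's built-in `modp14` group is only a stand-in workload. It is not the group-exchange path described above and is not expected to reproduce the 20-second stall.

```javascript
const nodeCrypto = require('crypto');

// Runs fn synchronously and reports how many milliseconds it took. Nothing
// else on the event loop (timers, sockets, other connections) can run
// during that time.
function timeSync(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, ms };
}

// Stand-in workload: generate a key pair for the built-in 2048-bit
// 'modp14' group and return the public key as a Buffer.
const { result: publicKey, ms } = timeSync(() =>
  nodeCrypto.getDiffieHellman('modp14').generateKeys()
);

console.log(`generated a ${publicKey.length}-byte public key in ${ms.toFixed(1)} ms`);
```

Wrapping the real handshake step the same way would show whether the block comes from key generation or from somewhere else.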

For now, the solution is to make sure the upstream servers support EC based key exchanges, but this isn't always practical as we aren't the administrators of these servers.

@mscdex are there any plans to move to an async based KEX? Would contributions to this end be welcome?