kz8s / tack

Terraform module for creating Kubernetes cluster running on Container Linux by CoreOS in an AWS VPC
MIT License

TLS handshake error and Too many authentication failures #61

Closed nkhine closed 7 years ago

nkhine commented 8 years ago

Hi, I have set up a k8s cluster using tack. All worked well until I started testing by killing EC2 machines and letting them be recreated. Now I am getting these errors in my logs:

Sep 06 19:56:29 ip-10-0-0-10 k8s_kube-apiserver.2f8103e3_kube-apiserver-ip-10-0-0-10.ec2.inte: I0906 19:56:29.379753       1 logs.go:41] http: TLS handshake error from 10.0.2.6:40824: EOF

I can log in to the bastion machine, but I can no longer log in to any of the etcd or worker machines.

The only way to access the bastion machine is by typing:

➜  tack git:(master) ssh -i .keypair/k8s.pem -A core@`terraform output bastion-ip`
CoreOS stable (1068.10.0)
Last login: Thu Sep  8 00:28:48 2016 from 83.216.95.6
core@ip-10-0-0-166 ~ $ 

If I try to log in to one of the etcd machines, I get:

➜  tack git:(master) ssh -i .keypair/k8s.pem -T core@`terraform output bastion-ip` ssh core@10.0.0.10
Pseudo-terminal will not be allocated because stdin is not a terminal.
Permission denied, please try again.
Permission denied, please try again.
Received disconnect from 10.0.0.10 port 22:2: Too many authentication failures
packet_write_wait: Connection to 10.0.0.10 port 22: Broken pipe

What seems to be the issue and is there a way to fix it?

Also what is the correct way to modify the ssh port from 22 to something else?

Any advice is much appreciated.
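
One difference between the two commands above is that the first passes -A (agent forwarding) and the second does not, so the inner ssh started on the bastion may have no key to offer to 10.0.0.10. A minimal sketch of an alternative, assuming the same key is authorized on both the bastion and the node: tunnel through the bastion with ProxyCommand and -W, so the key never has to be present on the bastion. The helper name node_ssh_cmd is hypothetical, and it only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper (not part of tack): print an ssh command that reaches
# a private node through the bastion using ProxyCommand and -W.
# Arguments: path to the private key, bastion address, node address.
node_ssh_cmd() {
  key=$1
  bastion=$2
  node=$3
  # IdentitiesOnly=yes makes ssh offer only the key given with -i,
  # rather than every key held by the agent.
  # %%h and %%p print as %h and %p, which ssh expands to the target host and port.
  printf 'ssh -i %s -o IdentitiesOnly=yes -o ProxyCommand="ssh -i %s -o IdentitiesOnly=yes -W %%h:%%p core@%s" core@%s\n' \
    "$key" "$key" "$bastion" "$node"
}
```

For example, node_ssh_cmd .keypair/k8s.pem "$(terraform output bastion-ip)" 10.0.0.10 prints the full command; both hops authenticate from the local machine with the same key file.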

mirthy commented 8 years ago

It's possible you have too many keys in your ssh agent. Try ssh-add -D to remove them all and then add the keys you need.

nkhine commented 8 years ago

Would this be on my local client or the bastion machine?

mirthy commented 8 years ago

Local client
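
As background to this suggestion: sshd counts every key the client offers as one authentication attempt, and its default MaxAuthTries is 6, so an agent holding many keys can exhaust the limit before the right key is tried. A rough sketch for checking this on the local client follows. The function name count_agent_keys is hypothetical, and it assumes the usual ssh-add -l output format of one identity per line starting with the key's bit length.

```shell
# Hypothetical helper: count identities in `ssh-add -l` output read from stdin.
# Assumes each identity line begins with the key size, e.g. "2048 SHA256:...".
# The "The agent has no identities." message does not match and counts as 0.
count_agent_keys() {
  grep -c '^[0-9][0-9]* '
}

# Example (run on the local client, not the bastion):
#   ssh-add -l | count_agent_keys     # how many keys the agent would offer
#   ssh-add -D                        # remove all identities from the agent
#   ssh-add .keypair/k8s.pem          # add back only the cluster key
```

If the count is close to or above 6, trimming the agent or passing -o IdentitiesOnly=yes together with -i is worth trying.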

nkhine commented 8 years ago

I have removed all the keys and tried again, but I am still not able to log in. The same happened on a new cluster I set up a couple of days ago: initially I was able to log in, but after a while SSH on port 22 fails with the above error.

I did run a terraform plan after I had created the cluster, though. Would this have changed the .pem files?

guiocavalcanti commented 8 years ago

@nkhine try to connect using:

ssh -i .keypair/k8s.pem -o 'IdentitiesOnly yes' -T core@`terraform output bastion-ip` ssh core@10.0.0.10 -v

nkhine commented 8 years ago

➜  tack git:(master) ssh -i .keypair/k8s.pem -o 'IdentitiesOnly yes' -T core@`terraform output bastion-ip` ssh core@10.0.0.10 -v
OpenSSH_7.2p2, OpenSSL 1.0.2h  3 May 2016
debug1: Reading configuration data /etc/ssh/ssh_config
Pseudo-terminal will not be allocated because stdin is not a terminal.
debug1: Connecting to 10.0.0.10 [10.0.0.10] port 22.
debug1: Connection established.
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_rsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_rsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_dsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_dsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_ecdsa type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_ecdsa-cert type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_ed25519 type -1
debug1: key_load_public: No such file or directory
debug1: identity file /home/core/.ssh/id_ed25519-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.2
debug1: Remote protocol version 2.0, remote software version OpenSSH_7.2
debug1: match: OpenSSH_7.2 pat OpenSSH* compat 0x04000000
debug1: Authenticating to 10.0.0.10:22 as 'core'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: curve25519-sha256@libssh.org
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:URr4b2quuz2Uu1N4Hh4fpWFxqbFO91m3O8VB+5zbXp4
debug1: Host '10.0.0.10' is known and matches the ECDSA host key.
debug1: Found key in /home/core/.ssh/known_hosts:1
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_EXT_INFO received
debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug1: Next authentication method: publickey
debug1: Trying private key: /home/core/.ssh/id_rsa
debug1: Trying private key: /home/core/.ssh/id_dsa
debug1: Trying private key: /home/core/.ssh/id_ecdsa
debug1: Trying private key: /home/core/.ssh/id_ed25519
debug1: Next authentication method: keyboard-interactive
debug1: read_passphrase: can't open /dev/tty: No such device or address
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug1: read_passphrase: can't open /dev/tty: No such device or address
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug1: read_passphrase: can't open /dev/tty: No such device or address
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug1: Next authentication method: password
debug1: read_passphrase: can't open /dev/tty: No such device or address
debug1: Authentications that can continue: publickey,password,keyboard-interactive
Permission denied, please try again.
debug1: read_passphrase: can't open /dev/tty: No such device or address
debug1: Authentications that can continue: publickey,password,keyboard-interactive
Permission denied, please try again.
debug1: read_passphrase: can't open /dev/tty: No such device or address
Received disconnect from 10.0.0.10 port 22:2: Too many authentication failures
debug1: Authentication succeeded (password).
Authenticated to 10.0.0.10 ([10.0.0.10]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network
packet_write_wait: Connection to 10.0.0.10 port 22: Broken pipe

nkhine commented 8 years ago

in my logs i see:

Oct 01 00:51:00 ip-10-0-0-12 k8s_kube-apiserver.2f8103e3_kube-apiserver-ip-10-0-0-12.ec2.inte: I1001 00:51:00.218195       1 logs.go:41] http: TLS handshake error from 10.0.2.193:42576: tls: client offered an unsupported, maximum protocol version of 301 

Oct 01 02:34:36 ip-10-0-0-12 k8s_kube-apiserver.2f8103e3_kube-apiserver-ip-10-0-0-12.ec2.inte: I1001 02:34:36.112906       1 logs.go:41] http: TLS handshake error from 10.0.2.193:44146: tls: no cipher suite supported by both client and server 

Oct 01 03:27:49 ip-10-0-0-11 k8s_kube-apiserver.2f8103e3_kube-apiserver-ip-10-0-0-11.ec2.inte: I1001 03:27:49.086334       1 logs.go:41] http: TLS handshake error from 10.0.0.91:19279: EOF

wellsie commented 8 years ago

77 changes the ssh behavior so this should no longer be a problem.

nkhine commented 8 years ago

thanks

nkhine commented 8 years ago

@wellsie I have pulled the latest master and installed a new cluster. I am able to log in to the bastion machine but not to any of the other machines. In the logs I get these:

Oct 17 21:49:19 ip-10-0-10-10 k8s_kube-apiserver.63910579_kube-apiserver-ip-10-0-10-10.ec2.int: I1017 21:49:19.935343       1 logs.go:41] http: TLS handshake error from 10.0.2.111:41310: tls: oversized record received with length 51927 

Oct 18 04:47:12 ip-10-0-10-11 k8s_kube-apiserver.63910579_kube-apiserver-ip-10-0-10-11.ec2.int: I1018 04:47:12.024452       1 logs.go:41] http: TLS handshake error from 10.0.2.111:34894: tls: client offered an unsupported, maximum protocol version of 300 

Oct 18 04:48:25 ip-10-0-10-10 k8s_kube-apiserver.63910579_kube-apiserver-ip-10-0-10-10.ec2.int: I1018 04:48:25.633079       1 logs.go:41] http: TLS handshake error from 10.0.2.111:47172: tls: no cipher suite supported by both client and server 

Oct 18 07:14:54 ip-10-0-10-12 k8s_kube-apiserver.63910579_kube-apiserver-ip-10-0-10-12.ec2.int: I1018 07:14:54.371235       1 logs.go:41] http: TLS handshake error from 10.0.0.94:48699: EOF 
Oct 18 07:14:55 ip-10-0-10-11 k8s_kube-apiserver.63910579_kube-apiserver-ip-10-0-10-11.ec2.int: I1018 07:14:55.729419       1 logs.go:41] http: TLS handshake error from 10.0.0.94:17311: EOF

wellsie commented 8 years ago

Does make ssh work?

nkhine commented 8 years ago

yes it works

➜  tack git:(master) ✗ make ssh                      (git)-[master] 
Agent pid 5108
Identity added: .keypair/kz8s-test.pem (.keypair/kz8s-test.pem)
CoreOS stable (1122.2.0)
Last login: Wed Oct 19 07:50:09 2016 from 10.0.0.167
core@ip-10-0-10-10 ~ $ 

nkhine commented 8 years ago

but what about the warnings i get in the logs?
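
A note on reading these messages: the kube-apiserver is a Go program, and Go's TLS library appears to print the client's maximum protocol version in hexadecimal, so "301" and "300" in the logs above would correspond to wire versions 0x0301 and 0x0300. Under that assumption, a small lookup sketch (the function name tls_version_name is hypothetical) maps the logged number to a protocol name:

```shell
# Hypothetical helper: map the hex version number from a
# "maximum protocol version of N" log line to a protocol name.
# Assumes N is the TLS wire version printed in hexadecimal.
tls_version_name() {
  case "$1" in
    300) echo "SSL 3.0" ;;
    301) echo "TLS 1.0" ;;
    302) echo "TLS 1.1" ;;
    303) echo "TLS 1.2" ;;
    304) echo "TLS 1.3" ;;
    *)   echo "unknown" ;;
  esac
}
```

Read this way, the two version errors in the thread come from clients that offered at most TLS 1.0 and SSL 3.0 respectively, which the server refused.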

jaigouk commented 8 years ago

For me, it was the security group: https://forums.aws.amazon.com/thread.jspa?threadID=66813. I had missed specifying a source address (or 0.0.0.0/0, to allow connections to the instance from anywhere).

wellsie commented 8 years ago

@nkhine - are you still seeing this ? I cannot repro.

nkhine commented 8 years ago

@wellsie yes, i just installed a new cluster based on the latest code and am still seeing these:

Oct 25 11:42:37 ip-10-8-10-12 k8s_kube-apiserver.7be8057c_kube-apiserver-ip-10-8-10-12.ec2.int: I1025 11:42:37.397978       1 logs.go:41] http: TLS handshake error from 10.8.0.54:21975: EOF 

10.8.0.54:21975 does not exist on my cluster