gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

Access agentless nodes using hostname with `ssh` as client #31281

Closed marcoandredinis closed 6 months ago

marcoandredinis commented 11 months ago

Expected behavior: Doing tsh ssh <user>@<host> should be the same as doing ssh <user>@<host>.<cluster> after correctly configuring ssh client.

Current behavior: tsh ssh root@ip-172-31-24-131 works, but ssh root@ip-172-31-24-131.lenix fails:

> ssh root@ip-172-31-24-131.lenix
ERROR: ssh: subsystem request failed

kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535

After doing tctl nodes ls we get the node's name and try ssh client again: ✅ ssh root@7e0e8d6a-a6b9-45a5-85de-540dc9527467.lenix

Bug details:

This also applies to remote clusters. Trying to access an agentless node using ssh fails.

r0mant commented 11 months ago

@lxea @capnspacehook Shouldn't agentless mode support dialing by hostname? Is there a bug?

r0mant commented 11 months ago

@marcoandredinis What happens if you try to connect using ssh root@ip-172-31-24-131?

marcoandredinis commented 11 months ago

> @marcoandredinis What happens if you try to connect using ssh root@ip-172-31-24-131?

I'll try that as well. But my expectation is that it won't match the pattern we define in the ssh config (shown below), so ssh will try to access the ip-172-31-24-131 node directly.

Host *.{{ .ClusterName }} !{{ .ProxyHost }}
    ... tsh proxy ssh ...
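To illustrate why a bare hostname would skip the Teleport `ProxyCommand`, here is a minimal Python sketch of how a `Host` line with a negated pattern is evaluated. It assumes the cluster name `lenix` from the report and a hypothetical proxy host `proxy.lenix`. `fnmatch` only approximates OpenSSH's own matcher (for example, it also accepts `[...]` character classes, which ssh_config patterns do not).

```python
from fnmatch import fnmatchcase


def host_line_matches(patterns, host):
    """Approximate ssh_config `Host` matching.

    A negated pattern that matches rejects the host outright; otherwise at
    least one positive pattern must match.
    """
    matched = False
    for pattern in patterns:
        if pattern.startswith("!"):
            if fnmatchcase(host, pattern[1:]):
                return False
        elif fnmatchcase(host, pattern):
            matched = True
    return matched


# "lenix" is the cluster name from the report; "proxy.lenix" is a hypothetical proxy host.
patterns = ["*.lenix", "!proxy.lenix"]
for host in ("ip-172-31-24-131.lenix", "ip-172-31-24-131", "proxy.lenix"):
    print(host, host_line_matches(patterns, host))
```

Under these assumptions only the name with the cluster suffix matches the block, so a bare `ip-172-31-24-131` would be handled as an ordinary ssh connection.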

marcoandredinis commented 10 months ago

New setup (just to make sure I didn't mess up previously, or that I did it twice :sweat_smile:)

Cluster lenixleaf - SSH node with Teleport Agent
mynode - SSH Node in Agentless Mode (added using teleport join openssh)

dinis@lenix ~/p/ssh-combinations (master)> tsh ssh dinis@lenixleaf exit
dinis@lenix ~/p/ssh-combinations (master)> ssh dinis@lenixleaf.lenixleaf exit
dinis@lenix ~/p/ssh-combinations (master)> tsh ssh root@mynode exit
dinis@lenix ~/p/ssh-combinations (master)> ssh root@mynode.lenixleaf exit
ERROR: ssh: subsystem request failed

kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
dinis@lenix ~/p/ssh-combinations (master) [255]> ssh root@mynode exit
ssh: Could not resolve hostname mynode: Name or service not known

Logs when I try to connect using ssh root@mynode.lenixleaf exit:

2023-09-01T11:03:43+01:00 DEBU [PROXY]     conn(127.0.0.1:36156->127.0.0.2:13080, user=root) auth attempt fingerprint:ssh-rsa-cert-v01@openssh.com SHA256:iBWTODZvOQ1c8mz1FcN281TlNP1DT9q+A/sa0mLisLM local:127.0.0.2:13080 remote:127.0.0.1:36156 user:root srv/authhandlers.go:304
2023-09-01T11:03:43+01:00 DEBU [PROXY]     conn(127.0.0.1:36156->127.0.0.2:13080, user=root) auth attempt with key ssh-rsa-cert-v01@openssh.com SHA256:iBWTODZvOQ1c8mz1FcN281TlNP1DT9q+A/sa0mLisLM, &ssh.Certificate{Nonce:[]uint8{0xe1, 0x73, 0x24, 0xa7, 0xa7, 0xbe, 0xe2, 0xee, 0xb0, 0x81, 0x47, 0x77, 0x24, 0x6d, 0x18, 0xf6, 0xfc, 0x7, 0x7, 0x6f, 0x75, 0xfe, 0xa7, 0xc5, 0x91, 0x78, 0xed, 0xb1, 0xbc, 0xe5, 0x2e, 0x95}, Key:(*ssh.rsaPublicKey)(0xc003d889d0), Serial:0x0, CertType:0x1, KeyId:"marco", ValidPrincipals:[]string{"root", "dinis", "-teleport-internal-join"}, ValidAfter:0x64f1b518, ValidBefore:0x64f25e14, Permissions:ssh.Permissions{CriticalOptions:map[string]string{}, Extensions:map[string]string{"login-ip":"127.0.0.1", "permit-agent-forwarding":"", "permit-port-forwarding":"", "permit-pty":"", "private-key-policy":"none", "teleport-roles":"{\"version\":\"v1\",\"roles\":[\"editor\",\"access\"]}", "teleport-route-to-cluster":"lenixleaf", "teleport-traits":"{\"logins\":[\"root\",\"dinis\"]}"}}, Reserved:[]uint8{}, SignatureKey:(*ssh.rsaPublicKey)(0xc003d88a10), Signature:(*ssh.Signature)(0xc004ae3180)} fingerprint:ssh-rsa-cert-v01@openssh.com SHA256:iBWTODZvOQ1c8mz1FcN281TlNP1DT9q+A/sa0mLisLM local:127.0.0.2:13080 remote:127.0.0.1:36156 user:root srv/authhandlers.go:307
2023-09-01T11:03:43+01:00 DEBU [PROXY]     Successfully authenticated fingerprint:ssh-rsa-cert-v01@openssh.com SHA256:iBWTODZvOQ1c8mz1FcN281TlNP1DT9q+A/sa0mLisLM local:127.0.0.2:13080 remote:127.0.0.1:36156 user:root srv/authhandlers.go:394
2023-09-01T11:03:43+01:00 DEBU [SSH:PROXY] Incoming connection 127.0.0.1:36156 -> 127.0.0.2:13080 version: SSH-2.0-Go, certtype: "user" sshutils/server.go:500
2023-09-01T11:03:43+01:00 DEBU [PROXY]     Handling request auth-agent-req@openssh.com, want reply true. id:21 local:127.0.0.2:13080 login:root remote:127.0.0.1:36156 teleportUser:marco regular/sshserver.go:1576
2023-09-01T11:03:43+01:00 DEBU [KEEPALIVE] Starting keep-alive loop with interval 5m0s and max count 3. srv/keepalive.go:64
2023-09-01T11:03:43+01:00 WARN             agent forwarding to proxy only supported in recording mode regular/sshserver.go:1595
2023-09-01T11:03:43+01:00 DEBU [PROXY]     Handling request subsystem, want reply true. id:21 local:127.0.0.2:13080 login:root remote:127.0.0.1:36156 teleportUser:marco regular/sshserver.go:1576
2023-09-01T11:03:43+01:00 DEBU [NODE]      parse_proxy_subsys("proxy:mynode:3022@lenixleaf") regular/proxy.go:70
2023-09-01T11:03:43+01:00 DEBU [NODE]      newProxySubsys({default mynode 3022 lenixleaf}). regular/proxy.go:182
2023-09-01T11:03:43+01:00 DEBU [PROXY]     Subsystem request: proxySubsys(cluster=default/lenixleaf, host=mynode, port=3022). id:21 local:127.0.0.2:13080 login:root remote:127.0.0.1:36156 teleportUser:marco regular/sshserver.go:1837
2023-09-01T11:03:43+01:00 DEBU [SUBSYSTEM] Starting subsystem trace.fields:map[dst:127.0.0.2:13080 src:127.0.0.1:36156 subsystem:proxySubsys(cluster=default/lenixleaf, host=mynode, port=3022)] regular/proxy.go:214
2023-09-01T11:03:43+01:00 DEBU [SUBSYSTEM] proxy connecting to host=mynode port=3022, exact port=true trace.fields:map[dst:127.0.0.2:13080 src:127.0.0.1:36156 subsystem:proxySubsys(cluster=default/lenixleaf, host=mynode, port=3022)] regular/proxy.go:247
2023-09-01T11:03:43+01:00 WARN [PROXY]     Subsystem request proxySubsys(cluster=default/lenixleaf, host=mynode, port=3022) failed: direct dialing to nodes not found in inventory is not supported. id:21 local:127.0.0.2:13080 login:root remote:127.0.0.1:36156 teleportUser:marco regular/sshserver.go:1841
2023-09-01T11:03:43+01:00 ERRO             failure handling SSH "subsystem" request error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError direct dialing to nodes not found in inventory is not supported
Stack Trace:
    github.com/gravitational/teleport/lib/proxy/router.go:288 github.com/gravitational/teleport/lib/proxy.(*Router).DialHost
    github.com/gravitational/teleport/lib/srv/regular/proxy.go:257 github.com/gravitational/teleport/lib/srv/regular.(*proxySubsys).proxyToHost
    github.com/gravitational/teleport/lib/srv/regular/proxy.go:224 github.com/gravitational/teleport/lib/srv/regular.(*proxySubsys).Start
    github.com/gravitational/teleport/lib/srv/regular/sshserver.go:1840 github.com/gravitational/teleport/lib/srv/regular.(*Server).handleSubsystem
    github.com/gravitational/teleport/lib/srv/regular/sshserver.go:1585 github.com/gravitational/teleport/lib/srv/regular.(*Server).dispatch
    github.com/gravitational/teleport/lib/srv/regular/sshserver.go:1544 github.com/gravitational/teleport/lib/srv/regular.(*Server).handleSessionRequests
    runtime/asm_amd64.s:1650 runtime.goexit
User Message: direct dialing to nodes not found in inventory is not supported] regular/sshserver.go:2051
2023-09-01T11:03:43+01:00 DEBU [PROXY]     Releasing associated resources - context has been closed. id:21 local:127.0.0.2:13080 login:root remote:127.0.0.1:36156 teleportUser:marco srv/monitor.go:397
2023-09-01T11:03:43+01:00 DEBU [SSH:PROXY] Closed connection 127.0.0.1:36156. sshutils/server.go:505
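The subsystem request in the log above encodes the target as `proxy:<host>:<port>@<cluster>`. The sketch below splits such a string to make the requested port visible. It is an illustrative parser written for this thread, not Teleport's own `parse_proxy_subsys`, and it only handles the exact shape seen in the log.

```python
from typing import NamedTuple


class ProxyTarget(NamedTuple):
    host: str
    port: int
    cluster: str


def parse_proxy_request(request: str) -> ProxyTarget:
    """Split a 'proxy:<host>:<port>@<cluster>' string (shape taken from the log)."""
    prefix = "proxy:"
    if not request.startswith(prefix):
        raise ValueError(f"not a proxy request: {request!r}")
    target, _, cluster = request[len(prefix):].partition("@")
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit() or not cluster:
        raise ValueError(f"unexpected shape: {request!r}")
    return ProxyTarget(host, int(port), cluster)


target = parse_proxy_request("proxy:mynode:3022@lenixleaf")
print(target)
```

The request asks for port 3022, which is consistent with the later observation in this thread that passing the sshd port explicitly with -p makes the connection succeed.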

I edited ~/.ssh/config to add the --debug flag to tsh proxy ssh and got the following logs:

2023-09-01T11:05:05+01:00 INFO [CLIENT]    ALPN connection upgrade required for "127.0.0.2.nip.io:13080": false. client/api.go:723
2023-09-01T11:05:05+01:00 ERRO [CLIENT]    [KEY AGENT] Unable to connect to SSH agent on socket: "". error:[
ERROR REPORT:
Original Error: *net.OpError dial unix: missing address
Stack Trace:
    github.com/gravitational/teleport/lib/utils/agentconn/agent_unix.go:32 github.com/gravitational/teleport/lib/utils/agentconn.Dial
    github.com/gravitational/teleport/lib/client/api.go:4552 github.com/gravitational/teleport/lib/client.connectToSSHAgent
    github.com/gravitational/teleport/lib/client/keyagent.go:135 github.com/gravitational/teleport/lib/client.NewLocalAgent
    github.com/gravitational/teleport/lib/client/api.go:1115 github.com/gravitational/teleport/lib/client.NewClient
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3396 github.com/gravitational/teleport/tool/tsh/common.makeClientForProxy
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3381 github.com/gravitational/teleport/tool/tsh/common.makeClient
    github.com/gravitational/teleport/tool/tsh/common/proxy.go:63 github.com/gravitational/teleport/tool/tsh/common.onProxyCommandSSH
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:1360 github.com/gravitational/teleport/tool/tsh/common.Run
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:540 github.com/gravitational/teleport/tool/tsh/common.Main
    github.com/gravitational/teleport/tool/tsh/main.go:24 main.main
    runtime/proc.go:267 runtime.main
    runtime/asm_amd64.s:1650 runtime.goexit
User Message: dial unix: missing address] client/api.go:4554
2023-09-01T11:05:05+01:00 DEBU [KEYSTORE]  Reading certificates from path "/home/dinis/.tsh/keys/127.0.0.2.nip.io/marco-ssh/lenixleaf-cert.pub". client/keystore.go:354
2023-09-01T11:05:05+01:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-09-01 21:56:36 +0000 UTC". client/client_store.go:106
2023-09-01T11:05:05+01:00 INFO [KEYAGENT]  Loading SSH key for user "marco" and cluster "lenixleaf". client/keyagent.go:196
2023-09-01T11:05:05+01:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-09-01 21:56:36 +0000 UTC". client/client_store.go:106
2023-09-01T11:05:05+01:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-09-01 21:56:36 +0000 UTC". client/client_store.go:106
2023-09-01T11:05:05+01:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2023-09-01 21:56:36 +0000 UTC". client/client_store.go:106
2023-09-01T11:05:05+01:00 DEBU [KEYAGENT]  "Checking key: ssh-rsa-cert-v01@openssh.com AAAAHHNzaC1yc2EtY2VydC12MDFAb3BlbnNzaC5jb20AAAAg0UtEXmvYvs2NnB66ZyU6R10PgtohjgMYsR5adyk+dgYAAAADAQABAAABAQDtMIWU2UNgE/Zwb7f2GViAXLWk+23ptAojLC/aGRd4KxHmdKBlZOW1Li4qrMza0EFTpbx9N1LG1nffAtcLMDhv/zsiMpzi+9ZrKNnpsx4VUYPCdK+fgCoW6cHcgeggmKPMz/t/bE5uTBfHmOY2Kl/9AAje4RFis/l0KSsjxyBhhl4PvC2zU3osv9v2xCj1ImcZTKVBdWzMhzjq0C9JkMvHdc2M91Ny8bEyBy8NhpaRvkhtTzfcFYp7aas2sXyP75Ux8FuUMpykglpop9+y+/CNeIxy+/Quduedqkh3qFDlobp13jlmErkwsPvHvuYwvsNyi+Ln3yUJAAEvHq+RxhvXAAAAAAAAAAAAAAACAAAAAAAAAN8AAAAuODdlZDUyNzEtNGEwNy00MDMwLWE5MjUtNjgyZTk1YzAyNTQ2Lmxlbml4bGVhZgAAACQ4N2VkNTI3MS00YTA3LTQwMzAtYTkyNS02ODJlOTVjMDI1NDYAAAATbGVuaXhsZWFmLmxlbml4bGVhZgAAAAlsZW5peGxlYWYAAAAJbG9jYWxob3N0AAAACTEyNy4wLjAuMQAAAAM6OjEAAAAQMTI3LjAuMC4yLm5pcC5pbwAAAChyZW1vdGUua3ViZS5wcm94eS50ZWxlcG9ydC5jbHVzdGVyLmxvY2FsAAAAAGTxs4T//////////wAAAAAAAABJAAAAFHgtdGVsZXBvcnQtYXV0aG9yaXR5AAAADQAAAAlsZW5peGxlYWYAAAAPeC10ZWxlcG9ydC1yb2xlAAAACQAAAAVQcm94eQAAAAAAAAEXAAAAB3NzaC1yc2EAAAADAQABAAABAQCvHgRzW8LYczdnEvJUNyDK8eN0ojMfNxeB04piR11FfiMPuOXgaM6fWUTxgDpiv1kgWBRUlYPdG/tM/uWuo556VYCStT1MIejCF+dAKflwsvMRlPQVhebOIZplZkwHNbb2edtjrd24lcCWy2N9nxdef2IUjkz4aebzKNkLZT6ToR/KtgTlZgTR6A1Ovz5NaAAQZ7VG1sZKhy8EjDsLNnJ47pYERq2Wv4wkC7+Z6fbHsFnf8e+Y2hm/qyd4aMIKCGmic/V0O/ovYbRJi5gju+dukaUnlA28HNCL/ADvcM2JZlS5EJZR6qPjvcFzn+TvK0JXWwh7MiZ54R+xZ+aKbpmtAAABFAAAAAxyc2Etc2hhMi01MTIAAAEAG6R6DHqF8dD3es1DXF6rTUQ5zu1W6z6zYAfHK3+IR9jc6lSHDYAsdKj3ohPOxz8zc7yNBORAg2ej1JurXIv5GbKXFB8gvYsFlWQnp91Qad0Jn3NzU0xMYuY36DcoCSJaAu/JlQkUpFyMPCZpZu8pu/t+XI1462DHH1nwQWNs26mVCq55WZ5yUyLWt0YgTagETID0GhtKmSfWIQ/mTbJGS0vQ25G04lyelCgnUPDqi2DDZlIgse4tUL8H5h8dUs0gtYrSyDyGO7gRR91Sn7cSEhVrLLnYqDHbrCtPuTKu3K/G3gdv4jIz3L7+iAG748+JAv5LSdZa0lG/JHPyfq4YvA==\n." client/keyagent.go:368
2023-09-01T11:05:05+01:00 DEBU [KEYAGENT]  Validated host 127.0.0.2.nip.io:13080. client/keyagent.go:374

ERROR REPORT:
Original Error: *errors.errorString ssh: subsystem request failed
Stack Trace:
    github.com/gravitational/teleport/api@v0.0.0/observability/tracing/ssh/session.go:183 github.com/gravitational/teleport/api/observability/tracing/ssh.(*Session).RequestSubsystem
    github.com/gravitational/teleport/tool/tsh/common/proxy.go:228 github.com/gravitational/teleport/tool/tsh/common.sshProxy
    github.com/gravitational/teleport/tool/tsh/common/proxy.go:80 github.com/gravitational/teleport/tool/tsh/common.onProxyCommandSSH.func1
    github.com/gravitational/teleport/lib/client/api.go:570 github.com/gravitational/teleport/lib/client.RetryWithRelogin
    github.com/gravitational/teleport/tool/tsh/common/proxy.go:68 github.com/gravitational/teleport/tool/tsh/common.onProxyCommandSSH
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:1360 github.com/gravitational/teleport/tool/tsh/common.Run
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:540 github.com/gravitational/teleport/tool/tsh/common.Main
    github.com/gravitational/teleport/tool/tsh/main.go:24 main.main
    runtime/proc.go:267 runtime.main
    runtime/asm_amd64.s:1650 runtime.goexit
User Message: ssh: subsystem request failed
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535

marcoandredinis commented 10 months ago

The same happens when accessing nodes in a leaf cluster using a login in the root cluster. For Teleport nodes, I can connect using the WebUI, tsh and ssh. For agentless nodes, I can connect using the WebUI and tsh, but not ssh.

lxea commented 10 months ago

I was able to connect to an agentless leaf node by passing an explicit port:

$  ssh -F ~/ssh_config_teleport -p 22 agentless_node.lan.leafcluster

However, subsequent reconnections report that the remote host key has changed, for some reason:

$ ssh -F ~/ssh_config_teleport -p 22 alex@suse.lan.weareacluster
The authenticity of host 'agentless_node.lan.leafcluster (<no hostip for proxy command>)' can't be established.
RSA key fingerprint is SHA256:Qy+aE9WQa63W7GomQv1f3nE78dgkcNJm3/i7PH36RqE.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'agentless_node.lan.leafcluster' (RSA) to the list of known hosts.
Last login: Fri Sep  1 11:44:19 IST 2023 from 192.168.122.24 on ssh
Have a lot of fun...
alex@suse:~> logout
Connection to agentless_node.lan.leafcluster closed.
alex@corn ~
$ ssh -F ~/ssh_config_teleport -p 22 agentless_node.lan.leafcluster
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:dmDmF9gfrPXdTuH7hILRYujJ9VEqPIqtYpalCrhuCjc.
Please contact your system administrator.
Add correct host key in /home/alex/.tsh/known_hosts to get rid of this message.
Offending RSA key in /home/alex/.tsh/known_hosts:3
Host key for agentless_node.lan.leafcluster has changed and you have requested strict checking.
Host key verification failed.

marcoandredinis commented 10 months ago

I can confirm the exact same behaviour :+1: Passing the host's port works, but the 2nd access fails (same error you got).

lxea commented 10 months ago

I can connect okay when using an agentless node on a local cluster like ssh -F config_file -p 22 alex@node.localcluster

According to the guide (https://goteleport.com/docs/server-access/guides/openssh/#step-33-connect-to-your-sshd-host), having to pass the port seems to be expected. However, when connecting to a node on a leaf cluster you seem to be expected to include .leafcluster.rootcluster in the hostname. That gives the same error, and passing -p 22 doesn't work with it either.

r0mant commented 10 months ago

@marcoandredinis @lxea @capnspacehook Is this a regression in behavior compared to v13 and v12?

marcoandredinis commented 10 months ago

I'm getting the same thing using v13:

dinis@lenix ~/p/ssh-combinations (master)> tsh version
Teleport v13.3.7 git:api/v13.3.7-37-gdb1dc70117 go1.21.0
Proxy version: 13.3.7
Proxy: 127.0.0.2.nip.io:13080
dinis@lenix ~/p/ssh-combinations (master)> ssh -F ssh_config_teleport root@mynode.lenixleaf
ERROR: ssh: subsystem request failed

kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
dinis@lenix ~/p/ssh-combinations (master) [255]> ssh -F ssh_config_teleport root@mynode.lenixleaf -p 4444
Welcome to Ubuntu 22.04.3 LTS (GNU/Linux 6.5.1 x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

This system has been minimized by removing packages and content that are
not required on a system that users do not log into.

To restore this content, you can run the 'unminimize' command.
Last login: Tue Sep  5 17:25:39 2023 from 127.0.0.1
root@mynode:~# 

V12 seems to have a different onboarding flow. Should I test there as well?

r0mant commented 10 months ago

@marcoandredinis Yeah, can you please check v12 as well? To my knowledge, the client experience shouldn't really have changed with agentless, just how you register the server with the cluster.

marcoandredinis commented 10 months ago

I might be missing something but can you actually register a node in v12? In the sense that it actually appears as a Teleport resource.

I followed this guide https://goteleport.com/docs/ver/12.x/server-access/guides/openssh/ but you don't end up with a node registration (as in, you don't see it when listing servers in the UI/CLI). So, while you get access to it using either ssh or tsh ssh, you always need to pass the port.


As a summary, I think the issue is that, given that you can do tsh ssh <user>@<host>, you should also be able to do ssh -F ssh_config_teleport <user>@<host>. That doesn't work, and for ssh you must pass the port. Not a regression from v13, tho. The docs do have a reference for using the port, which kind of makes all of this ok.

Using a single cluster everything works if you pass the port in the ssh command.

For trusted clusters, if you try to access a node in a leaf cluster using ssh, it works the first time but fails the second time (this was also observed by Alex):

dinis@lenix ~/p/ssh-combinations (master)> ssh -F ssh_config_teleport_root root@mynode.lenixleaf -p 4444 exit
The authenticity of host '[mynode.lenixleaf]:4444 (<no hostip for proxy command>)' can't be established.
RSA key fingerprint is SHA256:yJsOdyS81xN0rMgypI5AG7DFmAOcxaPvyPErYB7UArk.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[mynode.lenixleaf]:4444' (RSA) to the list of known hosts.
dinis@lenix ~/p/ssh-combinations (master)> ssh -F ssh_config_teleport_root root@mynode.lenixleaf -p 4444 exit
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
SHA256:iJ7GUeuN5B5hVnh8nI+oiWFp0qC1uxpx+6yCd6p/zJw.
Please contact your system administrator.
Add correct host key in /home/dinis/.tsh/known_hosts to get rid of this message.
Offending RSA key in /home/dinis/.tsh/known_hosts:3
Host key for [mynode.lenixleaf]:4444 has changed and you have requested strict checking.
Host key verification failed.
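The two transcripts above record the host under different names in known_hosts: `agentless_node.lan.leafcluster` when connecting on port 22 and `[mynode.lenixleaf]:4444` when connecting on port 4444. A small sketch of that OpenSSH naming convention, assuming 22 is the default port and no `HostKeyAlias` is set:

```python
DEFAULT_SSH_PORT = 22


def known_hosts_name(host: str, port: int = DEFAULT_SSH_PORT) -> str:
    """Return the name OpenSSH records in known_hosts.

    The bare host is used on the default port, '[host]:port' otherwise.
    """
    if port == DEFAULT_SSH_PORT:
        return host
    return f"[{host}]:{port}"


print(known_hosts_name("agentless_node.lan.leafcluster", 22))
print(known_hosts_name("mynode.lenixleaf", 4444))
```

This only explains the entry names; it does not explain why the key stored under that name stops matching on the second connection.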

capnspacehook commented 10 months ago

OpenSSH nodes can't be registered pre v13, that's right. I'm assuming @r0mant just meant connecting to OpenSSH nodes with tsh or ssh shouldn't have changed in behavior from v12 to v13+.

I can repro the host key changing bug on a remote cluster though.

Joerger commented 10 months ago

I got curious about this Issue after reading @capnspacehook's status so I took a look as well.

It looks like the agentless node public key exchange is passing the root cluster's Host CA instead of the leaf cluster's. I suspect this is because we are connecting to the leaf cluster through the root Proxy.

In the example below, you can see agentless nodes from both leaf and root pass the root cluster's ssh public key - ssh-rsa SHA256:3mJp7uQQBpToHZ2FN3Fu2rqgo1hwI98bGwK9l0iDeJY

> ssh -v -p 22 agentless-node.leaf.example.com
...
debug1: Server host certificate: ssh-rsa-cert-v01@openssh.com SHA256:8r4gvTFHy8Eb9AlGXku7juX5TGdOmXandxy/9LV4qZ4, serial 0 ID "" CA ssh-rsa SHA256:3mJp7uQQBpToHZ2FN3Fu2rqgo1hwI98bGwK9l0iDeJY valid after 2023-09-08T19:19:17
debug1: No matching CA found. Retry with plain key
...

> ssh -v -p 22 agentless-node.root.example.com
...
debug1: Server host certificate: ssh-rsa-cert-v01@openssh.com SHA256:N1t5HtAUN8dchHRjHta4X8Obp4CqknpUAKujP9S3QvY, serial 0 ID "" CA ssh-rsa SHA256:3mJp7uQQBpToHZ2FN3Fu2rqgo1hwI98bGwK9l0iDeJY valid after 2023-09-08T19:22:07
debug1: Host 'agentless-node.root.example.com' is known and matches the RSA-CERT host certificate.
...

If you update the Root CA entry in ~/.tsh/known_hosts to include the hostname *.leaf.example.com, then you'll get this error instead:

debug1: Found CA key in /home/bjoerger/.tsh/known_hosts:1
Certificate invalid: name is not a listed principal

A better workaround is to jumphost through the leaf cluster proxy. Here's a scuffed example, it'd be cleaner using Proxy Templates:

> cat ~/.ssh/config
...
Host *.leaf.example.com !root.example.com agentless-node
    Port 5022
    ProxyCommand "/home/bjoerger/gravitational/teleport/build/tsh" proxy ssh -J leaf.example.com:5080 %r@agentless-node:%p

> ssh -v -p 22 agentless-node.leaf.example.com
// Works the first time without prompting you to accept an unknown host key
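To make the `ProxyCommand` in the workaround easier to read, the sketch below expands a small subset of ssh_config tokens (`%r`, `%h`, `%p`, `%%`) the way ssh would before running the command. The login `alice` is hypothetical, and the full tsh path from the config is shortened to `tsh`. The port is 22 because `-p 22` on the command line takes precedence over `Port 5022` in the config file.

```python
def expand_proxy_command(template: str, user: str, host: str, port: int) -> str:
    """Expand the %r, %h, %p and %% tokens (a subset of what ssh_config supports)."""
    tokens = {"r": user, "h": host, "p": str(port), "%": "%"}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in tokens:
            out.append(tokens[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# "alice" is a hypothetical login; host and port mirror the ssh invocation above.
print(expand_proxy_command(
    "tsh proxy ssh -J leaf.example.com:5080 %r@agentless-node:%p",
    user="alice",
    host="agentless-node.leaf.example.com",
    port=22,
))
```

Note that the workaround hard-codes `agentless-node` instead of using `%h`, so the name passed to tsh carries no cluster suffix.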

strideynet commented 6 months ago

The issue of the proxy connections presenting a certificate signed by the root rather than the leaf for an agentless node is still occurring as of test plan v15.

In addition, the wrong cluster name is used for the principals - leading to the Certificate invalid: name is not a listed principal error if you patch the known_hosts file:

debug1: Server host certificate: ssh-rsa-cert-v01@openssh.com SHA256:28AeviUZNXceNp6sSwAMI65utTJXkH+ngGdLDico5t4, serial 0 ID "" CA ssh-rsa SHA256:Vcx3pHuh1dhowW4ssS7hXLx7Lqc6sfg9NjmaL8wFo2Y valid after 2024-01-16T17:03:50
debug2: Server host certificate hostname: noah-v15-agentless.root.tele.ottr.sh
debug2: Server host certificate hostname: noah-v15-agentless
debug2: Server host certificate hostname: localhost
debug2: Server host certificate hostname: 127.0.0.1
debug2: Server host certificate hostname: ::1
debug2: Server host certificate hostname: f1a3daa0-3582-48fb-b30a-485cddfe45b2.leaf.tele.ottr.sh
debug2: Server host certificate hostname: 34.171.84.37
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts: No such file or directory
debug1: load_hostkeys: fopen /etc/ssh/ssh_known_hosts2: No such file or directory
debug1: Host 'noah-v15-agentless.leaf.tele.ottr.sh' is known and matches the RSA-CERT host certificate.
debug1: Found CA key in /Users/noah/.tsh/known_hosts:2
Certificate invalid: name is not a listed principal
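The debug output above lists every hostname in the host certificate, so the failure can be reproduced as a simple membership check. The sketch below is a simplification (exact, case-insensitive comparison); OpenSSH's real validation is more involved, and this only shows that the requested name is absent from the list.

```python
# Hostnames listed in the server host certificate, copied from the debug2 lines above.
principals = [
    "noah-v15-agentless.root.tele.ottr.sh",
    "noah-v15-agentless",
    "localhost",
    "127.0.0.1",
    "::1",
    "f1a3daa0-3582-48fb-b30a-485cddfe45b2.leaf.tele.ottr.sh",
    "34.171.84.37",
]


def is_listed_principal(hostname: str, cert_principals: list[str]) -> bool:
    """Simplified check: exact, case-insensitive match against the principals."""
    wanted = hostname.lower()
    return any(wanted == p.lower() for p in cert_principals)


print(is_listed_principal("noah-v15-agentless.leaf.tele.ottr.sh", principals))
print(is_listed_principal("noah-v15-agentless.root.tele.ottr.sh", principals))
```

The certificate carries the node's hostname with the root cluster suffix and the node's UUID with the leaf cluster suffix, but not the hostname with the leaf suffix that the client asked for.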

strideynet commented 6 months ago

Moving discussion of the cert issues to https://github.com/gravitational/teleport/issues/36801

Closing this issue as the originally reported problem is resolved (it was an issue with the port number).