gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

`tsh ssh` fails on some nodes with `ERROR: ssh: rejected: administratively prohibited (creating teleport system groups \ user: lookup groupname teleport-keep: connection refused)` #46277

Closed. gdubicki closed this issue 1 month ago

gdubicki commented 1 month ago

Expected behavior:

tsh ssh <node> should work.

Current behavior:

On some of our nodes it fails with:

$ tsh ssh <node>
ERROR: ssh: rejected: administratively prohibited (creating teleport system groups
    user: lookup groupname teleport-keep: connection refused)

Bug details:

Client and proxy:

$ tsh version
Teleport v16.2.0 git:v16.2.0-0-g68369de go1.22.6
Proxy version: 16.2.0
Proxy: <company>.teleport.sh:443

On the node:

root@node:~# teleport version
Teleport Enterprise v16.2.0 git:v16.2.0-0-g68369de go1.22.6

See above.

$ tsh ssh -d <node>
2024-09-05T13:55:21+02:00 INFO [CLIENT]    ALPN connection upgrade required for "company.teleport.sh:443": false. client/api.go:831
2024-09-05T13:55:21+02:00 INFO [CLIENT]    [KEY AGENT] Connected to the system agent: "/private/tmp/com.apple.launchd.s0ALyzgams/Listeners" client/api.go:4580
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Reading certificates from path "/Users/gdubicki/.tsh/keys/company.teleport.sh/gdubicki-ssh/company.teleport.sh-cert.pub". client/keystore.go:357
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 INFO [KEYAGENT]  Loading SSH key for user "gdubicki" and cluster "company.teleport.sh". client/keyagent.go:198
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Reading certificates from path "/Users/gdubicki/.tsh/keys/company.teleport.sh/gdubicki-ssh/company.teleport.sh-cert.pub". client/keystore.go:357
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Reading certificates from path "/Users/gdubicki/.tsh/keys/company.teleport.sh/gdubicki-db/company.teleport.sh". client/keystore.go:357
2024-09-05T13:55:21+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:21+02:00 DEBU [CLIENT]    Attempting to issue a single-use user certificate with an MFA check. client/cluster_client.go:301
2024-09-05T13:55:21+02:00 DEBU  attaching new resumable connection trace_id:d80a1b958c603921618875fc484340a6 span_id:aed30240b20ab7cc resumption/client.go:284
2024-09-05T13:55:22+02:00 DEBU [KEYSTORE]  Teleport TLS certificate valid until "2024-09-05 19:37:03 +0000 UTC". client/client_store.go:118
2024-09-05T13:55:22+02:00 DEBU [KEYAGENT]  "Checking key: ssh-rsa-cert-v01@openssh.com <redacted>\n." client/keyagent.go:370
2024-09-05T13:55:22+02:00 DEBU [KEYAGENT]  Validated host node:0@default@company.teleport.sh. client/keyagent.go:376
2024-09-05T13:55:22+02:00 DEBU [CLIENT]    MFA requirement from CreateAuthenticateChallenge, MFARequired=MFA_REQUIRED_NO client/cluster_client.go:593
2024-09-05T13:55:23+02:00 DEBU  handling new resumable connection error:[
ERROR REPORT:
Original Error: *trace.ConnectionProblemError failed to send on source: EOF
Stack Trace:
    github.com/gravitational/teleport/api@v0.0.0/utils/grpc/stream/stream.go:118 github.com/gravitational/teleport/api/utils/grpc/stream.(*ReadWriter).Write
    github.com/gravitational/teleport/lib/resumption/resumable.go:412 github.com/gravitational/teleport/lib/resumption.runResumeV1Write
    github.com/gravitational/teleport/lib/resumption/resumable.go:169 github.com/gravitational/teleport/lib/resumption.runResumeV1Unlocking.func5
    golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 golang.org/x/sync/errgroup.(*Group).Go.func1
    runtime/asm_arm64.s:1222 runtime.goexit
User Message: write loop
    writing frame header
        failed to send on source: EOF] trace_id:d80a1b958c603921618875fc484340a6 span_id:aed30240b20ab7cc resumption/client.go:292

ERROR REPORT:
Original Error: *ssh.OpenChannelError ssh: rejected: administratively prohibited (creating teleport system groups
    user: lookup groupname teleport-keep: connection refused)
Stack Trace:
    github.com/gravitational/teleport/api@v0.0.0/observability/tracing/ssh/client.go:278 github.com/gravitational/teleport/api/observability/tracing/ssh.(*clientWrapper).NewSession
    github.com/gravitational/teleport/api@v0.0.0/observability/tracing/ssh/client.go:207 github.com/gravitational/teleport/api/observability/tracing/ssh.(*Client).newSession
    github.com/gravitational/teleport/api@v0.0.0/observability/tracing/ssh/client.go:176 github.com/gravitational/teleport/api/observability/tracing/ssh.(*Client).NewSessionWithRequestCallback
    github.com/gravitational/teleport/lib/client/session.go:223 github.com/gravitational/teleport/lib/client.(*NodeSession).createServerSession
    github.com/gravitational/teleport/lib/client/session.go:305 github.com/gravitational/teleport/lib/client.(*NodeSession).interactiveSession
    github.com/gravitational/teleport/lib/client/session.go:522 github.com/gravitational/teleport/lib/client.(*NodeSession).runShell
    github.com/gravitational/teleport/lib/client/client.go:403 github.com/gravitational/teleport/lib/client.(*NodeClient).RunInteractiveShell
    github.com/gravitational/teleport/lib/client/api.go:1987 github.com/gravitational/teleport/lib/client.(*TeleportClient).runShellOrCommandOnSingleNode
    github.com/gravitational/teleport/lib/client/api.go:1718 github.com/gravitational/teleport/lib/client.(*TeleportClient).SSH
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3610 github.com/gravitational/teleport/tool/tsh/common.onSSH.func1.1
    github.com/gravitational/teleport/lib/client/api.go:597 github.com/gravitational/teleport/lib/client.RetryWithRelogin
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3603 github.com/gravitational/teleport/tool/tsh/common.onSSH.func1
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3415 github.com/gravitational/teleport/tool/tsh/common.retryWithAccessRequest
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3602 github.com/gravitational/teleport/tool/tsh/common.onSSH
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:1356 github.com/gravitational/teleport/tool/tsh/common.Run
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:609 github.com/gravitational/teleport/tool/tsh/common.Main
    github.com/gravitational/teleport/tool/tsh/main.go:26 main.main
    runtime/proc.go:271 runtime.main
    runtime/asm_arm64.s:1222 runtime.goexit
User Message: ssh: rejected: administratively prohibited (creating teleport system groups
    user: lookup groupname teleport-keep: connection refused)

Please note that we have not made any recent changes on this node, but it is running Chef, which applies some automated updates. Teleport itself is also automatically updated there.

gdubicki commented 1 month ago

Some Teleport logs from that node (I was able to ssh to it differently):

Sep 05 11:46:52 node teleport[2462368]: 2024-09-05T11:46:52Z INFO             Successfully synced "unit" upgrader maintenance window value. upgradewindow/upgradewindow.go:296
Sep 05 11:47:58 node teleport[2462368]: 2024-09-05T11:47:58Z WARN [FILE]      Skipping upload 35a45f6a-9f29-4c28-8198-10463a91bfa4, missing subdirectory. filesessions/filestream.go:313
Sep 05 11:52:55 node teleport[2462368]: 2024-09-05T11:52:55Z WARN [FILE]      Skipping upload 35a45f6a-9f29-4c28-8198-10463a91bfa4, missing subdirectory. filesessions/filestream.go:313
Sep 05 11:53:15 node teleport[2462368]: 2024-09-05T11:53:15Z INFO  handling new resumable SSH connection resumption/server_exchange.go:92
Sep 05 11:53:15 node teleport[2462368]: 2024-09-05T11:53:15Z INFO  handing resumable connection to the SSH server resumption/server_exchange.go:136
Sep 05 11:53:16 node teleport[2462368]: 2024-09-05T11:53:16Z WARN [SSH:NODE]  "Dropping inbound ssh connection due to error: creating teleport system groups\n\tuser: lookup groupname teleport-keep: connection refused" sshutils/server.go:580
Sep 05 11:53:16 node teleport[2462368]: 2024-09-05T11:53:16Z INFO  resumable connection completed resumption/server_exchange.go:138
Sep 05 11:55:21 node teleport[2462368]: 2024-09-05T11:55:21Z INFO  handling new resumable SSH connection resumption/server_exchange.go:92
Sep 05 11:55:21 node teleport[2462368]: 2024-09-05T11:55:21Z INFO  handing resumable connection to the SSH server resumption/server_exchange.go:136
Sep 05 11:55:22 node teleport[2462368]: 2024-09-05T11:55:22Z WARN [SSH:NODE]  "Dropping inbound ssh connection due to error: creating teleport system groups\n\tuser: lookup groupname teleport-keep: connection refused" sshutils/server.go:580
Sep 05 11:55:23 node teleport[2462368]: 2024-09-05T11:55:23Z INFO  resumable connection completed resumption/server_exchange.go:138
Sep 05 11:57:52 node teleport[2462368]: 2024-09-05T11:57:52Z WARN [FILE]      Skipping upload 35a45f6a-9f29-4c28-8198-10463a91bfa4, missing subdirectory. filesessions/filestream.go:313
Sep 05 12:02:21 node teleport[2462368]: 2024-09-05T12:02:21Z WARN [FILE]      Skipping upload 35a45f6a-9f29-4c28-8198-10463a91bfa4, missing subdirectory. filesessions/filestream.go:313
Sep 05 12:03:35 node teleport[2462368]: 2024-09-05T12:03:35Z WARN  Access denied to instance labels, does the instance have compute.instances.get permission? gcp/imds.go:201
Sep 05 12:03:36 node teleport[2462368]: 2024-09-05T12:03:36Z WARN  Access denied to resource management tags, does the instance have compute.instances.listEffectiveTags permission? gcp/imds.go:210
Sep 05 12:03:36 node teleport[2462368]: 2024-09-05T12:03:36Z WARN  Access denied to instance labels, does the instance have compute.instances.get permission? gcp/imds.go:201
Sep 05 12:03:36 node teleport[2462368]: 2024-09-05T12:03:36Z WARN  Access denied to resource management tags, does the instance have compute.instances.listEffectiveTags permission? gcp/imds.go:210

gdubicki commented 1 month ago

A restart of the Teleport service on the node did not help.

gdubicki commented 1 month ago

The Teleport config (/etc/teleport.yaml) on the affected node:

version: v3
teleport:
  nodename: <node-name>
  data_dir: /var/lib/teleport
  join_params:
    token_name: /var/lib/teleport/tokens/join_token.file
    method: token
  proxy_server: <company>.teleport.sh:443
  log:
    output: stderr
    severity: INFO
    format:
      output: text
  ca_pin: ""
  diag_addr: ""
auth_service:
  enabled: "no"
ssh_service:
  enabled: "yes"
  labels:
    cloud: gcp
  pam:
    enabled: "yes"
proxy_service:
  enabled: "no"
  https_keypairs: []
  https_keypairs_reload_interval: 0s
  acme: {}

eriktate commented 1 month ago

Hi @gdubicki! :wave: Could you share some more details about your environment? The error message suggests there may be an external group database in the mix, is that true?

gdubicki commented 1 month ago

Sure, @eriktate! The node is Ubuntu 22.04 LTS but it doesn't have any external group database. Our nodes are managed with Chef.

This problem has not appeared on other nodes that have the same OS and a very similar setup.

gdubicki commented 1 month ago

As @programmerq has suggested (thanks!) in the support ticket that I created in parallel, after enabling DEBUG logging we can see this:

Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z DEBU [NODE]      Checking permissions for (gdubicki,gdubicki) to login to node with RBAC checks. srv/authhandlers.go:621
Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z DEBU [SSH:NODE]  Incoming connection <...>:60227 -> 10.138.0.13:59932 version: SSH-2.0-Go, certtype: "user" sshutils/server.go:553
Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z DEBU             "/usr/sbin/groupadd output: groupadd: group 'teleport-system' already exists\n" host/hostusers.go:55
Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z DEBU             "Error creating user gdubicki: creating teleport system groups\n\tuser: lookup groupname teleport-keep: connection refused" srv/sess.go:293
Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z WARN [SSH:NODE]  "Dropping inbound ssh connection due to error: creating teleport system groups\n\tuser: lookup groupname teleport-keep: connection refused" sshutils/server.go:580
Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z INFO  resumable connection completed resumption/server_exchange.go:138
Sep 09 15:15:18 node teleport[682248]: 2024-09-09T15:15:18Z DEBU  handling new resumable connection error:[
Sep 09 15:15:18 node teleport[682248]: ERROR REPORT:
Sep 09 15:15:18 node teleport[682248]: Original Error: poll.errNetClosing use of closed network connection
Sep 09 15:15:18 node teleport[682248]: Stack Trace:
Sep 09 15:15:18 node teleport[682248]:         github.com/gravitational/teleport/lib/resumption/resumable.go:395 github.com/gravitational/teleport/lib/resumption.runResumeV1Write
Sep 09 15:15:18 node teleport[682248]:         github.com/gravitational/teleport/lib/resumption/resumable.go:169 github.com/gravitational/teleport/lib/resumption.runResumeV1Unlocking.func5
Sep 09 15:15:18 node teleport[682248]:         golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78 golang.org/x/sync/errgroup.(*Group).Go.func1
Sep 09 15:15:18 node teleport[682248]:         runtime/asm_amd64.s:1695 runtime.goexit
Sep 09 15:15:18 node teleport[682248]: User Message: write loop
Sep 09 15:15:18 node teleport[682248]:         connection closed
Sep 09 15:15:18 node teleport[682248]:                 use of closed network connection] resumption/server_exchange.go:144

When I check if the group exists, it in fact does:

root@node:~# getent group teleport-system
teleport-system:x:1043:
root@node:~# echo $?
0

eriktate commented 1 month ago

Interesting 🤔 The group and user lookups during host user creation defer to getgrnam_r and getpwnam_r, which should be the exact same functions supporting getent group teleport-system. @gdubicki would you be able to share this host's /etc/nsswitch.conf? The network connection errors are still confusing me, as I would expect a different error if there was an issue reading from your local group and/or passwd database.
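
A minimal sketch for checking this by hand, assuming a glibc-based Linux host with `getent` and `awk` available. The helper names `check_group` and `nss_sources` are hypothetical and not part of Teleport. Note that the error above names `teleport-keep`, while the earlier check queried `teleport-system`, so it may be worth looking up both. A plain `getent` failure may not reproduce the exact "connection refused" error that Teleport reports.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting NSS group resolution on a host.

# Report whether a group name resolves through NSS, using getent.
check_group() {
    status=0
    getent group "$1" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        echo "$1: found"
    else
        # glibc's getent exits with 2 when the key is not in the database.
        echo "$1: lookup failed (getent exit status $status)"
    fi
}

# Print the sources configured for one NSS database, e.g. "group".
# $1 = database name, $2 = config path (defaults to /etc/nsswitch.conf).
nss_sources() {
    awk -v db="$1:" '$1 == db { $1 = ""; sub(/^ +/, ""); print }' "${2:-/etc/nsswitch.conf}"
}
```

Running `check_group teleport-keep`, `check_group teleport-system`, and `nss_sources group` on an affected node and on a healthy one gives a quick side-by-side of what each host resolves and which sources it consults.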

gdubicki commented 1 month ago

Hey @eriktate! I am for now working on this with your support team, but I will share the results when we are done.

steve-nscale commented 1 month ago

This also affects v15.4.18. We recently added a new host and ran into a similar problem. Our error message is:

ERROR REPORT:
Original Error: *ssh.OpenChannelError ssh: rejected: administratively prohibited (user: unknown user srelf)
Stack Trace:
    github.com/gravitational/teleport/api@v0.0.0/observability/tracing/ssh/client.go:236 github.com/gravitational/teleport/api/observability/tracing/ssh.(*clientWrapper).NewSession
    github.com/gravitational/teleport/api@v0.0.0/observability/tracing/ssh/client.go:200 github.com/gravitational/teleport/api/observability/tracing/ssh.(*Client).NewSession
    github.com/gravitational/teleport/lib/client/session.go:219 github.com/gravitational/teleport/lib/client.(*NodeSession).createServerSession
    github.com/gravitational/teleport/lib/client/session.go:301 github.com/gravitational/teleport/lib/client.(*NodeSession).interactiveSession
    github.com/gravitational/teleport/lib/client/session.go:518 github.com/gravitational/teleport/lib/client.(*NodeSession).runShell
    github.com/gravitational/teleport/lib/client/client.go:1592 github.com/gravitational/teleport/lib/client.(*NodeClient).RunInteractiveShell
    github.com/gravitational/teleport/lib/client/api.go:1919 github.com/gravitational/teleport/lib/client.(*TeleportClient).runShellOrCommandOnSingleNode
    github.com/gravitational/teleport/lib/client/api.go:1642 github.com/gravitational/teleport/lib/client.(*TeleportClient).SSH
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3481 github.com/gravitational/teleport/tool/tsh/common.onSSH.func1.1
    github.com/gravitational/teleport/lib/client/api.go:595 github.com/gravitational/teleport/lib/client.RetryWithRelogin
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3480 github.com/gravitational/teleport/tool/tsh/common.onSSH.func1
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3318 github.com/gravitational/teleport/tool/tsh/common.retryWithAccessRequest
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:3479 github.com/gravitational/teleport/tool/tsh/common.onSSH
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:1325 github.com/gravitational/teleport/tool/tsh/common.Run
    github.com/gravitational/teleport/tool/tsh/common/tsh.go:593 github.com/gravitational/teleport/tool/tsh/common.Main
    github.com/gravitational/teleport/tool/tsh/main.go:26 main.main
    runtime/proc.go:267 runtime.main
    runtime/asm_arm64.s:1197 runtime.goexit
User Message: ssh: rejected: administratively prohibited (user: unknown user srelf)

Our main Teleport server, which runs auth etc., is running Teleport v15.4.9.

Downgrading to v15.4.14 fixes the problem, so it's an issue introduced between v15.4.14 and v15.4.18.

Rgds Steve.

TeleLos commented 1 month ago

Observed with settings:

# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         compat systemd
group:          compat systemd
shadow:         compat
gshadow:        files

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

gdubicki commented 1 month ago

Quoting a solution from a Teleport engineer that we got via the support ticket:

  1. If you don't need or don't want the NIS integration, you can change compat to files in your /etc/nsswitch.conf for user and group.

As we are not using NIS, we did that and it resolved our problem.
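
A hedged sketch of that change, assuming "user and group" refers to the `passwd` and `group` lines of `/etc/nsswitch.conf`. The helper name `compat_to_files` is hypothetical. It only prints a modified copy and never edits the file in place, so the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of an nsswitch.conf in which "compat"
# is replaced by "files" on the passwd and group lines only.
# $1 = config path (defaults to /etc/nsswitch.conf).
compat_to_files() {
    sed -E '/^(passwd|group):/ s/([[:space:]])compat([[:space:]]|$)/\1files\2/' \
        "${1:-/etc/nsswitch.conf}"
}
```

One way to use it is to write the output to a temporary file, for example `compat_to_files > /tmp/nsswitch.conf.new`, compare it with the original using `diff`, and keep a backup of the original before installing the new file. Other lines, such as `shadow`, are left unchanged because the quoted advice covers only user and group.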