gardener-attic / gardenctl

Command-line client for the Gardener.

gardenctl v2 - SSH #510

Closed petersutter closed 3 years ago

petersutter commented 3 years ago

Motivation

gardenctl (v1) can set up SSH sessions to the targeted shoot cluster. For this, infrastructure resources such as VMs and firewall rules have to be created. gardenctl cleans up these resources after the SSH session. However, there were issues in the past where these infrastructure resources were not cleaned up properly, for example because an error occurred and the cleanup was not retried. Hence the proposal to have a dedicated controller (for each infrastructure) that manages the infrastructure resources. gardenctl also re-used the SSH node credentials for the bastion host. Instead, a new temporary SSH key-pair should be created for the bastion host. The static shoot-specific SSH key-pair should be rotated regularly, for example once in the maintenance time window.

In a previous proposal it was suggested that an extension, running on the seed cluster, watches Bastion custom resources on the garden cluster, acts upon them and updates their status accordingly. However, changes to a Bastion resource should only be allowed for controllers on the seed that is responsible for it. This cannot be restricted when using custom resources. The proposal, as outlined below, suggests implementing the necessary changes in the gardener core components and adapting the SeedAuthorizer to consider Bastion resources that the GAPI serves.

Goals

Non-Goals

Proposal

Involved Components

The following is a list of involved components, that either need to be newly introduced or extended if already existing

SSH Flow

  1. Users should only get the RBAC permission to create / update Bastion resources for a namespace, if they should be allowed to SSH onto the shoot nodes in this namespace.
  2. User/gardenctlv2 creates Bastion Resource in garden cluster (see resource example below)
    • First, gardenctl would figure out the external IP, either by calling an external service (gardenctl (v1) uses https://github.com/gardener/gardenctl/blob/master/pkg/cmd/miscellaneous.go#L226) or by calling a binary that prints the external IP(s) to stdout. The binary should be configurable. The result is set under spec.clientIP
    • The public PGP key of the user is set under spec.publicKey. The key that should be used needs to be configured beforehand by the user
    • The targeted shoot is set under spec.shootRef
  3. Admission Control for the Bastion resource under api group operations.gardener.cloud in the garden cluster
    • Mutating Webhook
      • according to shootRef, sets the spec.seedName
      • according to shootRef, sets the spec.providerType
      • on creation, sets metadata.annotations["operations.gardener.cloud/created-by"] according to the user that created the resource
    • Validating Webhook for the Bastion resource
      • For security reasons, it validates that only the user who created the resource can update the spec. Otherwise another user could, for example, sneak in their own IP for the bastion firewall rule and their own public PGP key in order to decrypt the private SSH key
  4. gardenlet
    • Watches Bastion resource for own seed under api group operations.gardener.cloud in the garden cluster
    • Creates Bastion custom resource under api group extensions.gardener.cloud/v1alpha1 in the seed cluster
  5. Gardener extension provider / Bastion Controller on Seed:
    • With own Bastion Custom Resource Definition in the seed under the api group extensions.gardener.cloud/v1alpha1
    • Watches Bastion custom resources that are created by the gardenlet in the seed
    • Creates SSH key-pair in memory. Stores the secret key encrypted under status.id_rsa.enc, using spec.publicKey. Stores the public key under status.id_rsa.pub
    • Controller reads cloudprovider credentials from seed-shoot namespace
    • Deploy infrastructure resources
    • Updates status of Bastion resource:
      • With bastion IP under status.bastionIP
      • Sets status.state to Ready on resource so that the client knows when to initiate the SSH connection
  6. gardenlet
    • Once the Bastion resource is in ready state, it syncs back the state to the garden cluster
  7. gardenctl
    • initiates SSH session
      • reads status["id_rsa.enc"], decrypts it with the user's private PGP key
      • reads bastion IP from status.bastionIP
      • reads the private key from the SSH key-pair for the shoot node
      • opens SSH to the bastion and from there to the respective shoot node
    • runs heartbeat in parallel as long as the SSH session is open by annotating the Bastion resource with operations.gardener.cloud/operation: keepalive
  8. GCM:
    • Once status.expirationDate is reached, the Bastion will be marked for deletion
  9. gardenlet:
    • Once the Bastion resource in the garden cluster is marked for deletion, it marks the Bastion resource in the seed for deletion.
  10. Gardener extension provider / Bastion Controller on Seed:
    • all created resources will be cleaned up
    • On success, removes finalizer on Bastion resource in seed
  11. gardenlet:
    • removes finalizer on Bastion resource in garden cluster
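
The check performed by the validating webhook in step 3 could look roughly like the following. This is only a minimal sketch and not the actual admission code: the function name `validate_bastion_update` and the representation of resources as plain dicts are hypothetical, and treating the created-by annotation as immutable after creation is an assumption that the flow above does not spell out.

```python
CREATED_BY = "operations.gardener.cloud/created-by"


def _creator(bastion: dict):
    """Read the created-by annotation, tolerating missing metadata."""
    return bastion.get("metadata", {}).get("annotations", {}).get(CREATED_BY)


def validate_bastion_update(old: dict, new: dict, requesting_user: str) -> list:
    """Return validation errors; an empty list means the update is allowed."""
    errors = []
    creator = _creator(old)

    # Assumption: the annotation is written once by the mutating webhook on
    # creation and may not be changed afterwards. Otherwise the ownership
    # check below could be bypassed by rewriting the annotation first.
    if _creator(new) != creator:
        errors.append("created-by annotation must not be changed")

    # Only the creator may change the spec (clientIP, publicKey, shootRef).
    if old.get("spec") != new.get("spec") and requesting_user != creator:
        errors.append(
            f"user {requesting_user!r} must not update the spec of a "
            f"Bastion created by {creator!r}"
        )

    return errors
```

With such a check, a keepalive that only touches annotations would still pass for other users, while an attempt by a different user to change spec.clientIP or spec.publicKey would be rejected.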

Example Bastion resource in the garden cluster

apiVersion: operations.gardener.cloud/v1alpha1
kind: Bastion
metadata:
  generateName: cli-
  name: cli-abcdef
  namespace: garden-myproject
  annotations:
    operations.gardener.cloud/created-by: foo # set by the mutating webhook
    operations.gardener.cloud/last-heartbeat-at: "2021-03-19T11:58:00Z"
    # operations.gardener.cloud/operation: keepalive # this annotation is removed by the mutating webhook and the last-heartbeat timestamp and/or the status.expirationDate will be updated accordingly
spec:
  shootRef: # namespace cannot be set / it's the same as .metadata.namespace
    name: my-cluster

  # seedName: aws-eu2 # is set by the mutating webhook
  # providerType: aws # is set by the mutating webhook

  publicKey: LS0tLS1CRUdJTiBQR1AgUFVCTElDIEtFWSBCTE9DSy0tLS0tCi4uLgotLS0tLUVORCBQR1AgUFVCTElDIEtFWSBCTE9DSy0tLS0tCg== # user's PGP public key.

  clientIP: # external ip of the user
    ipv4: 1.2.3.4
    # ipv6: ::1

status:
  # the following fields are managed by the controller in the seed and synced by gardenlet
  bastionIP: 1.2.3.5
  state: Ready
  id_rsa.enc: LS0tLS1CRUdJTiBQR1AgTUVTU0FHRS0tLS0tCi4uLgotLS0tLUVORCBQR1AgTUVTU0FHRS0tLS0tCg== # ssh private key, encrypted with spec.publicKey
  id_rsa.pub: c3NoLXJzYSAuLi4K

  # the following fields are only set by the mutating webhook
  expirationDate: "2021-03-19T12:58:00Z" # extended on each keepalive
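
The keepalive handling described in the comments of this example could be sketched as follows. Again this is a rough illustration under assumptions rather than the real webhook: the names `apply_keepalive` and `is_expired` are hypothetical, resources are plain dicts, and the one hour lifetime is only inferred from the example timestamps (heartbeat at 11:58, expiration at 12:58).

```python
from datetime import datetime, timedelta, timezone

OPERATION = "operations.gardener.cloud/operation"
LAST_HEARTBEAT = "operations.gardener.cloud/last-heartbeat-at"
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def apply_keepalive(bastion: dict, now: datetime,
                    ttl: timedelta = timedelta(hours=1)) -> dict:
    """Mutating-webhook step: consume the keepalive annotation.

    `now` must be timezone-aware. The default `ttl` is an assumption.
    """
    annotations = bastion.setdefault("metadata", {}).setdefault("annotations", {})
    if annotations.get(OPERATION) != "keepalive":
        return bastion  # nothing to do

    now = now.astimezone(timezone.utc)
    # Remove the one-shot operation annotation and record the heartbeat.
    del annotations[OPERATION]
    annotations[LAST_HEARTBEAT] = now.strftime(TIME_FORMAT)
    # Extend the lifetime of the Bastion.
    status = bastion.setdefault("status", {})
    status["expirationDate"] = (now + ttl).strftime(TIME_FORMAT)
    return bastion


def is_expired(bastion: dict, now: datetime) -> bool:
    """GCM-side check: has status.expirationDate been reached?"""
    expiration = datetime.strptime(
        bastion["status"]["expirationDate"], TIME_FORMAT
    ).replace(tzinfo=timezone.utc)
    return now >= expiration
```

gardenctl would then only have to set the annotation periodically while the SSH session is open. Once the heartbeats stop, the expiration date is eventually reached and the Bastion gets marked for deletion.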

Bastion custom resource in the seed cluster

apiVersion: extensions.gardener.cloud/v1alpha1
kind: Bastion
metadata:
  name: cli-abcdef
  namespace: shoot--myproject--my-cluster
  annotations:
    operations.gardener.cloud/created-by: foo # set by the mutating webhook
    operations.gardener.cloud/last-heartbeat-at: "2021-03-19T11:58:00Z"
spec:
  publicKey: LS0tLS1CRUdJTiBQR1AgUFVCTElDIEtFWSBCTE9DSy0tLS0tCi4uLgotLS0tLUVORCBQR1AgUFVCTElDIEtFWSBCTE9DSy0tLS0tCg== # user's PGP public key.

  clientIP: # external ip of the user
    ipv4: 1.2.3.4
    # ipv6: ::1

status:
  bastionIP: 1.2.3.5
  state: Ready
  id_rsa.enc: LS0tLS1CRUdJTiBQR1AgTUVTU0FHRS0tLS0tCi4uLgotLS0tLUVORCBQR1AgTUVTU0FHRS0tLS0tCg== # ssh private key, encrypted with spec.publicKey
  id_rsa.pub: c3NoLXJzYSAuLi4K

  expirationDate: "2021-03-19T12:58:00Z"
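
To illustrate how the two resources relate, here is a small sketch of what the gardenlet would do in steps 4 and 6: derive the seed resource from the garden resource and sync the controller-managed status fields back. The function names, the dict representation and the `seed_namespace` parameter are hypothetical. How the gardenlet determines the seed namespace is not covered here.

```python
import copy

# Status fields managed by the controller in the seed (see the example above).
# expirationDate is left out because it is owned by the mutating webhook
# in the garden cluster.
SYNCED_STATUS_FIELDS = ("bastionIP", "state", "id_rsa.enc", "id_rsa.pub")


def to_seed_bastion(garden_bastion: dict, seed_namespace: str) -> dict:
    """Build the extensions.gardener.cloud Bastion for the seed cluster."""
    meta = garden_bastion["metadata"]
    spec = garden_bastion["spec"]
    return {
        "apiVersion": "extensions.gardener.cloud/v1alpha1",
        "kind": "Bastion",
        "metadata": {
            "name": meta["name"],
            "namespace": seed_namespace,
            "annotations": copy.deepcopy(meta.get("annotations", {})),
        },
        # shootRef, seedName and providerType are not part of the seed spec
        # in the example above, so only these two fields are copied.
        "spec": {
            "publicKey": spec["publicKey"],
            "clientIP": copy.deepcopy(spec["clientIP"]),
        },
    }


def sync_status_to_garden(seed_bastion: dict, garden_bastion: dict) -> dict:
    """Copy the controller-managed status fields back to the garden resource."""
    seed_status = seed_bastion.get("status", {})
    garden_status = garden_bastion.setdefault("status", {})
    for field in SYNCED_STATUS_FIELDS:
        if field in seed_status:
            garden_status[field] = seed_status[field]
    return garden_bastion
```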

SSH Key-Pair Rotation

Currently, the SSH key-pair for the shoot nodes is created once during shoot cluster creation. These key-pairs should be rotated on a regular basis.

Proposal

petersutter commented 3 years ago

as just discussed with @rfranzke it rather makes sense to open a GEP for this. I will prepare it and then close this issue

petersutter commented 3 years ago

GEP-15 https://github.com/gardener/gardener/pull/3802