projectcapsule / capsule

Multi-tenancy and policy-based framework for Kubernetes.
https://projectcapsule.dev/
Apache License 2.0

Document how Capsule integrates in a Flux GitOps based environment #528

Closed bsctl closed 2 years ago

bsctl commented 2 years ago

Describe the feature

Document how Capsule integrates in a Flux GitOps based environment.

What would the new user story look like?

The cluster admin can learn how to configure a Flux GitOps based environment with Capsule

Expected behaviour

A detailed user guide is provided in the documentation.

oliverbaehler commented 2 years ago

Comment for Assignment

bsctl commented 2 years ago

Thanks @oliverbaehler really appreciate.

maxgio92 commented 2 years ago

Hey @bsctl I was working exactly on the same use case here. And I then came across the issue @oliverbaehler just discussed in #582.

My PoC here shows the patch and list verb permissions the GitOps reconciler would need.

Tenant self-service reconciliation

TL;DR: The tenant owner also needs permission to patch (and list) cluster-scoped Kubernetes resources/objects. The Flux kustomize-controller will in the end kubectl apply the desired state and, as in the PoC, it will do so impersonating the Tenant Owner user.

More details on #582 - I recommend reading it.

Enter Capsule Proxy

I see that this use case is a perfect fit for Capsule Proxy. This way, tenant resource control and isolation would happen one step earlier in the request process, as the request would be proxied instead of being validated.

Then, Capsule Proxy would provide the tenant-scoped view of cluster-scoped resources to the Tenant Owner. From this point on, the Tenant Owner would be safe to list and patch Namespaces, as they could do it only on their own ones.

Also, no further complexity would be introduced.

Flux + Capsule + Capsule Proxy

In a nutshell, in order to reconcile the desired state of the tenant - which would ideally be declared and versioned on Git by the tenant owner itself - the Kustomize reconciler would impersonate the Tenant Owner SA user and communicate with the Capsule Proxy, so that it could also operate on cluster-scoped resources (e.g. Namespace-as-a-Service) - and nothing more!

The tenant can configure it through the Kustomization spec.kubeConfig field.

For example, a Tenant:

apiVersion: capsule.clastix.io/v1beta1
kind: Tenant
metadata:
  name: dev-team
spec:
...
  owners:
  - name: system:serviceaccount:dev-team:gitops-reconciler
    kind: ServiceAccount

would declare its Namespaces, through some Kustomization like this:

apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: namespaces
  namespace: dev-team
spec:
  kubeConfig:
    secretRef:
      name: capsule-proxy
      key: kubeconfig
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: dev-team
  path: ./staging/namespaces
  prune: true

where the capsule-proxy Secret kubeconfig would contain the tenant owner SA (dev-team:gitops-reconciler) token and Capsule proxy svc endpoint and CA certificate.
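Such a kubeconfig could look roughly like the following (a sketch only: the Capsule Proxy service address and port, the CA data, and the token are placeholders standing in for the actual cluster values):

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: capsule-proxy
    cluster:
      # Assumed in-cluster Capsule Proxy service endpoint (placeholder)
      server: https://capsule-proxy.capsule-system.svc:9001
      # Base64-encoded CA certificate of Capsule Proxy (placeholder)
      certificate-authority-data: <capsule-proxy-ca-base64>
users:
  - name: gitops-reconciler
    user:
      # Token of the dev-team:gitops-reconciler ServiceAccount (placeholder)
      token: <tenant-owner-serviceaccount-token>
contexts:
  - name: capsule-proxy
    context:
      cluster: capsule-proxy
      user: gitops-reconciler
current-context: capsule-proxy
```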

The missing piece

As of now, the missing piece in this idea is keeping the kubeConfig Secret automatically in place alongside the Tenant Owner ServiceAccount, and up to date with its token, as the kubeConfig expects a secretRef field.


WIP

/cc @prometherion

oliverbaehler commented 2 years ago

Yeah, that's what our current setup looks like. We have shell-operator in place, which dumps kubeconfigs from ServiceAccounts into Secrets that access Capsule Proxy's internal URL. We could add something like that to Capsule Proxy.

You will still need to work around #582, since you introduce a huge security hole by just allowing patch privileges.

maxgio92 commented 2 years ago

TL;DR what is missing in this scenario is:

  1. a controller to ensure and update Tenant Owners' kubeconfig Secrets
  2. (in order to disable --default-service-account) a controller to ensure the SA field on Flux Kustomizations that don't have spec.kubeConfig set, i.e. when not using Capsule Proxy - e.g. it can be done with Kyverno policies
  3. #582, for which PR #584 was opened

  4. (optional) to enforce tenant (Kustomization) reconciliation through Capsule Proxy (ensure Kustomization.spec.kubeConfig)
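Point 2 could be sketched as a Kyverno mutating policy along these lines (an illustration only: the policy name and the gitops-reconciler ServiceAccount name are assumptions, and the `+()` anchor only sets the field when it is absent):

```yaml
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  # Hypothetical policy name
  name: flux-kustomization-default-sa
spec:
  validationFailureAction: enforce
  background: false
  rules:
    - name: set-default-service-account
      match:
        all:
          - resources:
              kinds:
                - Kustomization
      preconditions:
        all:
          # Only mutate when the tenant is not going through Capsule Proxy,
          # i.e. no spec.kubeConfig is set
          - key: "{{ request.object.spec.kubeConfig || '' }}"
            operator: Equals
            value: ""
      mutate:
        patchStrategicMerge:
          spec:
            # Assumed tenant owner SA name; does not overwrite if already set
            +(serviceAccountName): gitops-reconciler
```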

About Flux

In the end we should also be able to apply Flux's multi-tenancy lockdown features:

except for:

oliverbaehler commented 2 years ago

Hi @maxgio92, I have tried the approach of allowing Flux within tenants, meaning that cross-references between namespaces in the same tenant, or to namespaces marked as public, are allowed. But we don't have the time to maintain such policies over time and we are also moving towards Argo, so I guess we won't need them anymore (they weren't used in production yet). Maybe it's something useful. See the following policies:

helmrelease.policy.yaml

---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: flux-helmrelease-cross-reference
  annotations:
    policies.kyverno.io/title: Flux Helmrelease Cross Reference
    policies.kyverno.io/category: Flux
    policies.kyverno.io/subject: HelmRelease
    policies.kyverno.io/description: >-
      Disallows cross namespaces references of HelmRelease Resources.
spec:
  validationFailureAction: enforce
  background: false
  rules:

    # Defaults all namespace attributes to the namespace the HelmRelease is installed into.
    # Does not overwrite if set
    - name: helmrelease-default-namespaces
      preconditions:
        any:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]
      match:
        all: 
          - resources:
              kinds:
                - HelmRelease
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      mutate:
        patchStrategicMerge:
          spec:
            +(targetNamespace): "{{request.object.metadata.namespace}}"
            +(storageNamespace): "{{request.object.metadata.namespace}}"
            +(chart):
              +(spec):
                +(sourceRef):
                  +(namespace): "{{request.object.metadata.namespace}}"

    # Disallow Source References 
    # Unless in Public Namespace or same Tenant
    - name: helmrelease-source-cross-reference
      context:

        # Load Global Configuration
        - name: global
          configMap:
            name: kyverno-global-config
            namespace: kyverno-system

        # Get All Public Namespaces
        - name: public_namespaces
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.labels.\"{{global.data.public_identifier_label}}\" == '{{global.data.public_identifier_value}}'].metadata.name | []" 

        # Get Tenant information from source namespace
        # Defaults to a character, which can't be a label value
        - name: source_tenant
          apiCall:
            urlPath: "/api/v1/namespaces/{{request.object.metadata.namespace}}"
            jmesPath: "metadata.labels.\"{{global.data.tenant_identifier_label}}\" | '?'"

        # Get Tenant information from destination namespace
        # Returns Array with Tenant Name or Empty
        - name: destination_tenant
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.name == '{{request.object.spec.chart.spec.sourceRef.namespace}}'].metadata.labels.\"{{global.data.tenant_identifier_label}}\""

      preconditions:
        all:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]
        any: 

          # Source is not Self-Reference  
          - key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
            operator: NotEquals
            value: "{{request.object.metadata.namespace}}"

          # Source not in Public Namespaces
          - key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
            operator: NotIn
            value: "{{public_namespaces}}"

          # Source not in Destination
          - key: "{{request.object.spec.chart.spec.sourceRef.namespace}}"
            operator: NotIn
            value: "{{destination_tenant}}"

      match:
        all: 
          - resources:
              kinds:
                - HelmRelease
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      validate:
        message: "Cannot use namespace {{request.object.spec.chart.spec.sourceRef.namespace}} as a source reference!"
        deny: {}

kustomization.policy.yaml

---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: flux-kustomization-cross-reference
  annotations:
    policies.kyverno.io/title: Flux Kustomization Cross Reference
    policies.kyverno.io/category: Flux
    policies.kyverno.io/subject: Kustomization
    policies.kyverno.io/description: >-
      Disallows cross namespaces references of Kustomization Resources
spec:
  validationFailureAction: enforce
  background: false
  rules:
    - name: flux-kustomization-defaults
      preconditions:
        any:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]
      match:
        all: 
          - resources:
              kinds:
                - Kustomization
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      mutate:
        patchStrategicMerge:
          spec:
            +(targetNamespace): "{{request.object.metadata.namespace}}"
            +(sourceRef):
              +(namespace): "{{request.object.metadata.namespace}}"

    # Disallow Source References 
    # Unless in Public Namespace or same Tenant
    - name: kustomization-source-cross-reference
      context:

        # Load Global Configuration
        - name: global
          configMap:
            name: kyverno-global-config
            namespace: kyverno-system

        # Get All Public Namespaces
        - name: public_namespaces
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.labels.\"{{global.data.public_identifier_label}}\" == '{{global.data.public_identifier_value}}'].metadata.name | []" 

        # Get Tenant information from source namespace
        # Defaults to a character, which can't be a label value
        - name: source_tenant
          apiCall:
            urlPath: "/api/v1/namespaces/{{request.object.metadata.namespace}}"
            jmesPath: "metadata.labels.\"{{global.data.tenant_identifier_label}}\" | '?'"

        # Get Tenant information from destination namespace
        # Returns Array with Tenant Name or Empty
        - name: destination_tenant
          apiCall:
            urlPath: "/api/v1/namespaces"
            jmesPath: "items[?metadata.name == '{{request.object.spec.sourceRef.namespace}}'].metadata.labels.\"{{global.data.tenant_identifier_label}}\""

      preconditions:
        all:
          - key: "{{ request.operation }}"
            operator: In
            value: [ "CREATE", "UPDATE" ]

          - key: "{{request.object.spec.sourceRef.namespace}}"
            operator: NotIn
            value: "{{public_namespaces}}"

          - key: "{{request.object.spec.targetNamespace}}"
            operator: NotIn
            value: "{{destination_tenant}}"

      match:
        all: 
          - resources:
              kinds:
                - Kustomization
      exclude:
        any:
          - clusterRoles:
            - cluster-admin
      validate:
        message: "Cannot use namespace as a source reference, namespace must be public or within the tenant!"
        deny: {}

global.config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: kyverno-global-config
  namespace: kyverno-system
data:
  #
  ## Flux Configurations
  #

  public_identifier_label: "company.com/public"
  public_identifier_value: "true"

  #
  ## Tenant Configurations
  #

  ## tenant_identifier_label
  #    Label which is used to select the tenant name
  tenant_identifier_label: "capsule.clastix.io/tenant"

maxgio92 commented 2 years ago

Thank you @oliverbaehler !

prometherion commented 2 years ago

2. (in order to disable --default-service-account) a controller to ensure the SA field on Flux Kustomizations that don't have spec.kubeConfig set, i.e. when not using Capsule Proxy - e.g. it can be done with Kyverno policies

Could we make this optional right now and use the Kyverno policies shared by @oliverbaehler so we can close this with the docs update?

oliverbaehler commented 2 years ago

@prometherion If we want to propose these, I would take some time to harden them and verify that they work as expected, so that we don't create a security hole.

My intention was to share reconciler resources which are common (e.g. the Bitnami Helm repository being in a public namespace). But I don't know if I am still a fan of this idea, since these policies rely on API calls which decrease cluster performance (API requests).

prometherion commented 2 years ago

It depends on the time since I'd like to publish the v0.1.2 before the next community call, expected in 2 weeks.

Do you think that's feasible according to your availability?

My intention was to share reconciler resources which are common (e.g. the Bitnami Helm repository being in a public namespace). But I don't know if I am still a fan of this idea, since these policies rely on API calls which decrease cluster performance (API requests).

I'm missing the context here, could you elaborate a bit more?

maxgio92 commented 2 years ago

@prometherion TL;DR: without point 2, a Tenant could bypass Capsule and modify system resources (e.g. in the kube-system Namespace) or other Tenants' resources.

Point 2 is important to avoid Tenant resources being reconciled with cluster-admin privileges. This would happen because we can't enforce reconciliation with impersonation of the default SA in the same namespace as the Kustomization (i.e. achievable with the kustomize-controller flag --default-service-account=<name>); instead, by default no impersonation is done, so the reconciliation would be performed with the default kustomize-controller SA and its cluster-admin privileges on the whole cluster.
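For reference, that kustomize-controller flag could be enabled with a bootstrap patch along these lines (a sketch: the flux-system layout follows the standard Flux bootstrap repository, and the gitops-reconciler SA name is an assumption):

```yaml
# Hypothetical flux-system/kustomization.yaml in the bootstrap repository,
# appending --default-service-account to the kustomize-controller args
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - target:
      kind: Deployment
      name: kustomize-controller
    patch: |
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --default-service-account=gitops-reconciler
```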

@oliverbaehler I'm going to test the PoC with the new patch verb support. In any case, I'd avoid providing support for cross-namespace Flux CR references and instead, as stated above, leverage the targetNamespace of Flux reconciliation-type resources to choose where to apply. PS: at least for this Capsule release :-)

Keep you posted.

maxgio92 commented 2 years ago

I wrote down these test definitions:

maxgio92 commented 2 years ago

TL;DR: The last two points can be achieved without policies, by also enabling the default SA impersonation feature of Flux's multi-tenancy lockdown. What is needed is the privilege for the Tenant Owner to impersonate itself, and support in Capsule Proxy for requests with impersonation headers.

Moreover, with this approach we remove the dependency on a further policy engine.

maxgio92 commented 2 years ago

Update: all the points are achieved. I updated the second-to-last one by removing the requirement that spec.kubeConfig must not be empty. This is not required now; the important part is that Capsule Proxy enables list operations for the Tenant GitOps Reconciler.

Even though not so elegant, the Tenant GitOps Reconciler communicates with the Capsule Proxy impersonating itself (the ServiceAccount's user), because we need to set mandatory default ServiceAccount impersonation on all reconciliation-type (Kustomization, HelmRelease) Flux CRs, for security reasons (more on this in the points above).

Nonetheless, this is transparent for the Tenant, who can omit spec.serviceAccountName and just specify spec.kubeConfig.

For this reason, impersonation support has been introduced into Capsule Proxy (see https://github.com/clastix/capsule-proxy/issues/215).

I'm going to prepare documentation for this scenario and propose some automations that could improve the UX for these GitOps-managed multi-tenancy scenarios.

bsctl commented 2 years ago

I wrote down these test definitions: ....

@maxgio92 Are these tests automated in e2e?

maxgio92 commented 2 years ago

@bsctl no, I'm not sure it would make much sense, as they wouldn't test Capsule itself but rather a use-case integration with an external project.

bsctl commented 2 years ago

@maxgio92 you're right, my bad.

maxgio92 commented 2 years ago

Hey @oliverbaehler, we released the guide for this scenario https://capsule.clastix.io/docs/guides/flux2-capsule 🥳

If you want to take a look and you notice any improvements or corrections, please let me know :-)

In any case, thank you a lot for the value you brought to this! 🙏🏻