carvel-dev / kapp-controller

Continuous delivery and package management for Kubernetes.
https://carvel.dev/kapp-controller
Apache License 2.0

Package-level configuration and defaults for create, update, and delete strategies on resources #1189

Open st3v opened 1 year ago

st3v commented 1 year ago

This is a bit of a catch-all issue that touches on several things, all related to how kapp creates, updates, and deletes resources, and how this behavior is not necessarily clear to users who interact with Carvel packages rather than with kapp directly. It probably makes sense to treat this as an epic that should be refined and broken down into smaller, more concrete issues.

Problem Description

The following takes the Tanzu Crossplane package as a concrete example of a situation in which kapp's default strategies for creation and deletion of resources lead to issues:

  1. Crossplane is already installed in the cluster, along with the AWS provider. There are a number of external resources in AWS that have been provisioned via Crossplane (e.g. RDS instances, S3 buckets, etc.).
  2. User installs or updates to the latest version of TAP which newly introduces a Crossplane package. They do not realize that there is such a package now and hence do not explicitly exclude it via the TAP values file.
  3. Crossplane package fails to install because updates to certain properties on pre-existing resources fail (e.g. the immutable selector in the Crossplane controller deployment).
  4. User notices the installation failure for that package and subsequently excludes the package via their TAP values file.
  5. kapp removes all pre-installed Crossplane resources including all XRDs and Compositions.
  6. External resources in AWS are now orphaned and not managed by Crossplane anymore.

At the very core of the problem is the fact that kapp silently adopts pre-existing resources, and does so despite the error it throws in (3). To users interacting with tanzu package install, this behavior is not obvious at all. Hence, they do not realize that removing the failing PackageInstall will cause all pre-existing resources to be deleted.

Proposed Short-term Solution

For the particular case of the Crossplane package, we're currently planning to...

  1. Set --existing-non-labeled-resources-check via spec.deploy[0].kapp.rawOptions
  2. Annotate all Crossplane CRDs, Providers, and ProviderConfigs with kapp.k14s.io/delete-strategy=orphan
  3. Expose optional package values that allow users to configure (1) and (2). This way they can, for example, turn off (1) in order to reinstall the package after having installed and deleted it previously in the same cluster.
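
For illustration, items (1) and (2) could look roughly like the fragments below. This is only a sketch, not the actual Crossplane package configuration: the flag value `false` is an assumption (the list above does not state a value), the CRD name is merely an example, and in a Package CR the `deploy` block would sit under `spec.template.spec` rather than directly under `spec`.

```yaml
# (1) Fragment of an App spec: pass the kapp flag via rawOptions.
spec:
  deploy:
  - kapp:
      rawOptions:
      # Assumed value: disabling the check should make kapp fail on
      # pre-existing, non-labeled resources instead of adopting them.
      - "--existing-non-labeled-resources-check=false"
---
# (2) Annotate a resource so kapp leaves it in the cluster when the app is deleted.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: providerconfigs.aws.crossplane.io   # illustrative name only
  annotations:
    kapp.k14s.io/delete-strategy: orphan
# spec omitted for brevity
```

Item (3) would then amount to templating the flag and the annotation on package values, so that users can switch either one off.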

While this approach seems to be a good short-term solution for this particular package, one can argue that there's a need for a more general solution that addresses these kinds of problems at a higher level. After all, Crossplane isn't the only package that might run into issues like these. Another example is the cert-manager package.

Desired Long-term Outcomes

Think of the following list as suggestions for improvements to kapp-controller and kapp that came up during previous conversations.

alexbarbato commented 1 year ago

Thanks Stev!

renuy commented 1 year ago

@st3v, thanks for creating this issue. The details mentioned here are very helpful.

renuy commented 1 year ago

Adding this to the Prioritised backlog.