Epic: WYSIWYG Kubernetes Application Configuration

bgrant0607 commented 2 years ago

Most Kubernetes users are interested in configuring applications. That's the primary original purpose of Kubernetes, running containerized workloads. Even cluster services / add-ons are applications. GitOps is primarily focused on deploying applications, as well. Obviously it's the core use case for Helm.

So, what do we need to address in order to be able to handle applications in kpt?

[ ] #3118
[ ] #3119
[ ] #3210
[ ] #3125
[ ] #3280
[ ] We need to try set-namespace for this use case
[ ] We'll want a way to capture the Kubernetes deployment context automatically -- I've called this "mini-kubeconfig". It's also similar to the ArgoCD ApplicationSet target generator.
[ ] We need to figure out a reasonable variant-constructor pattern for the case of creating a specific application from a generic blueprint. Cluster services don't have this issue because there's just one per cluster usually.
[ ] https://github.com/GoogleContainerTools/kpt/issues/3155#issuecomment-1147571760: set-image, set-labels, etc. need to be able to specify their source locations so that ApplyReplacements isn't needed to use them.
[ ] More generally, we need a clear model for how to pass inputs to a package, especially deployable packages. #3396
[ ] We may want something similar to the Flux and ArgoCD image updaters to watch a container image repo and push updates for new images. I wonder if we could use or adapt one of those existing updaters.
[ ] New resource types in the Backstage plugin: Deployment, Service, Ingress, Gateway, GatewayClass, HTTPRoute, PersistentVolumeClaim, StatefulSet, DaemonSet, HorizontalPodAutoscaler. Support for external secrets. Prometheus Operator types. Types relevant to cluster add-ons, such as CustomResourceDefinition, ClusterRole, ClusterRoleBinding, MutatingWebhook, and ValidatingWebhook. Maybe others as we dig into some specific applications. Possibly also Istio resources and/or Argo Rollouts.
[ ] Support for commonly used recommended labels and well known annotations, such as kubectl.kubernetes.io/default-container
[ ] Support for prometheus annotations and OpenTelemetry environment variables
[ ] A function that generates RBAC Roles for the resource types in a package might be useful. Maybe there's something we can use from https://github.com/kubernetes-sigs/kubebuilder-declarative-pattern?
[ ] Linkage between the config UI and a live-state UI, such as the Kubernetes dashboard or, ideally, a Kubernetes backstage plugin
[ ] For cluster services / add-ons specifically, we'll likely want to use an app of apps pattern. In that case, we'd need to look at what we need to do to support RootSync and RepoSync in packages. For example, we'd probably want support for them in the backstage UI plugin. I can imagine needing a function to update pinned commits also.
[ ] Best practices for off-the-shelf blueprints. For example, last-mile customizations that are just general Kubernetes resource attributes and not specific to blueprint components could be omitted from the blueprints. They'd be handled in general authoring logic, such as in the backstage plugin and/or functions.
[ ] An approach for versioning off-the-shelf packages, similar to public helm charts. I insisted on sequential versioning in porch rather than semantic versioning in order to simplify the continuous deployment model. A concept of major version could be used to select the upstream blueprint revision stream. We'd eventually want to support rebase (#2548).
[ ] We will want to provision a namespace for the application prior to first deploying the application. That could be an interesting use case for dynamic dependencies, similar to crossplane provider package dependencies. As opposed to nested subpackages (#3343). https://fidelity.github.io/kraan/docs/design/ supports dependencies by "layer". Flux supports package-level dependencies: https://fluxcd.io/docs/components/kustomize/kustomization/#kustomization-dependencies.
[ ] A lifecycle annotation for enabling/disabling deployment of a resource, similar to local-config or the Config Sync manage annotation or a tombstone, but for the purpose of making blueprint resources optionally deployed, similar to the idea of disabling functions in the pipeline
[ ] We may want to support skaffold.dev config as well.
[ ] #3145: This is going to increase the surface area of the UI quite a bit. My guess is that some UX work will be required to make it less overwhelming.
- Lots of resource types, some of which, like pod, have lots of attributes
- Several resource cross references
- Multiple components
- Multiple deployments, such as for dev and prod
[ ] We may want to revisit versioning for off-the-shelf packages, particularly if we're going to build in more versioning functionality, as discussed in #2544.
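
To make the lifecycle-annotation item in the list above more concrete, here is a rough sketch of how a tool might filter out resources that a blueprint marks as not deployed. The annotation key `example.dev/deploy` and the function name `deployable` are hypothetical and chosen only for illustration. `config.kubernetes.io/local-config` is the existing local-config annotation mentioned in that item.

```python
# Hypothetical annotation key; not defined by kpt, Config Sync, or Kubernetes.
DEPLOY_ANNOTATION = "example.dev/deploy"
# Existing annotation marking resources that are local-only and never applied.
LOCAL_CONFIG_ANNOTATION = "config.kubernetes.io/local-config"


def deployable(resources):
    """Return only the resources that should be applied to the cluster."""
    selected = []
    for resource in resources:
        metadata = resource.get("metadata") or {}
        annotations = metadata.get("annotations") or {}
        if annotations.get(LOCAL_CONFIG_ANNOTATION) == "true":
            continue  # local-only config, never deployed
        if annotations.get(DEPLOY_ANNOTATION, "enabled") == "disabled":
            continue  # optional blueprint resource that is switched off
        selected.append(resource)
    return selected


package = [
    {"kind": "Deployment", "metadata": {"name": "example"}},
    {
        "kind": "HorizontalPodAutoscaler",
        "metadata": {
            "name": "example",
            "annotations": {DEPLOY_ANNOTATION: "disabled"},
        },
    },
]
print([r["kind"] for r in deployable(package)])
```

Defaulting to "enabled" keeps existing packages unchanged; only resources that opt out are skipped.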

My current opinion is that multi-cluster specialization and multi-cluster rollout are somewhat independent, but I may change that opinion as we dig into this more.

We plan to look at these common cluster services / add-ons as test cases:

[ ] Cert Manager
[ ] Nginx Ingress Controller
[ ] External DNS
[ ] Monitoring: Prometheus, AlertManager, Grafana, kube-state-metrics
[ ] Logging: ElasticSearch, Fluentd, Kibana

We should also try deploying all our own components: porch server and controllers, config sync, resource group controller, backstage.

At some point, we should also try the ghost application (chart, rendered) we looked at previously. That involved multiple components, so that's another case for dependencies and/or app of apps or static subpackages or dynamic dependencies. It's kind of unusual in that it's an off-the-shelf app rather than a bespoke app or off-the-shelf cluster component or off-the-shelf app platform like knative, kubevela, spark, kubeflow, etc.

Once we figure out how to natively handle applications, we can look into automating helm chart import, rendering and patching helm charts, and so on.

@selfmanagingresource @justinsb @droot

bgrant0607 commented 2 years ago

A few scenarios to try:

Write application configuration from scratch for a specific application (bespoke and off the shelf)
Publish the configuration for an off-the-shelf application
Deploy an off-the-shelf application
Deploy a group of off-the-shelf cluster add-ons
Customize a bespoke application configuration for multiple environments
Promote a new application image across environments
Create a generic application blueprint for an org for a typical kind of application
Customize a generic application blueprint for an org for a typical kind of application
Per-cluster customization for a bespoke app in addition to per-environment customization
Migrate from a helm chart, plain yaml, kustomize, cdk8s

justinsb commented 2 years ago

I agree that this is the direction that our users want us to go in, and I think this is a great list. We've started contributing some dogfooding PRs, which enable us to explore and prioritize the list of items you've identified here.

bgrant0607 commented 2 years ago

Not that we're lacking for example applications to try, but here's another: https://github.com/GoogleCloudPlatform/bank-of-anthos

It uses plain yaml, has a skaffold config, and has separate directories for prod and dev configs.

There's also: https://github.com/GoogleCloudPlatform/microservices-demo

bgrant0607 commented 2 years ago

An example from slack: https://github.com/treactor/treactor-helm https://github.com/treactor/treactor-kpt https://github.com/treactor/treactor-kpt-functions

bgrant0607 commented 2 years ago

I took another look at the kubernetes dashboard, portainer, k8syaml.com, k8od.io, lens, monokle, octant, the GKE UI, the Ambassador Labs clickops experience, and Humanitec.

If starting from scratch, starting with required fields makes sense to me. For a blueprint, we can stick with "example" for the names. In most cases other than the container image we could probably provide some defaults, such as labels, selectors, and ports.
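
As a minimal sketch of the "required fields plus defaults" idea: the only real input is the container image, and the name, labels, selector, and replica count are defaulted. The function name `minimal_deployment` and the image `nginx:1.25` are illustrative assumptions, not part of any existing tool.

```python
import json


def minimal_deployment(image, name="example", port=None):
    """Build an apps/v1 Deployment from the one required input, the image."""
    # One label set is reused for metadata, the selector, and the pod template.
    labels = {"app.kubernetes.io/name": name}
    container = {"name": name, "image": image}
    if port is not None:
        container["ports"] = [{"containerPort": port}]
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {"containers": [container]},
            },
        },
    }


print(json.dumps(minimal_deployment("nginx:1.25", port=80), indent=2))
```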

If we want lots of defaults even in the blueprint authoring experience, I think the best way to provide those is with an upstream package.

After required fields, we should allow adding and editing of arbitrary fields, but we may want to group them by topic, such as scheduling or security, and sort them by frequency of use.

Dealing with multiple resources together should provide opportunities for autocompletion. For instance, the service selector could be the same as the deployment's selector by default, ports could be defaulted, etc., similar to kubectl run --expose.
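
A rough sketch of that kind of defaulting, assuming the Deployment uses only `matchLabels` in its selector (a selector with `matchExpressions` cannot be mirrored into a Service). The helper name `service_for` and the `port-<number>` naming scheme are made up for this example.

```python
def service_for(deployment, service_type="ClusterIP"):
    """Derive a Service whose selector and ports mirror a Deployment."""
    pod_spec = deployment["spec"]["template"]["spec"]
    ports = []
    for container in pod_spec.get("containers", []):
        for p in container.get("ports", []):
            number = p["containerPort"]
            ports.append({
                "name": p.get("name", f"port-{number}"),
                "port": number,
                "targetPort": number,
                "protocol": p.get("protocol", "TCP"),
            })
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": deployment["metadata"]["name"]},
        "spec": {
            "type": service_type,
            # Default: reuse the Deployment's selector labels unchanged.
            "selector": dict(deployment["spec"]["selector"]["matchLabels"]),
            "ports": ports,
        },
    }


deployment = {
    "metadata": {"name": "example"},
    "spec": {
        "selector": {"matchLabels": {"app": "example"}},
        "template": {
            "metadata": {"labels": {"app": "example"}},
            "spec": {"containers": [
                {"name": "example", "image": "example",
                 "ports": [{"containerPort": 8080}]},
            ]},
        },
    },
}
service = service_for(deployment)
print(service["spec"])
```

The user could still edit the generated selector and ports afterwards; the point is only that the first draft needs no typing.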

Adding a ConfigMap could optionally mount it as a volume or inject the contents as environment variables. We might be able to guess which by looking at its contents. We may also want to be able to upload files and convert them to ConfigMaps using a function (#3119).
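
One possible heuristic for the guess, purely as an assumption: if every key looks like an environment variable name and every value is a single line, inject the ConfigMap as environment variables; otherwise mount it as a volume. The names `guess_usage` and `attach` and the default mount path are hypothetical.

```python
import re

# Keys that are valid shell-style environment variable names.
ENV_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def guess_usage(configmap):
    """Guess 'env' or 'volume' from the shape of a ConfigMap's data."""
    data = configmap.get("data") or {}
    looks_like_env = bool(data) and all(
        ENV_NAME.match(key) and "\n" not in value
        for key, value in data.items()
    )
    return "env" if looks_like_env else "volume"


def attach(configmap, container, pod_spec, mount_path="/etc/config"):
    """Wire the ConfigMap into a container in place, by guessed usage."""
    name = configmap["metadata"]["name"]
    if guess_usage(configmap) == "env":
        container.setdefault("envFrom", []).append(
            {"configMapRef": {"name": name}})
    else:
        # Assumes the ConfigMap name is also usable as a volume name.
        pod_spec.setdefault("volumes", []).append(
            {"name": name, "configMap": {"name": name}})
        container.setdefault("volumeMounts", []).append(
            {"name": name, "mountPath": mount_path})


settings = {"metadata": {"name": "app-settings"},
            "data": {"LOG_LEVEL": "debug"}}
files = {"metadata": {"name": "app-files"},
         "data": {"nginx.conf": "server {\n}\n"}}
print(guess_usage(settings), guess_usage(files))
```

A UI would presumably show the guess as a preselected option rather than apply it silently.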

Our rule of thumb is that single values could just be edited in place. We'll need to think about how to direct users to those values that may need to be modified (#3145). We should identify cases where values need to be propagated to multiple places, which may suggest we need functions.

bgrant0607 commented 2 years ago

I thought about just optimizing for starting blueprints from blueprints, but that would create a chicken and egg scenario of how to create base blueprints. We could potentially provide some, but it still feels unsatisfying. The k8syaml experience does feel like this, though. In combination with the ability to select resources to enable/disable, such as Service, Ingress, and HPA, it could enable selecting from base blueprints for stateless apps, stateful apps (StatefulSet, PVC, and headless Service), and daemons.

bgrant0607 commented 2 years ago

The GKE Deploy experience starts with the container image, which it can autocomplete from GCR or AR.

It allows, but doesn't require, specification of the entry point and inline env vars.

Next it supports other configuration, which all has default values: name, namespace, labels. The labels are used for the pod template and selector.

It also creates an HPA, but not Service or Ingress, which I'd add options for. The Services and Ingresses page provides an option to select LoadBalancer Services to create an Ingress for.

From the details of a specific Deployment in the GKE UI, once it has already been created, there is a list of actions: autoscale (creates HPA), scale (edit replicas and resources), expose (create service), and update the image and update strategy knobs.

I'd also add advanced options, to expose the rest of the attributes, rather than just falling back to yaml editing. We possibly could do this via a generic form editor, which we will need to handle arbitrary CRDs: https://github.com/GoogleContainerTools/kpt-backstage-plugins/issues/68. For organizing a large number of options, I like the way the GKE cluster creation page breaks down options into groups of related attributes. k8syaml.com does a flavor of this also.

Here's an Openshift UI example: https://www.youtube.com/watch?v=jBDmX85IjLM

bgrant0607 commented 2 years ago

Some projects have embraced the struct-constructor approach.

gimlet.io builds a UI on top of this chart (and any chart with jsonschema for the values): https://github.com/gimlet-io/onechart/tree/master/charts/onechart/templates https://gimlet.io/concepts/onechart-concepts/

The UI can support other charts that provide JSON Schema, though.

kapitan similarly takes a one-generator approach, but uses jsonnet rather than helm templates, and supports multiple components, kind of like helmfile: https://github.com/kapicorp/kapitan-reference/blob/master/components/generators/kubernetes/README.md https://github.com/kapicorp/kapitan-reference/tree/master/lib

Effectively these are both thin abstractions over Kubernetes types that enable representation of resources as maps of attributes.

Kapitan supports some kustomize-like features, such as setting common values ("global defaults") across components.

Kapitan normalizes the configuration experience across multiple different tools and generated artifacts, but AFAICT lacks an explicit schema for the inventory. It aims to simplify human authoring, but does not aim to support automation above Kapitan.

https://github.com/zalando-incubator/stackset-controller does support automation on top because it's implemented using CRDs. https://github.com/clastix/capsule takes a similar all-in-one resource approach for Namespace provisioning.

johnbelamaric commented 2 years ago

There are certain common operations that you may want to do on an application deployment but not build into the base blueprint. For example, creating a PodDisruptionBudget or HPA. It would be useful to have functions to generate these and other similar "decorations" on a deployment.
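
For illustration only, a "decoration" generator for a PodDisruptionBudget might look something like this. It assumes the Deployment selector uses `matchLabels`; the function name `pdb_for` and the default of `maxUnavailable: 1` are assumptions rather than an existing kpt function.

```python
def pdb_for(deployment, max_unavailable=1):
    """Generate a policy/v1 PodDisruptionBudget that covers a Deployment."""
    metadata = deployment["metadata"]
    pdb = {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": metadata["name"]},
        "spec": {
            "maxUnavailable": max_unavailable,
            # Select the same pods that the Deployment manages.
            "selector": {
                "matchLabels": dict(
                    deployment["spec"]["selector"]["matchLabels"]),
            },
        },
    }
    # Keep the decoration in the same namespace when one is set.
    if metadata.get("namespace"):
        pdb["metadata"]["namespace"] = metadata["namespace"]
    return pdb


deployment = {
    "metadata": {"name": "example", "namespace": "example"},
    "spec": {"selector": {"matchLabels": {"app": "example"}}},
}
print(pdb_for(deployment))
```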

johnbelamaric commented 2 years ago

Just to gather things in one place, a couple other examples for package dependencies:

1) Cloud provider SAs, KSAs, and associated workload identity setup (see https://github.com/GoogleContainerTools/kpt/pull/3388#discussion_r925126665)
2) Cluster Proportional Autoscaler (https://github.com/kubernetes-sigs/cluster-proportional-autoscaler)

johnbelamaric commented 2 years ago

> There are certain common operations that you may want to do on an application deployment but not build into the base blueprint. For example, creating a PodDisruptionBudget or HPA. It would be useful to have functions to generate these and other similar "decorations" on a deployment.

One issue will be providing guidance on when should we just expect the user to add a resource and not provide any particular help, and when we should use a function to generate the resource, and whether that function should just be run imperatively or should be a declarative function.

johnbelamaric commented 2 years ago

Deployment of packages across cloud providers is another common thing we may want to see if we can help with. In some cases, this can be handled with separate packages that are dependencies as discussed above with cloud provider SAs, KSAs, and workload identity. But there are other provider-specific tweaks, often controlled by annotations. For example creating a LoadBalancer service can be scoped to the local VPC in GKE with a special annotation (https://cloud.google.com/kubernetes-engine/docs/how-to/internal-load-balancing). Presumably similar concepts exist in other cloud providers. Is this just left as an exercise for the author and/or deployer, or is there some assistance we can provide?
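
One form such assistance could take is a small transformation keyed by provider. The sketch below is an assumption, not an existing function. The GKE annotation is the one described on the page linked above; the EKS and AKS keys are the ones I believe those providers document, and should be checked against their docs before use. The provider names and `make_internal` are hypothetical.

```python
import copy

# Provider-specific annotations for an internal load balancer.
# Treat the keys as assumptions and verify them against provider docs.
INTERNAL_LB_ANNOTATIONS = {
    "gke": {"networking.gke.io/load-balancer-type": "Internal"},
    "eks": {"service.beta.kubernetes.io/aws-load-balancer-internal": "true"},
    "aks": {"service.beta.kubernetes.io/azure-load-balancer-internal": "true"},
}


def make_internal(service, provider):
    """Return a copy of a LoadBalancer Service scoped to the internal network."""
    if service.get("spec", {}).get("type") != "LoadBalancer":
        return service  # nothing to do for other Service types
    result = copy.deepcopy(service)
    annotations = result.setdefault("metadata", {}).setdefault(
        "annotations", {})
    annotations.update(INTERNAL_LB_ANNOTATIONS[provider])
    return result


service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "example"},
    "spec": {"type": "LoadBalancer", "ports": [{"port": 80}]},
}
print(make_internal(service, "gke")["metadata"]["annotations"])
```

Keeping the provider table as data would let the same package be specialized per cloud without forking it.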

bgrant0607 commented 2 years ago

There are a lot of best-practice validators as well as security policy enforcement tools. It would be great if we could show not just validating those practices, but applying them -- make it so.

datree.io is one such tool, as shown in Viktor's video. Here's a video specifically about that: https://www.youtube.com/watch?v=3jZTqCETW2w.

Obviously there's gatekeeper, which has a mutation mode now.

And: https://github.com/instrumenta/kubeval https://github.com/yannh/kubeconform https://github.com/stackrox/kube-linter https://github.com/armosec/kubescape

Here's a list of tools: https://thechief.io/c/editorial/kubernetes-yaml-enforcing-best-practices-and-security-policies-cicd-and-gitops-pipelines/

bgrant0607 commented 2 years ago

@johnbelamaric

Regarding "decorations": I agree that's a reasonable concept. kubectl has some operations, like expose (creates a Service) and autoscale (creates HPA), that are designed with that philosophy. Ingress could be similarly generated for a Service of the appropriate type. Other resources that may be referenced from a Pod could be similar, such as ServiceAccounts, ConfigMaps (probably via generation), Secrets (via external secrets), and PVCs. These could have full sub-flows or we could add the reference, generate a minimal resource, possibly using a function, then steer the user to go edit the generated resources after, such as with a mechanism similar to the flagging mechanism used in the prototype, if any required values need to be checked or provided.

I'm pretty sure that resource creation should typically be imperative and interactive.

bgrant0607 commented 2 years ago

http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/create?namespace=_all

The kubernetes dashboard asks for the name, image, replicas, service type (none, internal, external), port (if service is selected).

Show advanced options expands: description (added as an annotation -- that's nice), labels, namespace, image pull secret, cpu and memory requests, command and args, environment variables, and whether to run as privileged. Still not all the options.

It puts documentation next to each form field, which is nice.

It also supports copy/pasting yaml and uploading from a file.

bgrant0607 commented 2 years ago

Discussion about a "default" pod spec: https://twitter.com/BretFisher/status/1550326044577730560

bgrant0607 commented 2 years ago

I could imagine wanting different UX for "create a blueprint for a specific off-the-shelf application", such as cert-manager, and "create a blueprint for a category of similar applications", such as Spring Boot apps, and maybe a "just deploy my app across multiple environments" scenario.

We've observed that, with a few exceptions (e.g. Prometheus, ElasticSearch), helm charts (mainly) and other config formats are much more widely used than Operators for running off-the-shelf applications. Of course, the Operators themselves somehow also need to be installed, but this suggests that Operators are not the main alternative.

bgrant0607 commented 2 years ago

We'll also want the UI guidance for creating and editing deployments to be different than for creating a blueprint.

bgrant0607 commented 2 years ago

This post describes a couple concrete application change scenarios: https://medium.com/kapitan-blog/why-your-kubernetes-configuration-strategy-is-broken-c54ff3fdf9c3

johnbelamaric commented 2 years ago

> We'll also want the UI guidance for creating and editing deployments to be different than for creating a blueprint.

I would go so far as to say that the UI should focus on the "light authoring" workflows that are consumption-oriented. For "from scratch" package authoring, I think lower-level, CLI-based tooling will allow authors to use the IDEs and other tools of their choice.

bgrant0607 commented 2 years ago

@johnbelamaric

Serious question: In the consumption-oriented UX, what is the delightful "wow" experience we'd be aiming for, compared to a UI form for entering helm chart values?

Also:

- ~Zero CaD-friendly kpt blueprints are available currently. This means every new kpt user will need to author blueprints as the first thing they do.
- Existing CLI- and IDE-based tools aren't great for from-scratch authoring. See the hacky kube-gen.sh script, for instance. And kubectl create only has implementations for a handful of the resource types of interest. IMO a big part of the value prop of CaD is making config authoring easier. Should we invest in CLI-based tools to reduce authoring friction now?
- We're currently focused on off-the-shelf packages. Ok. Let's say that is the focus for now rather than bespoke workloads. We're converting helm charts, not writing the configurations from scratch. Users probably would do that, also. Do we want to focus specifically on the migration experience? Eventually we'll have to, but I was hoping to demonstrate the core value first.

johnbelamaric commented 2 years ago

> Serious question: In the consumption-oriented UX, what is the delightful "wow" experience we'd be aiming for, compared to a UI form for entering helm chart values?

I think the "wow" is captured in "light authoring". Taking an off-the-shelf package, tweaking it with the "decorators" we have been discussing - like adding a PDB, or enabling/disabling TLS on an Ingress, adding an HPA, or even just tweaking a few fields here and there without any need for "package inputs" or similar rigidity. The magic to me is "I can make a change without changing the code of the upstream templates" - because of course we don't have "templates". The ability to diverge from upstream but still maintain the connection is powerful.

Full package authoring is done more rarely, by a smaller set of users. Those are also users more deeply soaked in config management, etc. Deriving, tweaking, and otherwise customizing packages is done by many users and we should have a much lower threshold of knowledge needed to perform those actions. That light authoring generally won't require creation of new functions, for example, but instead the discovery and execution of existing functions. I think we can get a significant "wow" from a broader audience by focusing on those.

> ~Zero CaD-friendly kpt blueprints are available currently. This means every new kpt user will need to author blueprints as the first thing they do.

There won't be until we (the kpt community) produce probably at least two dozen good examples and show the power.

> Existing CLI- and IDE-based tools aren't great for from-scratch authoring. See the hacky kube-gen.sh script, for instance. And kubectl create only has implementations for a handful of the resource types of interest. IMO a big part of the value prop of CaD is making config authoring easier. Should we invest in CLI-based tools to reduce authoring friction now?

I think so, but not necessarily "general purpose". What I mean here is that the UX for "full package authoring" should allow those authors to come with their existing tools and integrate with our tools that focus on the kpt package authoring aspects. For example, @justinsb and @droot mentioned yesterday wanting some automated support in "dehydrating" packages. So, they can work on manifests in their test clusters, then run the tool to make it back into an abstract package. These are kpt CLI commands or adjacent tools to support that authoring process.

> We're currently focused on off-the-shelf packages. Ok. Let's say that is the focus for now rather than bespoke workloads. We're converting helm charts, not writing the configurations from scratch. Users probably would do that, also. Do we want to focus specifically on the migration experience? Eventually we'll have to, but I was hoping to demonstrate the core value first.

Yeah, this one I am more ambivalent on. I do hear it from Nephio folks too. And given the massive investment in Helm charts it makes sense. But I still would focus on the previous point first, as then the workflow can be "render helm chart, then use those tools from the previous point".

bgrant0607 commented 2 years ago

We need distinct off-the-shelf package and bespoke app tracks. (I defined these terms in https://github.com/kubernetes/design-proposals-archive/blob/main/architecture/declarative-application-management.md)

Examples of bespoke app deployment: kubectl run, GKE UI, Kubernetes dashboard, Skaffold, Openshift UI (https://www.youtube.com/watch?v=jBDmX85IjLM), Ambassador Labs ClickOps over GitOps, Gimlet's OneChart, Kapitan, tanka, cdk8s, etc. Probably most Kubernetes deployment and CI/CD tools. In kpt, a blueprint is likely needed to promote across environments, not unlike a kustomize base or helm chart.

For off-the-shelf apps/components, those are predominately helm charts. They may be deployed via app catalogs, such as artifacthub.io, kubeapps.dev, plural.sh, Rancher's app marketplace, Lens, etc., or just specify the chart and values, as in the ArgoCD UI. The end consumer, the deployer, typically would just provide values.

What I showed in my talk was a mix of the platform team adaptation experience and the deployer experience. The "light authoring" would likely happen at the adaptation stage -- bringing an off-the-shelf component into an org and operationalizing it. A lot of charts are already prepared for that. For instance: https://github.com/cert-manager/cert-manager/blob/master/deploy/charts/cert-manager/templates/cainjector-deployment.yaml already parameterizes resources, security context, node selector, affinity, tolerations, pull policy, priority class, service account, etc. So if the chart is well maintained, forking may not be necessary. Of course, there are always cases. A recent one is converting Ingress to Gateway.

bgrant0607 commented 2 years ago

For the bespoke track, we previously used Spring Boot as a canonical class of application.

Here's a trivial tutorial that uses kubectl to deploy: https://codelabs.developers.google.com/codelabs/cloud-springboot-kubernetes#0

Here's one that has yaml to copy/paste: https://codersee.com/deploy-spring-boot-application-to-gke/

Here's one that also uses copy/paste, and includes istio: https://dzone.com/articles/deploy-a-spring-boot-microservice-architecture-to At the point where it needs to update an image, it says: "Update the deployment.yml file to reflect the new image name (line 28 in the file)". That's an opportunity for the set-image function.
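
To show the kind of edit that replaces "change line 28 by hand", here is a small sketch of an image update applied to parsed resources. It is not the implementation of the kpt set-image function, only an approximation of the idea. It handles workloads with a `spec.template.spec` pod template, and the image names are made up.

```python
def set_image(resources, container_name, new_image):
    """Set the image of every container with the given name.

    Returns the number of containers that were changed.
    """
    changed = 0
    for resource in resources:
        pod_spec = (resource.get("spec", {})
                    .get("template", {})
                    .get("spec", {}))
        for key in ("containers", "initContainers"):
            for container in pod_spec.get(key, []):
                if (container.get("name") == container_name
                        and container.get("image") != new_image):
                    container["image"] = new_image
                    changed += 1
    return changed


resources = [{
    "kind": "Deployment",
    "metadata": {"name": "example"},
    "spec": {"template": {"spec": {"containers": [
        {"name": "app", "image": "gcr.io/example/app:v1"},
        {"name": "sidecar", "image": "gcr.io/example/sidecar:v1"},
    ]}}},
}]
print(set_image(resources, "app", "gcr.io/example/app:v2"))
```

Because the change is keyed on the container name rather than a line number, it keeps working when the file is reformatted or reordered.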

Here's an example using helm and argocd: https://awstip.com/deploying-java-spring-app-to-gke-using-argocd-d84837113ce9 https://github.com/kusznerr/rafal-app-deployments/tree/main/helm/rafal-app-gke-prod/templates

Here's an example that I don't think has deployment config: https://github.com/gothinkster/spring-boot-realworld-example-app

There is a tool that generates basic deployment and service configs from source code annotations: https://dekorate.io/docs/spring-boot

Typical application config (hundreds of knobs): https://docs.spring.io/spring-boot/docs/current/reference/html/application-properties.html

bgrant0607 commented 2 years ago

Speaking of bespoke apps, Jenkins X has ways to promote across environments and such, and is using kpt in some capacity in jx project import and jx gitops upgrade. They are working to migrate to kpt v1. https://jenkins-x.io/ https://jenkins-x.io/v3/about/overview/projects/ https://jenkins-x.io/v3/devops/gitops/

It also uses kyaml in some commands to modify configuration in WYSIWYG style, which is nice, though it doesn't use KRM functions: https://jenkins-x.io/v3/develop/reference/jx/gitops/annotate/ https://github.com/jenkins-x-plugins/jx-gitops/blob/v0.7.27/pkg/cmd/annotate/annotate.go https://jenkins-x.io/v3/develop/reference/jx/gitops/namespace/ https://github.com/jenkins-x-plugins/jx-gitops/blob/v0.7.27/pkg/cmd/namespace/namespace.go https://jenkins-x.io/v3/develop/reference/jx/gitops/yset/ https://github.com/jenkins-x-plugins/jx-gitops/blob/v0.7.27/pkg/cmd/yset/yset.go

bgrant0607 commented 2 years ago

In the bespoke app track, we may want to try kubevela.io and knative, for comparison. Installing those could serve as examples of off-the-shelf apps.

bgrant0607 commented 2 years ago

Generation of skaffold.yaml: https://cloud.google.com/deploy/docs/using-skaffold/getting-started-skaffold#have_generate_your_skaffoldyaml

bgrant0607 commented 2 years ago

Though we are using off-the-shelf apps as example app configurations, an argument in favor of starting with bespoke apps for the UI-based demo is that we wouldn't need to build up a huge catalog of apps, helm is less dominant, the solution space is most fragmented, and UIs are at least sometimes used to author resources, which maybe has some benefit of familiarity. Obviously that's also where some users want higher-level abstractions, PaaSes, and so on. And CI/CD.

It also helps me to think about the user journeys, UX approach, and separation of concerns.

bgrant0607 commented 2 years ago

I started to sort out how I would manage the large surface area in the UI in the blueprint creation flow for a single bespoke application. There is more to do, but this is the working doc. https://docs.google.com/document/d/1JMLS3IMZB254arTpBK9BrlolTooTB-3sFC8nrP_EQms/edit

More guidance and structure will be needed than just forms that mirror the API spec. Pods have a large number of attributes.

We may also want flows that involve multiple resource types. Several are involved here that build on each other and reference each other.

And we'll need to decide where to leverage functions, as opposed to implementing functionality directly in the UI, and which functions should be used imperatively rather than being added to the Kptfile pipeline. The UI could depend on some well known functions, such as set-namespace.

bgrant0607 commented 2 years ago

vscode plugins we could compare with: https://thechief.io/c/editorial/kubernetes-and-vscode/ https://cloud.google.com/code/docs/vscode/yaml-editing https://code.visualstudio.com/docs/azure/kubernetes

kptdev / kpt

Epic: WYSIWYG Kubernetes Application Configuration #3351