pulumi / pulumi-kubernetes

A Pulumi resource provider for Kubernetes to manage API resources and workloads in running clusters
https://www.pulumi.com/docs/reference/clouds/kubernetes/
Apache License 2.0
402 stars, 115 forks

Support user-specified readiness/await logic #1260

Closed: lblackstone closed this issue 1 day ago

lblackstone commented 4 years ago

Problem description

We already provide readiness/await logic for common k8s resources today, but this doesn't adequately cover CustomResources since each one is different. For k8s operators in particular, it's common to specify a state and then wait for a particular condition to be met before further action is taken.

In many cases, checking for readiness requires checking a field in a k8s resource against a known value (e.g. "Ready"). It's possible to watch the k8s Event stream for updates, so we should provide some mechanism for users to specify Events + fields of interest and use that as a dependency in the Pulumi resource graph.
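A minimal sketch of the "check a field against a known value" part of this idea, in TypeScript. It does not cover watching the Event stream, and `getField` and `fieldEquals` are hypothetical helper names used only for illustration, not part of the provider:

```typescript
// Hypothetical helper: read a nested field using a dotted path such as "status.phase".
// Returns undefined if any segment along the path is missing.
function getField(obj: unknown, path: string): unknown {
  let current: unknown = obj;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Hypothetical helper: the resource counts as ready when the field equals the expected value.
function fieldEquals(obj: unknown, path: string, expected: unknown): boolean {
  return getField(obj, path) === expected;
}

// An illustrative resource shape; real CustomResources differ per operator.
const readyResource = { kind: "Example", status: { phase: "Ready" } };
const pendingResource = { kind: "Example", status: {} };

const isReadyResourceReady = fieldEquals(readyResource, "status.phase", "Ready");
const isPendingResourceReady = fieldEquals(pendingResource, "status.phase", "Ready");
```

A real implementation would re-run such a check each time the watched resource changes, until it passes or a timeout expires.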

Here's an example showing one possible way of implementing this support: https://gist.github.com/lblackstone/d2d1c29507685c0833612988945dedf3

Related issues: https://github.com/pulumi/pulumi-kubernetes/issues/912 https://github.com/pulumi/pulumi-kubernetes/issues/1056

marioapardo commented 3 years ago

Can the example be done in the Python SDK, or is it just for JS right now?

ghostsquad commented 2 years ago

any news on this front? I feel like this is really important functionality that was made easy in ArgoCD, as they allowed you to write some simple Lua to perform health checking.

https://argo-cd.readthedocs.io/en/stable/operator-manual/health/#custom-health-checks

I feel like if I could figure out even a temporary workaround to add health checks for CRDs, then I could use the Pulumi K8s Operator to deploy Pulumi k8s stack resources to k8s clusters in order based on health. That would essentially implement the ArgoCD App of Apps pattern within Pulumi and further simplify my workflows. I would need Pulumi to check the health of the stack it deployed (to a different namespace/cluster).

lblackstone commented 2 years ago

> any news on this front?

We're still thinking about how to support this generically in the provider.

Relatedly, we recently started factoring out the await logic into a separate library, which would make it possible to contribute checks for additional resource types. This will also require work on the provider side to make use of any new checks, but I expect that we will eventually support this more dynamically.

kralikba commented 2 years ago

Has there been any update to this? I understand that this is a major design challenge. In my use case, specifying an expected value for a field of the CRD would also be a simple solution. Right now I have to resort to too many hacks using dummy Outputs and a non-Pulumi Kubernetes client. (I'm using C#.)

ghostsquad commented 2 years ago

@kralikba what if, in the interim, each resource was wrapped in a component resource containing the object and a command resource (https://www.pulumi.com/registry/packages/command/api-docs/provider/) that called out to a CLI like https://github.com/cakehappens/seaworthy ?

Seaworthy is your post-apply validation that your K8s resources deployed correctly and are healthy.

This CLI is still just in a POC state.


I think the main problem with Pulumi allowing custom health checking is that in some languages, like Go, you can't serialize functions. The code that we all write just creates a schema of desired state, which is serializable because it is only a series of interconnected objects.
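To illustrate the general point (this uses plain JSON in TypeScript, and is not how the Pulumi engine itself serializes programs): a check written as a function is lost when the object is serialized, while a check expressed as plain data survives. The object shapes and field names here are made up for the example.

```typescript
// A readiness check expressed as a function (hypothetical shape).
const checkAsFunction = {
  kind: "Certificate",
  isReady: (obj: { status?: { phase?: string } }) => obj.status?.phase === "Ready",
};

// The same intent expressed as plain data (hypothetical shape).
const checkAsData = {
  kind: "Certificate",
  field: "status.phase",
  expected: "Ready",
};

// JSON.stringify omits function-valued properties, so the check logic is dropped.
const serializedFunction = JSON.stringify(checkAsFunction);

// Plain data round-trips unchanged.
const roundTripped = JSON.parse(JSON.stringify(checkAsData));

console.log(serializedFunction);
console.log(roundTripped);
```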

I too am interested to know what kinds of problems are blocking this issue.

jabbrwcky commented 1 year ago

Hey, I stumbled across this issue while looking for a way to wait on cert-manager having issued a certificate before proceeding.

I see the issue with serializing logic, but would it not at least be possible to express readiness constraints for resources that implement the recommended conditions for the status subresource?

The cert-manager Certificate CR does implement status this way, e.g.:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  annotations:
    pulumi.com/patchForce: 'true'
    pulumi.com/timeoutSeconds: '60'
  name: my-cert
  namespace: zammad-stage-system
spec:
  # ...
status:
  conditions:
    - lastTransitionTime: '2023-01-04T13:37:56Z'
      message: Certificate is up to date and has not expired
      observedGeneration: 1
      reason: Ready
      status: 'True'
      type: Ready
  notAfter: '2023-04-04T13:37:56Z'
  notBefore: '2023-01-04T13:37:56Z'
  renewalTime: '2023-03-20T13:37:56Z'
  revision: 1

It might just be passed as part of resource options as plain data:

const tls = new k8s.apiextensions.CustomResource(`${name}-certificate`, {
    apiVersion: "cert-manager.io/v1",
    kind: "Certificate",
    spec: { /* ... */ }
},{
    readyConditions: [{
        status: 'True',
        type: 'Ready',
        // reason: '...',
    }]
})

This way no function serialization would be required for static language support?
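As a rough sketch of how such plain-data `readyConditions` could be evaluated against `status.conditions`: the `isReady` function and the two interfaces below are hypothetical and exist only to show that the comparison needs no user-supplied function.

```typescript
// Shape of an entry in status.conditions, limited to the fields used here.
interface Condition {
  type: string;
  status: string;
  reason?: string;
  message?: string;
}

// Shape of one requested condition, mirroring the proposed readyConditions data.
interface ReadyCondition {
  type: string;
  status: string;
  reason?: string;
}

// Hypothetical check: every requested condition must be matched by some
// condition on the resource. A reason is only compared when it is requested.
// Note that an empty `required` list is treated as ready.
function isReady(conditions: Condition[] | undefined, required: ReadyCondition[]): boolean {
  const actual = conditions ?? [];
  return required.every((want) =>
    actual.some(
      (have) =>
        have.type === want.type &&
        have.status === want.status &&
        (want.reason === undefined || have.reason === want.reason),
    ),
  );
}

// Conditions taken from the Certificate status shown above.
const issuedConditions: Condition[] = [
  {
    type: "Ready",
    status: "True",
    reason: "Ready",
    message: "Certificate is up to date and has not expired",
  },
];

const issued = isReady(issuedConditions, [{ type: "Ready", status: "True" }]);
const notYetIssued = isReady([{ type: "Ready", status: "False" }], [{ type: "Ready", status: "True" }]);
const noStatusYet = isReady(undefined, [{ type: "Ready", status: "True" }]);
```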

lblackstone commented 1 year ago

No concrete updates to share at this time, but it might be worth thinking through a solution specific to CRs, since this scenario comes up frequently. It seems like the most common case there is checking the value of a known field, which seems doable without needing function serialization.

On a related note, we've recently done some exploratory work to fix https://github.com/pulumi/pulumi/issues/6948, which has some similarities to this problem.

jabbrwcky commented 1 year ago

For comparison, the Terraform Kubernetes provider supports a wait{} block (at least for manifests), which apparently uses something like JSONPath to reference fields of a resource.

lkt82 commented 9 months ago

Hi @lblackstone, any updates on this? :)

matanbaruch commented 9 months ago

+1

lblackstone commented 8 months ago

I'm not working directly in this area of Pulumi anymore, but I'd suggest that anyone interested add a 👍 to the issue. This is one of the signals that the team uses to prioritize work, so make sure to do that for any issues that are important to you!

EronWright commented 5 months ago

Short of having fully extensible await logic, would a useful feature be to support waiting for certain status conditions, e.g. Ready? I believe that kubectl wait and kubectl rollout status have functionality similar to this. And TF has wait conditions.

lblackstone commented 1 day ago

🎉

blampe commented 23 hours ago

For @strideynet @suraciii @mcavoyk @ghostsquad @joshlreese @marioapardo @kralikba @jabbrwcky @lkt82 @matanbaruch and anyone else watching this issue, we have user-defined await logic available as a pre-release (version 4.18.0-alpha.1724335757) and we would love to hear your feedback!

From the change log:

EXPERIMENTAL: The pulumi.com/waitFor annotation was introduced to allow for custom readiness checks. This overrides Pulumi's own await logic for the resource (however, the pulumi.com/skipAwait annotation still takes precedence).

The value of this annotation can take 3 forms:

  1. A string prefixed with jsonpath= followed by a JSONPath expression and an optional value.

    The JSONPath expression accepts the same syntax as kubectl get -o jsonpath={...}.

    If a value is provided, the resource is considered ready when the JSONPath expression evaluates to the same value. For example this resource expects its "phase" field to have a value of "Running":

    pulumi.com/waitFor: "jsonpath={.phase}=Running"

    If a value is not provided, the resource will be considered ready when any value exists at the given path, similar to kubectl wait --for jsonpath=.... This resource will wait until it has a webhook configured with a CA bundle:

    pulumi.com/waitFor: "jsonpath={.webhooks[*].clientConfig.caBundle}"

  2. A string prefixed with condition= followed by the type of the condition and an optional status. This matches the behavior of kubectl wait --for=condition=... and will wait until the resource has a matching condition. The expected status defaults to "True" if not specified.

    pulumi.com/waitFor: "condition=Synced"

    pulumi.com/waitFor: "condition=Reconciling=False"

  3. A string containing a JSON array of multiple jsonpath= and condition= expressions.

    pulumi.com/waitFor: '["jsonpath={.foo}", "condition=Bar"]'
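A small TypeScript sketch of how the three forms above might be produced in a Pulumi program. The `waitForValue` helper is hypothetical (it is not part of the provider); it only builds the annotation string, returning a single expression as-is and encoding several expressions as a JSON array. The Certificate name and the `condition=Ready` expression are illustrative.

```typescript
// Hypothetical helper: build the value of the pulumi.com/waitFor annotation
// from one or more jsonpath= / condition= expressions.
function waitForValue(...exprs: string[]): string {
  for (const expr of exprs) {
    if (!expr.startsWith("jsonpath=") && !expr.startsWith("condition=")) {
      throw new Error(`unsupported waitFor expression: ${expr}`);
    }
  }
  const [first, ...rest] = exprs;
  if (first === undefined) {
    throw new Error("at least one waitFor expression is required");
  }
  // Forms 1 and 2: a single expression. Form 3: a JSON array of expressions.
  return rest.length === 0 ? first : JSON.stringify(exprs);
}

// Plain arguments for a cert-manager Certificate that waits for its Ready condition.
const certificateArgs = {
  apiVersion: "cert-manager.io/v1",
  kind: "Certificate",
  metadata: {
    name: "my-cert",
    annotations: {
      "pulumi.com/waitFor": waitForValue("condition=Ready"),
    },
  },
  spec: { /* ... */ },
};

const single = waitForValue("jsonpath={.phase}=Running");
const multiple = waitForValue("jsonpath={.foo}", "condition=Bar");
```

Assuming the pre-release provider version mentioned above, `certificateArgs` could then be passed to `k8s.apiextensions.CustomResource` in the same way as the earlier example in this thread.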

zlepper commented 14 hours ago

Is there a way to specify these from the CRD side? We have a bunch of things we have implemented as CRDs and then exposed as Pulumi resources where this could be applicable, and I would love to do it on that side rather than having to add annotations to every usage of the resources.