pulumi / pulumi-kubernetes

A Pulumi resource provider for Kubernetes to manage API resources and workloads in running clusters
https://www.pulumi.com/docs/reference/clouds/kubernetes/
Apache License 2.0

[Helm] Service resource Output.apply does not wait for resource to actually exist #1215

Closed iridian-ks closed 4 months ago

iridian-ks commented 4 years ago

Problem description

I am using the Helm resource to deploy a Helm chart. I then grab a particular Service that the chart provides and want to read whatever IPs/NodePorts exist on that Service. Instead of waiting for the Service to exist, the .apply method returns right away with only the user input. Once a Service is deployed, Kubernetes updates its YAML with the IPs and NodePorts.

Errors & Logs

This is the FINAL TICK TOCK from the logs.

    debug: TICK TOCK
    debug: {'type': 'ClusterIP', 'ports': [{'protocol': 'TCP', 'target_port': 'http', 'port': 80.0, 'name': 'http'}, {'port': 443.0, 'name': 'https', 'protocol': 'TCP', 'target_port': 'https'}], 'selector': {'app.kubernetes.io/name': 'nginx', 'app.kubernetes.io/instance': 'toy'}, 'session_affinity': 'None', 'cluster_ip': '10.43.112.241'}

This looks like the output of line 17 in the example below (the service.spec.apply(_debug) call).

Affected product version(s)

v2.6.1

Reproducing the issue

Should be something like this:

import pulumi
import pulumi_kubernetes.helm.v3 as helm

def _debug(val):
    pulumi.debug("TICK TOCK")
    pulumi.debug(str(val))

def _write(val):
    with open("output.log", "a") as fd:
        fd.write(str(val))

def _cluster_ip(resources):
    _debug(resources)
    service = resources["v1/Service:default/toy-nginx"]
    _debug(service)

    service.spec.apply(_debug)

    # results in a fatal because node port is not set
    return service.spec.apply(lambda spec : spec["clusterIP"])

release = helm.Chart(
    release_name = "toy",
    config = helm.ChartOpts(
        namespace = "default",
        chart = "nginx",
        values = {
            "service": {
                "type": "ClusterIP",
            },
        },
        fetch_opts = helm.FetchOpts(
            repo = "https://charts.bitnami.com/bitnami"
        ),
    ),
)

cluster_ip = release.resources.apply(_cluster_ip)
cluster_ip.apply(_write)

Debug output will show that only the user input (YAML) is returned. I feel like the debug shouldn't run, because the apply should wait for the resource to exist before running, but it's clearly running during preview.

Suggestions for a fix

I'm not a Pulumi expert, but other Pulumi resources will definitely wait prior to running .apply until the resource actually exists.

EDIT:

The actual code I'm working with has too much context to paste here. I rewrote the example above to simplify what I'm trying to do. With this smaller example it seems like I'm getting the inverted issue, but I'm actually not entirely sure. All I know is that the final lambda and the _write callback never execute.

leezen commented 4 years ago

The panic seems odd and unexpected, but looking at your outputs, shouldn't you be trying to access spec['cluster_ip'] instead of spec["clusterIP"]? The Service resource should wait until it's provisioned before populating output properties per https://www.pulumi.com/docs/reference/pkg/kubernetes/core/v1/service/
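To illustrate the key mismatch, here is a minimal sketch in plain Python (no Pulumi engine involved). The dictionary is a trimmed copy of the spec printed in the debug log above, and `lookup` is a hypothetical helper introduced only for this illustration. It assumes the spec reaches the callback as an ordinary dict with snake_case keys, as the log suggests.

```python
# Trimmed copy of the spec from the debug log (snake_case keys).
spec = {
    "type": "ClusterIP",
    "session_affinity": "None",
    "cluster_ip": "10.43.112.241",
}

def lookup(spec, key):
    """Hypothetical helper: return spec[key], or a readable message if absent."""
    try:
        return spec[key]
    except KeyError:
        return f"<no key {key!r}; available: {sorted(spec)}>"

snake = lookup(spec, "cluster_ip")  # key as it appears in the log
camel = lookup(spec, "clusterIP")   # Kubernetes-style name, absent from the dict

print(snake)
print(camel)
```

With the camelCase name, a bare `spec["clusterIP"]` would raise a KeyError inside the apply callback rather than return the IP.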

iridian-ks commented 4 years ago

OK, yeah I fixed the clusterIP thing and now the output.log does in fact get the IPs written ... why doesn't it panic if it's set to clusterIP? Maybe I have typos in the other parts of my code that I don't know about because Pulumi is silently continuing? Going to investigate. I'll report back in a bit.

iridian-ks commented 4 years ago

@leezen thanks for the quick response. I dug further and found out exactly when issues come up.

import pulumi
import pulumi_kubernetes.helm.v3 as helm

def _debug(val):
    pulumi.debug("TICK TOCK")
    pulumi.debug(str(val))

def _split(val):
    return val.split(".")

def _cluster_ip(spec):
    return spec["cluster_ip"]

def _find_spec(resource_name, callback):
    def find_spec(resources):
        if resource_name not in resources:
            raise LookupError()

        return resources[resource_name].spec.apply(callback)

    return find_spec

class Component(pulumi.ComponentResource):
    def __init__(self, resource_name):
        super().__init__("toy:helm:Nginx", resource_name, {}, None)

        release = helm.Chart(
            release_name = "toy",
            config = helm.ChartOpts(
                namespace = "default",
                chart = "nginx",
                values = {
                    "service": {
                        "type": "ClusterIP",
                    },
                },
                fetch_opts = helm.FetchOpts(
                    repo = "https://charts.bitnami.com/bitnami"
                ),
            ),
        )

        cluster_ip = release.resources.apply(
            _find_spec("v1/Service:default/toy-nginx", _cluster_ip)
        )

        # works
        cluster_ip.apply(_split)

        # causes panic
        self.register_outputs(dict(
            ip_pieces=cluster_ip.apply(_split)
        ))

Component("toy")

Here's the relevant stack trace:

      File "/usr/local/lib/python3.7/site-packages/pulumi/output.py", line 176, in run
        transformed: Input[U] = func(value)
      File "./__main__.py", line 12, in _cluster_ip
        return spec["cluster_ip"]
    KeyError: 'cluster_ip'

This happens during the preview. It shouldn't be trying to run this during the preview though.
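As a stopgap, the callbacks can be written defensively so that a spec without a populated cluster_ip does not abort the preview. This is only a sketch under the assumption (suggested by the stack trace) that the callback may receive a spec containing just the chart's rendered inputs. The names `cluster_ip_or_none` and `split_ip` are hypothetical. It hides the symptom and does not make apply wait for the Service.

```python
def cluster_ip_or_none(spec):
    """Return the Service's cluster IP, or None if it is not populated yet."""
    if not isinstance(spec, dict):
        return None
    return spec.get("cluster_ip")

def split_ip(ip):
    """Split a dotted IP into its pieces; return an empty list when there is no IP."""
    return ip.split(".") if ip else []

# Spec as it might look before the Service exists (no cluster_ip yet).
rendered_spec = {"type": "ClusterIP", "ports": [{"port": 80, "name": "http"}]}
# Spec as printed in the debug log once the Service exists.
live_spec = {"type": "ClusterIP", "cluster_ip": "10.43.112.241"}

preview_pieces = split_ip(cluster_ip_or_none(rendered_spec))
live_pieces = split_ip(cluster_ip_or_none(live_spec))
```

In the component above, these would replace _cluster_ip and _split in the apply chain. Downstream resources would still need to tolerate the empty result.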

iridian-ks commented 4 years ago

This also applies when using the value for other resources. I'm doing something similar to register DNS in Route53 and getting the same thing.

iridian-ks commented 4 years ago

OK, so the original title might be accurate? Just as a silly example, here's the new reproduction code, minus the first 26 lines, which are the same as in the previous example:

class Component(pulumi.ComponentResource):
    def __init__(self, resource_name, ip=None):
        super().__init__("toy:helm:Nginx", resource_name, {}, None)
        opts = pulumi.ResourceOptions(parent=self)

        release = helm.Chart(
            release_name = resource_name,
            config = helm.ChartOpts(
                namespace = "default",
                chart = "nginx",
                values = {
                    "ip": ip,
                    "service": {
                        "type": "ClusterIP",
                    },
                },
                fetch_opts = helm.FetchOpts(
                    repo = "https://charts.bitnami.com/bitnami"
                ),
            ),
            opts = opts,
        )

        self.resources = release.resources

component = Component("toy")
cluster_ip = component.resources.apply(
    _find_spec("v1/Service:default/toy-nginx", _cluster_ip)
)
# causes panic
Component("demo", cluster_ip)

      File "/usr/local/lib/python3.7/site-packages/pulumi/output.py", line 176, in run
        transformed: Input[U] = func(value)
      File "./__main__.py", line 16, in _cluster_ip
        return spec["cluster_ip"]
    KeyError: 'cluster_ip'
    error: an unhandled error occurred: Program exited with non-zero exit code: 1

It looks like if the pulumi.Output object is not used in another resource then the .apply methods will wait for the resource to exist, which is why the logging example worked. But if the Output object is used in another resource then the .apply methods are unwrapped immediately without waiting for the resource to exist. Is this how it's supposed to be?

leezen commented 4 years ago

No, in general, the callback to an apply is invoked when the output is known. The resource graph is constructed such that dependencies are tracked so that the callbacks are invoked when needed.
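The intended behavior can be sketched with a toy model built on asyncio futures. `ToyOutput` is a hypothetical stand-in written for this illustration, not Pulumi's Output class or its implementation. It only shows the contract described above: the callback runs once the underlying value is known, and not before.

```python
import asyncio

class ToyOutput:
    """Toy stand-in for an output value: wraps an awaitable result."""

    def __init__(self, future):
        self._future = future

    def apply(self, func):
        async def run():
            value = await self._future  # suspend until the value is known
            return func(value)
        return ToyOutput(asyncio.ensure_future(run()))

    def future(self):
        return self._future

async def demo():
    loop = asyncio.get_running_loop()
    pending_spec = loop.create_future()  # stands in for a Service not yet created
    calls = []

    def cluster_ip(spec):
        calls.append(spec)
        return spec["cluster_ip"]

    derived = ToyOutput(pending_spec).apply(cluster_ip)

    await asyncio.sleep(0)       # give the event loop a turn
    called_early = bool(calls)   # the callback should not have run yet

    pending_spec.set_result({"cluster_ip": "10.43.112.241"})  # "Service now exists"
    result = await derived.future()
    return called_early, result

called_early, result = asyncio.run(demo())
print(called_early, result)
```

The KeyError in the stack traces above is what it looks like when this contract is broken, with the callback receiving a spec that has no cluster_ip yet.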

@lblackstone I would expect cluster_ip to depend on component.resources['foo_service'].spec and the callback to not be invoked until the Service exists? Does that sound right to you or am I missing something?

lblackstone commented 4 years ago

We likely have a bug in the Python SDK. As you said, the apply should wait until the resources are ready.

iridian-ks commented 4 years ago

Thanks for looking at this @lblackstone . Do you think this will get fixed in the next release or so?

komalali commented 3 years ago

Related to https://github.com/pulumi/pulumi-kubernetes/issues/861

EliasGabrielsson commented 3 years ago

I am experiencing similar behavior. The .apply callback in the example below does not run after the real value of the cert is returned from AWS. The cert is being fetched, though, because rerunning pulumi up outputs the certificate.

Version:

pulumi v2.16.2

How to reproduce

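import pulumi
from pulumi import Output
from pulumi_aws import iot
# Assumption: `serialization` comes from the `cryptography` package.
from cryptography.hazmat.primitives import serialization

# `name` and `csr` are not shown in the original snippet; they are defined elsewhere.

def save_cert_to_disk(args):  # args[0]=certificate, args[1]=filename
    pulumi.debug("Save_cert_to_disk called. {}, {}".format(args[0], args[1]))

cert = iot.Certificate(
    name,
    csr=csr.public_bytes(serialization.Encoding.PEM).decode("utf-8"),
    active=True)

# Output.all passes multiple async values to save_cert_to_disk() as a list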
Output.all(cert.certificate_pem, "filename").apply(save_cert_to_disk)

lblackstone commented 1 year ago

You can use the depends_on property to work around this problem now. #1971 is tracking a fix that won't require this.

EronWright commented 4 months ago

The ability to wait for the resources to actually exist is fixed in Chart v4, using the ordinary depends_on option.