[Python SDK/Google Cloud] Allow Pulumi created GCP project ID output syntax to be used as arg when creating other resources

Hello!

Vote on this issue by adding a 👍 reaction
If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

Using the Python SDK for Google Cloud (Classic) provider:

When you create a GCP project with gcp.organizations.Project(), you get an Output with the created GCP project's ID as output.id. This ID (IIRC) has the syntax projects/GCP_PROJECT_ID.

When enabling GCP services or assigning GCP IAM roles, you can use this output as their input argument for the project ID, as:

Having gcp_project_id = gcp.organizations.Project().id
gcp.projects.Service(NAME, project=gcp_project_id, service=API_NAME)
gcp.projects.IAMMember(NAME, project=gcp_project_id, role=ROLE, member=f'user:{EMAIL}')

So naturally, when you want to create a GCP resource, e.g. a GCS bucket, you'd want to use that gcp_project_id as an input argument:

bucket = gcp.storage.Bucket(NAME, name=BUCKET_NAME, location='US', project=gcp_project_id)

Yet this returns an error stating that the project is not recognized, because the project argument is expected to have the syntax GCP_PROJECT_ID instead of projects/GCP_PROJECT_ID.

What I also expect to achieve with this is that the GCP resources automatically depend on Pulumi having created the GCP project first.

I believe this happens because the resources expect the project argument to be a string provided manually by the user or taken from the Pulumi project config, in an environment where the GCP project already exists and Pulumi only deploys resources into it, rather than the same Pulumi stack creating one or more projects and then deploying resources into them.

Although I love Pulumi for now, I'm pretty novice and maybe I'm mistaken in one or many points. Please educate me if that's the case!

Thanks for taking your time to study this.

Affected area/feature

Python SDK
Google Cloud (Classic) provider

Transferring this to the GCP repo and someone will get back to you soon!

In the meantime, two quick notes:

Don't you need to pass in the project ID when creating the gcp.organizations.Project()?

Not sure if this is the best way to do it (how stable the format of Project.id is), but you could manipulate the value of the ID to get just the last part of it inside an apply:

 gcp_project_id = gcp.organizations.Project().id
 id = gcp_project_id.apply(lambda s: s.split("/")[-1]) # return the last part of the ID
 bucket = gcp.storage.Bucket(NAME, name=BUCKET_NAME, location='US', project=id)

Hi @Indavelopers, thank you for opening this issue.

In general, we follow upstream development closely for enhancements.

Please try @justinvp's suggestion and let us know if it resolves your use case for now.

@justinvp thank you very much for taking your time to respond to this personally, this was quite a surprise actually. I'm pretty in love with Pulumi as an IaC tool and you answering on GH added to that.

TL;DR - your suggestion fixed my problem, but I still find my FR interesting as the expected args may still be inconsistent

Pulumi version

(.venv) marcos@Ideapad-Indavelopers:~/gcp-training-projects/stacks$ pulumi version && pip list | grep pulumi
v3.116.1
pulumi     3.116.1
pulumi_gcp 7.23.0

Project

My project and its step-by-step instructions are public: https://github.com/Indavelopers/gcp-training-projects/tree/github_test It aims to provide a tool, a guide and, in the future, templates for automatically creating GCP projects with templated resources for running GCP workshops, trainings, etc., and giving students projects in which to follow exercises.

I explain this to make clear that, in this particular case, I'm not following the common case (AFAIK) of using Pulumi to deploy resources into an already created GCP project, but creating multiple GCP projects, then applying config and creating resources in them.

Testing the suggestion

@justinvp's suggestion:

gcp_project_id = gcp.organizations.Project().id
id = gcp_project_id.apply(lambda s: s.split("/")[-1]) # return the last part of the ID
bucket = gcp.storage.Bucket(NAME, name=BUCKET_NAME, location='US', project=id)

It worked. You can find it adapted to my needs in gcp_course_infra.py, line 15.

Therefore, line 31 is able to reference each GCP project ID when creating each GCP project's bucket in a loop.

Yet, in main.py, lines 45 & 52, I was able to reference GCP project IDs when enabling APIs and assigning IAM roles without having to convert the outputs ID (line 40) first.

So essentially the same argument, the GCP project ID, uses a different syntax in two situations: enabling services and assigning roles accept projects/GCP_PROJECT_ID, while creating resources requires GCP_PROJECT_ID. As a result, the syntax for GCP resources doesn't allow the GCP project ID Output to be used directly after the project is created with Pulumi.

I believe that internally stripping the projects/ prefix from projects/GCP_PROJECT_ID could be added safely, as the "/" character is not allowed in GCP project IDs.
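A minimal sketch of the normalization proposed above, written as a plain helper rather than as provider code. The name normalize_project_id is hypothetical and not part of pulumi_gcp, and the validation pattern assumes GCP's documented project ID rules (6 to 30 characters; lowercase letters, digits and hyphens; starting with a letter and not ending with a hyphen).

```python
import re

# Assumed GCP project ID rules: 6-30 chars, lowercase letters, digits, hyphens,
# starts with a letter, does not end with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")

_PREFIX = "projects/"


def normalize_project_id(value: str) -> str:
    """Accept 'projects/ID' or 'ID' and return the bare 'ID'.

    Hypothetical helper for illustration; not part of pulumi_gcp.
    """
    bare = value[len(_PREFIX):] if value.startswith(_PREFIX) else value
    # A leftover "/" can never be part of a valid ID, so it is rejected here.
    if not _PROJECT_ID_RE.fullmatch(bare):
        raise ValueError(f"not a valid GCP project ID: {value!r}")
    return bare


print(normalize_project_id("projects/my-training-01"))
print(normalize_project_id("my-training-01"))
```

In a Pulumi program a function like this would be passed to apply on the project's id Output, so the result is still an Output and keeps the dependency on the project.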

LMK if I could/you'd prefer me to add that as maybe a PR to Pulumi Python SDK for GCP?

@guineveresaenger thank you for your response, although I'm afraid I didn't fully understand it - what's the upstream development here? Should I open an issue or submit a PR elsewhere?

Thanks all,

While we're on that, I updated my code to show another issue I encountered with apply():

In my code: gcp_course_infra.py

# Create a GCS bucket
bucket_urls = []
for gcp_project_id, generated_project_id in zip(gcp_project_ids, generated_project_ids):
    # TODO
    print('LOGGING LOOP gcp_project_ids:')
    print(gcp_project_id)
    print(type(gcp_project_id))

    bucket = gcp.storage.Bucket(gcp_project_id, name=gcp_project_id + '-bucket-test', location='US', project=gcp_project_id)
    bucket_urls.append(bucket.url)

pulumi.export('bucket_urls', bucket_urls)

As I'm deploying resources to multiple GCP projects managed by Pulumi, in this case during a training/course/workshop, I want to name resources after their GCP project ID in a deterministic way, so that the commands in the instructions work and the trainer can locate those resources across projects.

This is why I tried to use the Pulumi output after creating a GCP project ID as the name of the Pulumi bucket resource and GCP bucket ID (adding -bucket-test here).

This unfortunately throws an error:

(.venv) marcos@Ideapad-Indavelopers:~/gcp-training-projects/stacks$ pulumi preview
Previewing update (gcp_course):
     Type                 Name                              Plan     Info
     pulumi:pulumi:Stack  gcp-training-projects-gcp_course           1 error; 13 messages

Diagnostics:
  pulumi:pulumi:Stack (gcp-training-projects-gcp_course):
    error: Program failed with an unhandled exception:
    Traceback (most recent call last):
      File "/home/marcos/gcp-training-projects/stacks/__main__.py", line 61, in <module>
        importlib.import_module(infra_script)
      File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 995, in exec_module
      File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
      File "/home/marcos/gcp-training-projects/stacks/gcp_course_infra.py", line 31, in <module>
        bucket = gcp.storage.Bucket(gcp_project_id, name=gcp_project_id + '-bucket-test', location='US', project=gcp_project_id)
                                                         ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
    TypeError: unsupported operand type(s) for +: 'Output' and 'str'

    LOGGING BEFORE gcp_project_ids:
    [<pulumi.output.Output object at 0x7f47c87114c0>, <pulumi.output.Output object at 0x7f47c8712450>, <pulumi.output.Output object at 0x7f47c8713440>]
    <class 'pulumi.output.Output'>
    LOGGING AFTER gcp_project_ids:
    [<pulumi.output.Output object at 0x7f47c866ade0>, <pulumi.output.Output object at 0x7f47c866b020>, <pulumi.output.Output object at 0x7f47c866b260>]
    <class 'pulumi.output.Output'>
    LOGGING LOOP gcp_project_ids:
    Calling __str__ on an Output[T] is not supported.
    To get the value of an Output[T] as an Output[str] consider:
    1. o.apply(lambda v: f"prefix{v}suffix")
    See https://www.pulumi.com/docs/concepts/inputs-outputs for more details.
    This function may throw in a future version of Pulumi.
    <class 'pulumi.output.Output'>

I'd expect apply() to actually "apply" the function and, for example, change the variable type from Output to str here (gcp_course_infra.py, line 15). That's not the case, so here I can't use the + operator to treat the IDs as strings.

I suppose I could use apply() for each resource, but this would be cumbersome and repetitive.

@justinvp do you happen to have any suggestion I could use here? Thanks!

Hey @Indavelopers thanks for your suggestions and sorry you are having issues here. It is quite unfortunate that the project_id parameter is indeed inconsistent between resources in their usage of the "projects/" prefix.

> LMK if I could/you'd prefer me to add that as maybe a PR to Pulumi Python SDK for GCP?

Appreciate your suggestion here but that may be somewhat involved - the GCP python sdk is generated from the GCP pulumi provider. In turn that uses the GCP TF provider: https://github.com/hashicorp/terraform-provider-google (This is what @guineveresaenger meant by "upstream"). The change you suggested would have to be made there and it'd have to be made in a non-disruptive way to code which depends on it. If you are interested, please raise an issue there and take a look at their documentation on contributing: https://googlecloudplatform.github.io/magic-modules/

> While we're on that, I updated my code to show another issue I encountered with apply():

As for your second question with apply here, I think the issue is that you're assuming the apply will resolve the output values for the rest of the python program, which is not the case. There's some additional docs here: https://www.pulumi.com/docs/concepts/inputs-outputs/apply/ which are likely better written but I'll try to summarise what is happening here:

The apply sets up a callback to be called after the output values are resolved - the function inside the apply will have the plain string values.

The loop outside of the apply is called before the output values are resolved - indeed the whole Python program is executed up front in order for the engine to find all the resources, learn about dependencies, etc. If your IDE supports it, the type hints can be very useful for building an intuition about which values are Outputs and which are resolved.
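To make the ordering concrete, here is a toy model of a deferred value. ToyOutput is a made-up class for illustration only; it is not Pulumi's Output and skips everything the real one does (dependency tracking, secrets, async resolution). It only shows why apply hands back another deferred value instead of a str.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class ToyOutput(Generic[T]):
    """Toy stand-in for a value that is only known later. NOT Pulumi's Output."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[T], None]] = []

    def apply(self, fn: Callable[[T], U]) -> "ToyOutput[U]":
        # Register fn for later and hand back another deferred value right away.
        result: "ToyOutput[U]" = ToyOutput()
        self._callbacks.append(lambda value: result.resolve(fn(value)))
        return result

    def resolve(self, value: T) -> None:
        # Called once the real value exists; only now do the callbacks run.
        for callback in self._callbacks:
            callback(value)


seen: List[str] = []

project_id: ToyOutput[str] = ToyOutput()
bucket_name = project_id.apply(lambda p: p + "-bucket-test")
bucket_name.apply(seen.append)

# The program has finished wiring things up, but no callback has run yet.
print(type(bucket_name).__name__, seen)

# Later, the "engine" learns the real ID and the callbacks fire.
project_id.resolve("my-training-01")
print(seen)
```

The first print runs while seen is still empty and bucket_name is still a ToyOutput; the string only shows up after resolve is called, which in this analogy happens after the program body has already run.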

Instead of bucket = gcp.storage.Bucket(gcp_project_id, name=gcp_project_id + '-bucket-test', location='US', project=gcp_project_id)

You should try using the apply there: bucket = gcp.storage.Bucket(gcp_project_id, name=gcp_project_id.apply(lambda proj: proj + '-bucket-test'), location='US', project=gcp_project_id)

Thank you very much @VenelinMartinov, your insight into apply() really allowed me to understand this.

As you're right, my confusion maybe came from the docs. As a technical trainer/writer, I'll try to propose a PR in the repo docs to explain this behaviour and what to expect from apply().

As per the referred "upstream", I see this is another point for the ongoing discussion about using the GCP provider (Classic) and the native one. Looking forward then to the first native provider release!

Continuing the thread of the conversation, but no longer referring to the original topic of the issue

Oh snap @VenelinMartinov, I'm afraid your suggestion doesn't work. Do you have any insight as to why? You can check my code at gcp_course_infra.py, line 17.

I even tried to use apply for the Pulumi bucket object name, just in case.

Still throws the same error:

    error: Program failed with an unhandled exception:
    Traceback (most recent call last):
      File "/home/marcos/gcp-training-projects/stacks/__main__.py", line 61, in <module>
        importlib.import_module(infra_script)
      File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
      File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 995, in exec_module
      File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
      File "/home/marcos/gcp-training-projects/stacks/gcp_course_infra.py", line 17, in <module>
        bucket = gcp.storage.Bucket(gcp_project_id.apply(lambda proj: str(proj)), name=gcp_project_id.apply(lambda proj: proj + '-bucket-test'), location='US', project=gcp_project_id)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/marcos/gcp-training-projects/stacks/venv/lib/python3.12/site-packages/pulumi_gcp/storage/bucket.py", line 1160, in __init__
        __self__._internal_init(resource_name, *args, **kwargs)
      File "/home/marcos/gcp-training-projects/stacks/venv/lib/python3.12/site-packages/pulumi_gcp/storage/bucket.py", line 1227, in _internal_init
        super(Bucket, __self__).__init__(
      File "/home/marcos/gcp-training-projects/stacks/venv/lib/python3.12/site-packages/pulumi/resource.py", line 1118, in __init__
        Resource.__init__(self, t, name, True, props, opts, False, dependency)
      File "/home/marcos/gcp-training-projects/stacks/venv/lib/python3.12/site-packages/pulumi/resource.py", line 865, in __init__
        raise TypeError("Expected resource name to be a string")
    TypeError: Expected resource name to be a string

@Indavelopers, yeah, my bad - the Pulumi resource name (the first argument to gcp.storage.Bucket) is a plain string type, not an Input, so it does not work with Output types. https://www.pulumi.com/registry/packages/gcp/api-docs/storage/bucket/#name_nodejs

I suspect that's the case for most name parameters as the pulumi engine needs to know the name of the resource ahead of time in order to check for conflicts etc.

In your case you might want to use the prefix from generated_project_ids instead of gcp_project_ids.
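A rough sketch of that idea, assuming generated_project_ids holds plain Python strings that are known before any resource is created (the values below are made up). bucket_names_for is a hypothetical helper; it only builds the deterministic names and creates no resources.

```python
from typing import Dict, List


def bucket_names_for(project_ids: List[str], suffix: str = "-bucket-test") -> Dict[str, str]:
    """Map each plain-string project ID to a deterministic bucket name."""
    names: Dict[str, str] = {}
    for project_id in project_ids:
        name = project_id + suffix
        # Cloud Storage bucket names without dots are limited to 63 characters.
        if len(name) > 63:
            raise ValueError(f"bucket name too long: {name!r}")
        names[project_id] = name
    return names


# Assumed to be plain strings chosen before any resource is created.
generated_project_ids = ["course-a-student-01", "course-a-student-02"]

for project_id, bucket_name in bucket_names_for(generated_project_ids).items():
    print(project_id, "->", bucket_name)
```

Because these names are ordinary strings, they can be used as the Pulumi resource name, while the project argument can still receive the Output from the created project so the dependency is kept.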
