PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

Enable environment variables from GCP secrets in GCP cloud run v2 workers #15406

Closed by Ultramann 1 week ago

Ultramann commented 1 week ago

Describe the current behavior

Environment variables are currently injected into cloud run v2 jobs from cloud run v2 workers by setting the env variable in a cloud run worker's job configuration. The variable is expected to be a simple key-value object, and gets translated into environment variables in a cloud run job request with this method.

https://github.com/PrefectHQ/prefect/blob/8f159b404126d93964a4daace7619bc553fa318c/src/integrations/prefect-gcp/prefect_gcp/workers/cloud_run_v2.py#L189-L195
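To make the translation concrete, here is a minimal sketch of the kind of conversion described above. The function name `env_to_cloud_run` is hypothetical and is not taken from the prefect-gcp source; it only illustrates how a flat key-value mapping becomes the list of name/value entries that a Cloud Run job request expects.

```python
from typing import Dict, List


def env_to_cloud_run(env: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert {"KEY": "value"} into [{"name": "KEY", "value": "value"}].

    Hypothetical helper for illustration only; the linked method in
    prefect-gcp may differ in name and detail.
    """
    return [{"name": name, "value": value} for name, value in env.items()]


plain_env = env_to_cloud_run({"LOG_LEVEL": "DEBUG"})
```

Because every value is emitted under the "value" key, a mapping like this has no way to express a valueSource entry.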

This translation is required by the Cloud Run job API (see its docs). Those same docs specify how a GCP secret can be exposed as an environment variable, by passing something of the form

{
  "name": "ENV_VAR_NAME",
  "valueSource": {
    "secretKeyRef": {
      "secret": "MY_GCP_SECRET",
      "version": "latest"
    }
  }
}

Unfortunately, this API cannot be used because of the way that plain-text environment variables are created in CloudRunWorkerJobV2Configurations.

Describe the proposed behavior

As far as I can tell, the current limitation is in place to keep the API for all Prefect workers' environment variables consistent. As such, I'm guessing it makes sense to keep this part of the API in place and add a new attribute, e.g. secrets or envFromSecrets, along with logic that knows how to create the valueSource kind of environment variable in the GCP Cloud Run job API.

This would enable the job_configuration in a cloud run v2 job worker template to have a new field and corresponding variable, secrets, akin to env. e.g.

{
  "job_configuration": {
    "secrets": "{{ secrets }}"
  },
  "variables": {
    "secrets": {
      "title": "Secrets as Environment Variables",
      "description": "Environment variables to set from GCP secrets when starting a flow run.",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    }
  }
}

Example Use

Deployments would be able to set secrets via job variables with

"secrets": {
  "ENV_VAR": {
    "secret": "SECRET_NAME",
    "version": "latest"
  }
}
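A rough sketch of how the proposed mapping could be translated into Cloud Run's valueSource form is shown here. The function name `secrets_to_cloud_run` is hypothetical, and defaulting a missing version to "latest" is an assumption of this sketch, not something the proposal specifies.

```python
from typing import Any, Dict, List


def secrets_to_cloud_run(secrets: Dict[str, Dict[str, str]]) -> List[Dict[str, Any]]:
    """Convert {"ENV_VAR": {"secret": ..., "version": ...}} into valueSource entries.

    Hypothetical helper for illustration; not the merged implementation.
    """
    entries: List[Dict[str, Any]] = []
    for name, ref in secrets.items():
        entries.append(
            {
                "name": name,
                "valueSource": {
                    "secretKeyRef": {
                        "secret": ref["secret"],
                        # Assumption: fall back to "latest" when no version is given.
                        "version": ref.get("version", "latest"),
                    }
                },
            }
        )
    return entries


secret_env = secrets_to_cloud_run(
    {"ENV_VAR": {"secret": "SECRET_NAME", "version": "latest"}}
)
```

The resulting entries could be appended to the same list that holds the plain name/value variables, so the existing env handling would stay unchanged.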

Additional context

This enhancement request is similar to https://github.com/PrefectHQ/prefect/issues/13058, but that issue seems to have been moved from the now-archived prefect-gcp repo and was for Cloud Run v1.

I understand that Prefect blocks could fulfill a similar role, but my organization's security requirements are to keep secrets in GCP Secret Manager.

I'd be happy to work on this enhancement.

desertaxle commented 1 week ago

Thanks for the great writeup @Ultramann! I like envFromSecrets as a new key for specifying GCP secrets for the Cloud Run job. Since you said you'd like to work on this enhancement, would you want me to assign this issue to you?

Ultramann commented 1 week ago

@desertaxle thanks for the quick response! This is definitely a high-priority item for us, so if it would be faster for me to do the work, please assign the issue to me and I can start on it immediately.

FWIW, this would be my first Prefect contribution, so if we go that route I'd appreciate some pointers on what to do to get the PR merged quickly. I read the contribution docs, but they don't seem to cover things like testing expectations. If there is such a resource or set of standards, let me know and I will follow it.

One more thing on my mind. As far as I can tell, prefect still installs the published prefect-gcp package to fulfill the gcp extra, and does not yet use the source in the integrations directory. If this is correct, do you have recommendations about how and where to get this change into a version of prefect we can use ASAP?

Ultramann commented 1 week ago

@desertaxle, I wanted to confirm that we have the fix to this issue deployed and working in production. Thanks for the quick feedback and code review!