kestra-io / plugin-gcp

Apache License 2.0

Add a GCP Dataform subplugin #458

Open anna-geller opened 1 year ago

anna-geller commented 1 year ago

Problem

We already support the open-source edition of Dataform: https://github.com/kestra-io/plugin-dataform

However, our users have requested the ability to trigger Dataform jobs running on the managed GCP Dataform service: https://cloud.google.com/dataform?hl=en

API

The OSS version was implemented as a Node.js CLI plugin. However, the GCP-specific plugin will likely only need to talk to the GCP Dataform service via the REST API: https://cloud.google.com/dataform/reference/rest

Specifically, creating a workflow invocation seems like the right endpoint: https://cloud.google.com/dataform/reference/rest#rest-resource:-v1beta1.projects.locations.repositories.workflowinvocations
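To make the endpoint concrete, here is a minimal sketch of how the create request could be assembled. It assumes the `v1beta1` REST layout (`POST .../repositories/*/workflowInvocations`) and a request body that references an existing compilation result through a `compilationResult` field; both should be checked against the REST reference. `Repository` and `build_invocation_request` are hypothetical names used only for illustration, and authentication and the actual HTTP call are left out.

```python
from dataclasses import dataclass

# Assumed base URL for the Dataform REST API (v1beta1).
DATAFORM_API = "https://dataform.googleapis.com/v1beta1"


@dataclass(frozen=True)
class Repository:
    """Hypothetical helper identifying a Dataform repository on GCP."""

    project: str
    location: str
    name: str

    @property
    def path(self) -> str:
        # Resource name in the form expected by the `parent` path parameter.
        return (
            f"projects/{self.project}"
            f"/locations/{self.location}"
            f"/repositories/{self.name}"
        )


def build_invocation_request(repo: Repository, compilation_result: str) -> tuple[str, dict]:
    """Return (url, json_body) for a POST that creates a workflow invocation.

    `compilation_result` is the full resource name of an existing
    compilation result (assumption: the body accepts it as `compilationResult`).
    """
    url = f"{DATAFORM_API}/{repo.path}/workflowInvocations"
    body = {"compilationResult": compilation_result}
    return url, body
```

A task implementation would send this body with an OAuth2 bearer token and read the invocation's resource name from the response in order to poll it later.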

Possible syntax

id: dataform
namespace: dev
tasks:
    - id: transform
      type: io.kestra.plugin.gcp.dataform.InvokeWorkflow
      wait: true # wait for results by default so that if the job fails, this task fails as well
      # other properties from this request body https://cloud.google.com/dataform/reference/rest/v1beta1/projects.locations.repositories.workflowInvocations#WorkflowInvocation

Ideally, we should combine this with the list/get/query endpoints so that the task can poll for the workflow invocation's result when `wait: true` is set.
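The polling behaviour behind `wait: true` could look roughly like the sketch below. It assumes the invocation's `state` field eventually reaches one of `SUCCEEDED`, `FAILED`, or `CANCELLED` (state names as I understand them from the REST reference; please verify). `fetch_state` is a hypothetical callable that would wrap the GET endpoint and return the current state string; `WorkflowInvocationFailed` and the default interval and timeout are likewise illustrative.

```python
import time
from typing import Callable

# Assumed terminal values of WorkflowInvocation.state.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


class WorkflowInvocationFailed(RuntimeError):
    """Raised when the invocation ends in a non-successful terminal state."""


def wait_for_invocation(
    fetch_state: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the invocation reaches a terminal state.

    Returns "SUCCEEDED" on success, raises WorkflowInvocationFailed on
    FAILED or CANCELLED, and raises TimeoutError if no terminal state is
    seen before the deadline.
    """
    deadline = clock() + timeout
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            if state != "SUCCEEDED":
                # Propagate the failure so the calling task fails as well.
                raise WorkflowInvocationFailed(f"invocation ended in state {state}")
            return state
        if clock() >= deadline:
            raise TimeoutError(f"invocation still in state {state} after {timeout}s")
        sleep(poll_interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; with `wait: false` the task would simply skip this step and return the invocation name.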

drelum commented 1 year ago

Support for GCP Dataform service would be very useful.

anna-geller commented 1 year ago

Done for now; we'll keep the issue open only to add the GCP implementation.

Ben8t commented 1 week ago

FYI, moving this one up in the prioritization as the issue got several upvotes 👍 Will discuss with Mat whether it's better to move it to the gcp or dataform repo.

anna-geller commented 1 week ago

Yup, totally. It also seems like a fairly quick one.