aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

step-functions: AWS Glue Workflows #29001

Open Rizxcviii opened 8 months ago

Rizxcviii commented 8 months ago

Describe the feature

Currently we can use CallAwsService; however, it would be nice to also have a dedicated construct for starting Glue workflow runs.
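
For reference, the CallAwsService workaround amounts to a Step Functions AWS SDK integration task. Below is a minimal sketch of the Amazon States Language state such a task would roughly render to, written as plain TypeScript so it runs without the CDK. The helper name `glueStartWorkflowRunState` and the workflow name are hypothetical, and the state the CDK actually synthesizes may differ in detail (for example in how the partition in the ARN is rendered).

```typescript
// Shape of an Amazon States Language Task state (only the fields used here).
interface TaskState {
  Type: 'Task';
  Resource: string;
  Parameters: Record<string, unknown>;
  End: boolean;
}

// Hypothetical helper: builds the state for an AWS SDK integration call to
// Glue's StartWorkflowRun API, which takes the workflow name as `Name`.
function glueStartWorkflowRunState(workflowName: string): TaskState {
  return {
    Type: 'Task',
    // AWS SDK integrations use arn:aws:states:::aws-sdk:<service>:<apiAction>
    Resource: 'arn:aws:states:::aws-sdk:glue:startWorkflowRun',
    Parameters: { Name: workflowName },
    End: true,
  };
}

console.log(JSON.stringify(glueStartWorkflowRunState('my-glue-workflow'), null, 2));
```

In CDK terms this would correspond to a CallAwsService task with service 'glue', action 'startWorkflowRun', parameters containing Name, and suitable iamResources; a dedicated construct would mainly hide those details.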

Use Case

If we want to combine Glue workflows with Step Functions, this seems to make sense. Glue workflows can easily be reproduced as Step Functions state machines; however, some developers may prefer to keep them, using Step Functions to manage the overall data pipeline and Glue workflows to handle the smaller ETL pipelines.

My specific use case is to trigger a large state machine that will handle multiple smaller Glue workflows, reducing cost whilst still maintaining the original data pipeline.

Proposed Solution

Similar to the existing Glue task (GlueStartJobRun), we could add another construct, GlueStartWorkflowRun.

// Proposed API; GlueStartWorkflowRun does not exist yet.
new tasks.GlueStartWorkflowRun(this, 'Task', {
  glueWorkflowName: 'my-glue-job',
  // ...other parameters
});

Other Information

A quick query, more than anything: how are synchronous jobs triggered? For example, Glue StartJobRun includes a .sync option, but how is that reproduced within the CDK? I ask because I need to know whether to include that functionality in a possible PR, if this were to be worked on.
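
As a sketch of how the .sync option is usually reached: existing CDK task constructs accept an `integrationPattern` property (`sfn.IntegrationPattern.RUN_JOB` for synchronous runs) and translate it into a suffix on the task's resource ARN. The code below mimics that mapping in plain TypeScript; the enum and the function are local stand-ins written for illustration, not the CDK's own code, and the hard-coded `aws` partition is an assumption.

```typescript
// Local stand-in for the integration patterns exposed by the CDK.
enum IntegrationPattern {
  REQUEST_RESPONSE = 'REQUEST_RESPONSE',
  RUN_JOB = 'RUN_JOB',
  WAIT_FOR_TASK_TOKEN = 'WAIT_FOR_TASK_TOKEN',
}

// Suffix appended to the Step Functions resource ARN for each pattern.
const resourceArnSuffix: Record<IntegrationPattern, string> = {
  [IntegrationPattern.REQUEST_RESPONSE]: '',
  [IntegrationPattern.RUN_JOB]: '.sync',
  [IntegrationPattern.WAIT_FOR_TASK_TOKEN]: '.waitForTaskToken',
};

// Illustrative helper: builds the resource ARN for a service integration.
function integrationResourceArn(
  service: string,
  api: string,
  pattern: IntegrationPattern,
): string {
  return `arn:aws:states:::${service}:${api}${resourceArnSuffix[pattern]}`;
}

console.log(integrationResourceArn('glue', 'startJobRun', IntegrationPattern.RUN_JOB));
// prints "arn:aws:states:::glue:startJobRun.sync"
```

One caveat worth checking before a PR: Step Functions AWS SDK integrations support the request-response and task-token patterns but not .sync, and Glue workflows do not appear to have an optimized integration, so a GlueStartWorkflowRun task may only be able to offer request-response unless it adds its own polling.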

CDK version used

2.126.0

Environment details (OS name and version, etc.)

Windows 11

pahud commented 8 months ago

Yes this would be awesome! We welcome and appreciate any PR for that.