aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

[aws-stepfunctions-tasks] Add support for SageMaker Processing #9537

Open tuliocasagrande opened 3 years ago

tuliocasagrande commented 3 years ago

SageMaker and Step Functions released a new integration to create processing jobs directly in the state machine:

Use Case

SageMaker Processing is a very flexible API that can run preprocessing and post-processing steps, so it can be used to fully automate an ML use case with Step Functions.

Proposed Solution

Use https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-stepfunctions-tasks/lib/sagemaker/create-training-job.ts as a starting point.

Other


This is a :rocket: Feature Request

heatsink commented 3 years ago

To make sure that effort isn't duplicated:

ihorfito commented 3 years ago

Hello, any updates here?

AustinGomez commented 2 years ago

Hi! Any updates here? :)

BenChaimberg commented 2 years ago

The furthest-along PR that would close this issue is #14633, but I am of the opinion that a design overhaul is needed. @kaizen3031593 can take another look to see if he agrees or if the PR can be progressed in its current state.

spssmn-aws commented 2 years ago

@kaizen3031593 is there any further update on this PR, please? I have a customer who needs to use this functionality in CDK.

kaizencc commented 2 years ago

@spssmn-aws this is not in my immediate roadmap, but I would be happy to field community contributions on this. It looks like the PR #14633 is pretty stale at this point and didn't really get super far along in the API design process. Since this is a fairly complicated ask, the first step would be to iterate over the API design a few times before we get into the weeds of the implementation.

For anyone who needs this task (or any other task that doesn't have native stepfunctions-task support), you can create a custom state to do what you need to do.
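As a rough illustration of the custom-state route, the sketch below builds the Amazon States Language JSON for a SageMaker processing job. The helper `buildProcessingJobState`, its option names, and every ARN, image URI, and instance value are hypothetical placeholders, not part of CDK. The resource ARN and parameter names follow the Step Functions SageMaker integration and the `CreateProcessingJob` API as I understand them, so verify them against the current documentation before relying on this.

```typescript
// Hypothetical helper (not part of CDK): builds the state JSON for a
// SageMaker CreateProcessingJob task. All values passed in are placeholders.
interface ProcessingJobStateOptions {
  roleArn: string;        // IAM role that SageMaker assumes for the job
  imageUri: string;       // container image that runs the processing code
  instanceType: string;   // for example "ml.m5.xlarge"
  instanceCount: number;
  volumeSizeInGB: number;
}

interface TaskStateJson {
  Type: "Task";
  Resource: string;
  Parameters: Record<string, unknown>;
}

function buildProcessingJobState(opts: ProcessingJobStateOptions): TaskStateJson {
  return {
    Type: "Task",
    // The ".sync" suffix asks Step Functions to wait for the job to finish.
    Resource: "arn:aws:states:::sagemaker:createProcessingJob.sync",
    Parameters: {
      // A key ending in ".$" is resolved from the state input at run time.
      "ProcessingJobName.$": "$.JobName",
      RoleArn: opts.roleArn,
      AppSpecification: { ImageUri: opts.imageUri },
      ProcessingResources: {
        ClusterConfig: {
          InstanceCount: opts.instanceCount,
          InstanceType: opts.instanceType,
          VolumeSizeInGB: opts.volumeSizeInGB,
        },
      },
    },
    // No "Next" or "End" here: CDK renders the transition when states are chained.
  };
}

const stateJson = buildProcessingJobState({
  roleArn: "arn:aws:iam::123456789012:role/ExampleProcessingRole", // placeholder
  imageUri: "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest", // placeholder
  instanceType: "ml.m5.xlarge",
  instanceCount: 1,
  volumeSizeInGB: 30,
});

// In a CDK stack (assumes aws-cdk-lib is installed and `this` is a Construct):
//   import * as sfn from "aws-cdk-lib/aws-stepfunctions";
//   const processing = new sfn.CustomState(this, "ProcessingJob", { stateJson });
console.log(JSON.stringify(stateJson, null, 2));
```

As far as I know, a custom state does not grant IAM permissions on your behalf, so the state machine role would likely need `sagemaker:CreateProcessingJob` and `iam:PassRole` added separately.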

As always, +1s can help me change my mind :).

ighosh98 commented 1 year ago

@kaizencc is there a reason why create-training-job extends sfn.TaskStateBase instead of using the custom-state approach?

I need to implement the CreateProcessingJob API for my team's requirements. Was wondering if there is a reason why one should prefer using sfn.TaskStateBase over customState?

kaizencc commented 1 year ago

@ighosh98 you can think of a custom state as a lower level API. If you want to just supply the properties in pseudo-json, custom state works well for you. If you want to create a new stepfunction task, then you extend sfn.TaskStateBase and build off of that.
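To make the distinction concrete, here is a dependency-free sketch of the kind of typed layer a dedicated task adds on top of raw state JSON: typed props, defaults, and validation, rendered into `Resource` and `Parameters`. Everything in it (`ProcessingJobProps`, `ProcessingJobTaskSketch`, the default values) is hypothetical and only imitates the pattern. A real task would extend `sfn.TaskStateBase` and would also declare its IAM policies and metrics, which this sketch leaves out.

```typescript
// Hypothetical props: a typed surface instead of hand-written pseudo-JSON.
interface ProcessingJobProps {
  jobName: string;
  roleArn: string;
  imageUri: string;
  instanceType?: string;  // illustrative default applied below
  instanceCount?: number; // illustrative default applied below
}

// Hypothetical class standing in for a TaskStateBase subclass.
class ProcessingJobTaskSketch {
  private readonly props: ProcessingJobProps;

  constructor(props: ProcessingJobProps) {
    // Validation happens at synthesis time, before anything is deployed.
    if (props.instanceCount !== undefined && props.instanceCount < 1) {
      throw new Error("instanceCount must be at least 1");
    }
    this.props = props;
  }

  // Plays the role of the render step: typed props in, task JSON out.
  render(): { Resource: string; Parameters: Record<string, unknown> } {
    return {
      Resource: "arn:aws:states:::sagemaker:createProcessingJob.sync",
      Parameters: {
        ProcessingJobName: this.props.jobName,
        RoleArn: this.props.roleArn,
        AppSpecification: { ImageUri: this.props.imageUri },
        ProcessingResources: {
          ClusterConfig: {
            InstanceCount: this.props.instanceCount ?? 1,
            InstanceType: this.props.instanceType ?? "ml.m5.xlarge",
            VolumeSizeInGB: 30, // illustrative fixed value
          },
        },
      },
    };
  }
}

const rendered = new ProcessingJobTaskSketch({
  jobName: "example-job", // placeholder
  roleArn: "arn:aws:iam::123456789012:role/ExampleProcessingRole", // placeholder
  imageUri: "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest", // placeholder
}).render();

console.log(JSON.stringify(rendered, null, 2));
```

With a custom state the caller writes the `Parameters` object directly. With a dedicated task the caller only sees the typed props, and mistakes such as a zero instance count surface before deployment.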

athewsey commented 1 year ago

Bumping - any updates? Our team would benefit from this too

github-actions[bot] commented 2 weeks ago

This issue has received a significant amount of attention so we are automatically upgrading its priority. A member of the community will see the re-prioritization and provide an update on the issue.