aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0
11.72k stars 3.94k forks

aws-stepfunctions-tasks: workerType not dynamic from payload #32359

Open ElearG opened 1 day ago

ElearG commented 1 day ago

Describe the bug

I want to implement a state machine that invokes a Glue job synchronously, with the worker type and the number of workers passed as parameters from the input (payload) at invocation time. I have verified that this can be done in the console. It cannot be deployed through CDK, however, because the value of "WorkerType" must be of the WorkerType type, which does not allow it to be set dynamically from the input. Screenshots of the console implementation and of the CDK attempt are attached.

CDK Repository Implementation image

image

- But here the error:

image

image

Regression Issue

Last Known Working CDK Version

No response

Expected Behavior

In CDK, it should be possible to set workerType dynamically from the state machine's payload input, in other words, to achieve what the image shows but with CDK. The "worker_type" argument currently accepts only the WorkerType enum, which is limited to the predefined worker types. It should also accept strings, so that "_sfn.JsonPath.string_at('$.glue_jobs_configs.executor_type')" can be passed in that argument.

image

Current Behavior

Currently, I cannot set the worker type dynamically from the payload input in the state machine, because the argument only accepts the WorkerType enum, which is limited to the predefined worker types. If I pass a string that references the payload input, it raises an error.

image image

Reproduction Steps

In the Step Functions task GlueStartJobRun, set the worker_type argument to _sfn.JsonPath.string_at('$.glue_jobs_configs.executor_type'):

worker_type=_sfn.JsonPath.string_at('$.glue_jobs_configs.executor_type')

image

Possible Solution

The worker_type argument should also accept strings, rather than validating that the value is a WorkerType.

Additional Information/Context

No response

CDK CLI Version

2.171.1

Framework Version

No response

Node.js Version

20.11.1

OS

Windows 11

Language

Python

Language Version

3.11

Other information

No response

ashishdhingra commented 20 hours ago

The code comment for the WorkerType enum from aws-cdk-lib.aws_stepfunctions_tasks.WorkerConfigurationProperty states: "If you need to use a WorkerType that doesn't exist as a static member, you can instantiate a WorkerType object, e.g: WorkerType.of('other type')." However, this is incorrect, since WorkerType is an enum, not a class. This might not be an oversight, since per AWS Glue > Job runs, WorkerType accepts a limited set of values. It might be a copy/paste error, with the excerpt taken from the @aws-cdk/aws-glue-alpha > WorkerType class, which might be defined to support dynamic values (it's an experimental module).

Changing the type of the workerType property in the WorkerConfigurationProperty interface would be a breaking change, and I am unsure whether this should be done. Also, this property is used here in the call to sfn.FieldUtils.renderObject(), which renders the Parameters definition that is output in DefinitionString for the AWS::StepFunctions::StateMachine resource in the generated CFN template, as something like - :states:::glue:startJobRun.sync","Parameters":{"JobName":"Test","WorkerType":"G.025X","NumberOfWorkers":3}}}}. There might not be a way to use an escape hatch here. Perhaps introducing another property named workerTypeV2, which uses a newly defined WorkerType class, and deprecating the old workerType property would be a feasible solution.
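To make the rendering difference concrete, here is a minimal sketch in plain Python (no CDK imports) that builds the Task state for glue:startJobRun.sync both ways. It relies on the Amazon States Language convention that a parameter key ending in ".$" is resolved from the state input at run time. The helper name glue_start_job_run_state, the "aws" partition in the ARN, and the $.glue_jobs_configs.number_of_workers path are assumptions made for illustration; only $.glue_jobs_configs.executor_type comes from the report above.

```python
import json


def glue_start_job_run_state(job_name, worker_type, number_of_workers):
    """Build an ASL Task state dict for glue:startJobRun.sync (hypothetical helper).

    Values that are strings starting with "$" are treated as JSONPath
    references and emitted under a key with the ".$" suffix, which is how
    Step Functions marks a parameter as taken from the state input.
    """
    params = {"JobName": job_name}
    for key, value in (("WorkerType", worker_type),
                       ("NumberOfWorkers", number_of_workers)):
        if isinstance(value, str) and value.startswith("$"):
            params[key + ".$"] = value   # dynamic: resolved at run time
        else:
            params[key] = value          # static: literal in the template
    return {
        "Type": "Task",
        # Assumes the standard "aws" partition.
        "Resource": "arn:aws:states:::glue:startJobRun.sync",
        "Parameters": params,
        "End": True,
    }


# Static values, matching the shape of the rendered output quoted above.
static_state = glue_start_job_run_state("Test", "G.025X", 3)

# Dynamic values read from the payload; the number_of_workers path is assumed.
dynamic_state = glue_start_job_run_state(
    "Test",
    "$.glue_jobs_configs.executor_type",
    "$.glue_jobs_configs.number_of_workers",
)

print(json.dumps(dynamic_state["Parameters"], indent=2))
```

As a possible interim workaround, a dict of this shape could be passed as state_json to sfn.CustomState from aws-cdk-lib.aws_stepfunctions. That route is untested here, and the IAM permissions that GlueStartJobRun normally adds to the state machine role would likely have to be granted by hand.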

Needs review with the team.

ElearG commented 10 hours ago

Thanks for the reply. So I understand that a change must be made at the CDK level to make the workerType dynamic, right?

Is there another way to perform this operation?

I tried CallAwsService with the Glue service, but I ran into a problem with integration_pattern: it does not accept the synchronous pattern (_sfn.IntegrationPattern.RUN_JOB).

Thanks for the support!
