aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

batch: Cannot omit the start/end of target node on batch.MultiNodeContainer #29415

Open jalencato opened 8 months ago

jalencato commented 8 months ago

Describe the feature

When using CDK to deploy AWS Batch multi-node jobs, we have to specify both start_node and end_node; see https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_batch/MultiNodeContainer.html#aws_cdk.aws_batch.MultiNodeContainer. However, according to https://docs.aws.amazon.com/batch/latest/APIReference/API_NodeRangeProperty.html#API_NodeRangeProperty_Contents, start_node and end_node can be omitted.

Use Case

Our use case is as follows: after deploying the AWS Batch infrastructure, I use boto3 in Python to submit a job like this:

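import boto3

# Create the Batch client (region and credentials come from the environment)
batch_client = boto3.client("batch")

# job_name, job_queue, multi_job_definition, job_parameters and
# overridden_num_nodes are set elsewhere in our code
response = batch_client.submit_job(
    jobName=job_name,
    jobQueue=job_queue,
    jobDefinition=multi_job_definition,
    parameters=job_parameters,
    # Override the number of nodes defined in the job definition
    nodeOverrides={
        "numNodes": overridden_num_nodes,
    },
)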

Currently this throws an error like:

botocore.errorfactory.ClientException: An error occurred (ClientException) when calling the SubmitJob operation: NumNodes override can only be applied if the job definition has at least 1 target node without a range_end i.e (:) or (range_start:).

This is because CDK requires both start_node and end_node. If CDK allowed end_node to be omitted, we could avoid this problem, and an open-ended range is valid according to the AWS Batch multi-node job definition. Currently the only workaround is to create another job definition based on the deployed one and modify the target nodes in the Batch console.
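
As a rough illustration of the open-ended range described above, the same kind of job definition could presumably be registered with boto3 instead of being edited in the console. This is only a sketch: the function names, the image string, and the vCPU/memory values are hypothetical placeholders, not values from our deployment. The point is the targetNodes value "0:", which has a range_start but no range_end.

```python
def build_multinode_job_definition(name, image, num_nodes, start_node=0):
    """Build keyword arguments for batch_client.register_job_definition.

    The helper name and the resource values below are illustrative
    assumptions, not taken from the deployed stack.
    """
    return {
        "jobDefinitionName": name,
        "type": "multinode",
        "nodeProperties": {
            "numNodes": num_nodes,
            "mainNode": 0,
            "nodeRangeProperties": [
                {
                    # "0:" has no range_end, which is what the SubmitJob
                    # numNodes override asks for in the error message.
                    "targetNodes": f"{start_node}:",
                    "container": {
                        "image": image,
                        "resourceRequirements": [
                            {"type": "VCPU", "value": "1"},
                            {"type": "MEMORY", "value": "2048"},
                        ],
                    },
                }
            ],
        },
    }


def register_open_ended_definition(name, image, num_nodes):
    """Register the definition; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above has no AWS dependency

    batch_client = boto3.client("batch")
    kwargs = build_multinode_job_definition(name, image, num_nodes)
    return batch_client.register_job_definition(**kwargs)
```

A job definition registered this way lives outside the CDK stack, so it shares the drawback of the console workaround: it has to be kept in sync with the deployed definition by hand.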

Proposed Solution

No response

Other Information

No response

CDK version used

2.131.0

Environment details (OS name and version, etc.)

Amazon Linux 2

pahud commented 8 months ago

https://docs.aws.amazon.com/batch/latest/APIReference/API_NodeRangeProperty.html#API_NodeRangeProperty_Contents, it is possible to omit start_node/end_node here.

Yes, it looks like the ending node index can be omitted. I guess we need a PR to get it fixed.

jalencato commented 5 months ago

Hi Team, any updates on the issue? I see the PR has been open for several weeks.

shikha372 commented 5 months ago

Hey @jalencato, we are currently looking into how to fix this issue without introducing a breaking change for other users.

thvasilo commented 4 months ago

Hi @shikha372, is there a suggested workaround currently? I guess using a Cfn construct might work?
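
For reference, a minimal sketch of what such a Cfn-level workaround might look like, using the CDK escape hatch (node.default_child plus add_property_override) on the synthesized AWS::Batch::JobDefinition. This is untested against the batch module and rests on assumptions: the helper name is hypothetical, and it assumes the L2 job definition exposes its CfnJobDefinition as the default child and that the range to patch is at index 0.

```python
def open_target_node_range(job_definition, range_index=0, start_node=0):
    """Rewrite one TargetNodes entry to an open-ended range such as "0:".

    job_definition is assumed to be a CDK construct (for example a
    multi-node job definition) whose default child is the underlying
    CfnJobDefinition resource.
    """
    cfn_job_definition = job_definition.node.default_child
    # Property path inside AWS::Batch::JobDefinition; the numeric segment
    # selects the entry in the NodeRangeProperties list.
    path = f"NodeProperties.NodeRangeProperties.{range_index}.TargetNodes"
    cfn_job_definition.add_property_override(path, f"{start_node}:")
    return path
```

In a stack this would be called once after the job definition is created, for example open_target_node_range(multi_node_job_def), and the result checked with cdk synth before deploying.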