aws / aws-sdk

Landing page for the AWS SDKs on GitHub
https://aws.amazon.com/tools/

EMR keepJobFlowAliveWhenNoSteps incorrectly defaults to false #605

Closed gkiel closed 9 months ago

gkiel commented 10 months ago

Describe the bug

When creating an EMR cluster via RunJobFlowRequest with an auto termination policy, but without setting keepJobFlowAliveWhenNoSteps in the JobFlowInstancesConfig, the cluster terminates as soon as all steps have completed instead of waiting for the idle timeout specified in the auto termination policy.

Expected Behavior

I expected the EMR cluster to respect the auto termination policy by default since the documentation says that the default for keepJobFlowAliveWhenNoSteps is true: [https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/emr/model/JobFlowInstancesConfig.html#keepJobFlowAliveWhenNoSteps()](https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/emr/model/JobFlowInstancesConfig.html#keepJobFlowAliveWhenNoSteps())

Current Behavior

The EMR cluster shut down as soon as all steps completed, without waiting for the idle timeout.

Reproduction Steps

EmrClient emrClient = EmrClient.builder().region(Region.US_WEST_2).build();
emrClient.runJobFlow(RunJobFlowRequest.builder()
    .autoTerminationPolicy(AutoTerminationPolicy.builder()
        .idleTimeout(15L * 60)
        .build())
    .instances(JobFlowInstancesConfig.builder()
        // ec2KeyName, ec2SubnetId, emrManagedMasterSecurityGroup, emrManagedSlaveSecurityGroup, serviceAccessSecurityGroup omitted
        .instanceGroups(List.of(
            InstanceGroupConfig.builder()
                .market(MarketType.ON_DEMAND)
                .instanceRole(InstanceRoleType.MASTER)
                .instanceType("r6g.2xlarge")
                .instanceCount(1)
                .build(),
            InstanceGroupConfig.builder()
                .market(MarketType.ON_DEMAND)
                .instanceRole(InstanceRoleType.CORE)
                .instanceType("r6g.8xlarge")
                .instanceCount(1)
                .build())))
    .jobFlowRole("EMR_EC2_DefaultRole")
    .name("Some name")
    .releaseLabel("emr-6.8.0")
    .serviceRole("EMR_DefaultRole")
    .build());

Possible Solution

No response

Additional Information/Context

Adding .keepJobFlowAliveWhenNoSteps(true) to the JobFlowInstancesConfig in the above snippet makes the EMR cluster respect the auto termination policy.

AWS Java SDK version used

2.20.140

JDK version used

OpenJDK Runtime Environment Temurin-17.0.4.1+1 (build 17.0.4.1+1)

Operating System and version

macOS Ventura 13.5

debora-ito commented 9 months ago

@gkiel this looks like an issue on the EMR API. I'll reach out to the EMR team and ask for clarification.

In case they ask for a requestID for analysis, can you provide a requestID of a runJobFlow call?

debora-ito commented 9 months ago

@gkiel The EMR team replied that the default for keepJobFlowAliveWhenNoSteps is not true; the documentation is incorrect. They'll fix the documentation.
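
For anyone landing here later, a rough sketch of the behavior described in this thread. This is not SDK or EMR code: the class, method names, and constant are hypothetical, and it simply assumes (per the EMR team's reply) that a request which never sets keepJobFlowAliveWhenNoSteps is treated as false.

```java
public class KeepAliveDefaultSketch {

    // Assumption taken from this thread: the service-side default is false,
    // not true as the documentation stated at the time.
    public static final boolean SERVICE_DEFAULT_KEEP_ALIVE = false;

    // Resolves the value the cluster ends up using.
    // A null argument models a request that never called keepJobFlowAliveWhenNoSteps(...).
    public static boolean effectiveKeepAlive(Boolean requested) {
        return requested != null ? requested : SERVICE_DEFAULT_KEEP_ALIVE;
    }

    // Describes what the thread reports happening once the last step finishes.
    public static String afterLastStep(Boolean requestedKeepAlive, long idleTimeoutSeconds) {
        if (effectiveKeepAlive(requestedKeepAlive)) {
            return "stays up; auto termination policy applies after "
                    + idleTimeoutSeconds + "s idle";
        }
        return "terminates once all steps have completed";
    }

    public static void main(String[] args) {
        long idleTimeout = 15L * 60; // same timeout as the reproduction snippet
        System.out.println("unset -> " + afterLastStep(null, idleTimeout));
        System.out.println("true  -> " + afterLastStep(true, idleTimeout));
    }
}
```

In practice this means setting .keepJobFlowAliveWhenNoSteps(true) explicitly on the JobFlowInstancesConfig whenever an auto termination policy is expected to take effect, rather than relying on the default.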

gkiel commented 9 months ago

Sounds good, thanks for the update!

debora-ito commented 9 months ago

I'll move this to the cross-SDK repository to increase visibility to the other SDKs.

I'll also mark this to auto close soon, since there's no action pending from the SDK team, but let us know if you have any other questions.

P99705632