Open mb-dev opened 4 years ago
since RunEcsEc2Task
and RunEcsFargateTask
are going to be marked as @deprecated
, this feature should be introduced in RunTask
@mb-dev PR incoming once I merge in #8451 to make launchType
optional. Step Functions does not support specifying capacityProviderStrategy
as a parameter.
As per documentation you've referenced
If a capacityProviderStrategy is specified, the launchType parameter must be omitted. If no capacityProviderStrategy or launchType is specified, the defaultCapacityProviderStrategy for the cluster is used.
Although this unlocks some usage, it's still incomplete until users can supply a capacity provider strategy. Since we're limited I'm inclined to agree with your suggested stop-gap. It will still be compatible when capacityProviderStrategy
can be supplied.
edit: added following question:
What happens if a cluster does not have a default capacity provider strategy? I believe that's optional too?
per @mb-dev in comment
@shivlaks what alarmed me is that the PR makes the definition depend more on launch type than before. launchTarget sounds like launchType, and it's tied to either EC2/Fargate. I am not sure if it will be clear how to express a undefined launchType while still specifying parameters to be used at creation.
Currently no launchType might look like:
const runTask = new tasks.EcsRunTask(stack, 'Run', { integrationPattern: sfn.IntegrationPattern.RUN_JOB, cluster, taskDefinition, launchTarget: new tasks.EcsEc2LaunchTarget({ launchType: undefined, }), });
If launchTarget would be named something like clusterConfiguration it might be easier to parse.
@mb-dev launchType
was never really exposed as a property (with the old classes or with the new implementation)
The thinking was to associate launchTarget
closely to launchType
.
I was thinking of introducing something like IEcsCapacityProvider
to add the capability to specify capacity providers. Users would specify either capacity providers or a launch target (which would become optional).
const runTask = new tasks.EcsRunTask(stack, 'Run', {
integrationPattern: sfn.IntegrationPattern.RUN_JOB,
cluster,
taskDefinition,
capacityProviders: [new tasks.EcsEc2CapacityProvider({
...
})],
});
and adding types for EcsFargateCapacityProvider
and EcsEc2CapacityProvider
classes to help populate that list of capacity providers.
what do you think?
@shivlaks thanks for moving the discussion here. The capacity providers are a property of ECS cluster, while the task level property is choosing one of the providers specified in the cluster definition. Given that users might create ECS cluster or capacity providers outside of CDK, and that there's an option for default capacity provider (as this stop gap suggests)
there are actually 3 scenarios for using runTask in step functions: 1) users specifying launchType 2) users specifying capacityProvider 3) users specifying neither capacity provider or launchtype - in this case the default capacity provider is used. See: https://docs.aws.amazon.com/AmazonECS/latest/userguide/cluster-capacity-providers.html
Default capacity provider strategy
A default capacity provider strategy is associated with each Amazon ECS cluster. This determines the capacity provider strategy the cluster will use if no other capacity provider strategy or launch type is specified when running a task or creating a service.
One idea is to expose DEFAULT_CAPACITY_PROVIDER
constant that can be assigned to launchType
or launchTarget
that will make it clear we want default capacity provider
Capacity Providers were added to Cloudformation: https://github.com/aws/aws-cdk/commit/4ce27f4195c70bd9e365ec0e0df5c0ede863bc8a#diff-4a4e15a081904ee16b4d84e2f5cf5aee
So now it's even more important to be able to omit launchType.
@mb-dev
The recommended workaround for coverage that isn't quite supported in the aws-stepfunctions-tasks
module is to supply the needed ASL as a custom state. You can create your task, and supply the .toStateJson()
as input (you'd need to adjust for any parameters you want to add/remove/modify)
We're still discussing options, but are not convinced that the stop-gap needs to be introduced directly into the task or folded into the EcsEc2LaunchTarget
/ EcsFargateLaunchTarget
classes yet.
launchtarget
as a whole should likely be something we can omit. Can you help me better understand the EC2 / Fargate specific parameters that would need to be supplied (with the omission of launch target). I don't mind the constant idea but will want to better understand what the ASL looks like for these use cases.
Ooo, i totally missed the escape hatch in the changelog. I will try that tomorrow and maybe that will be enough until full capacity providers support is added.
My pre-refactor EC2 integration is straightforward (in Python):
sft.RunEcsEc2Task(
cluster=cluster,
task_definition=task_definition,
integration_pattern=sfn.ServiceIntegrationPattern.SYNC,
container_overrides={...},
)
So I personally won't need launchTarget
, just not launchType
.
@mb-dev let me know if I can help with checking out the implementation of the custom state or if it would help for me to whip up an example!
Hello!
I have been using the CustomState
to avoid this problem but now I am facing the problem of losing ResultPath
during the "translation" (cf. this issue #8754).
Not really sure there is an easy way to avoid the problem (I can wrap my step in a parallel step) but annoying...
is there any progress here? what does the roadmap look like? One year later, I still can't get the task to run on the correct capacity provider.
is there any progress here? what does the roadmap look like? One year later, I still can't get the task to run on the correct capacity provider.
You can use FARGATE_SPOT with Step Functions, as shown in my tests. You just need to omit the LaunchType
field in your state machine definition.
Remove the field "LaunchType": "FARGATE"
from the parameters.
{
"Comment": "State machine integrated with ECS",
"StartAt": "Initial",
"States": {
"Initial": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"Cluster": "${ECS_CLUSTER_ARN}",
"PlatformVersion": "LATEST",
"TaskDefinition": "${ECS_TASK_DEFINITION_ARN}",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets": ${SUBNETS},
"AssignPublicIp": "ENABLED"
}
},
"Overrides": {
"ContainerOverrides": ${ECS_CONTAINER_OVERRIDES}
}
},
"TimeoutSeconds": 3600,
"End": true
}
}
}
Just bumped into this as well. My use case is spawning long-running tasks on (expensive) GPU instances that are scaled to zero most of the time. When the StepFunction workflow is at EcsRunTask
, the expectation is that the cluster spins up the GPU instance to do the work. But that only works if the default capacity provider is used; currently, the execution always fails with
No Container Instances were found in your cluster. (Service: AmazonECS; Status Code: 400; Error Code: InvalidParameterException; Request ID: 5652eb66-58cf-4720-a4d5-40bedf7ddf22; Proxy: null)
I created a class which implements IEcsLaunchTarget so this is a drop-in workaround:
export class EcsFargateSpotLaunchTarget implements sfnTasks.IEcsLaunchTarget {
/**
* Launch the ECS task using Fargate Spot.
*
* This functionality is not built into CDK yet.
*/
constructor(private readonly cluster: ecs.Cluster, private readonly options?: sfnTasks.EcsFargateLaunchTargetOptions) { }
/**
* Called when the Fargate launch type configured on RunTask
*/
public bind(_task: sfnTasks.EcsRunTask, launchTargetOptions: sfnTasks.LaunchTargetBindOptions): sfnTasks.EcsLaunchTargetConfig {
this.cluster.enableFargateCapacityProviders();
if (!launchTargetOptions.taskDefinition.isFargateCompatible) {
throw new Error('Supplied TaskDefinition is not compatible with Fargate');
}
return {
parameters: {
PlatformVersion: this.options?.platformVersion,
"CapacityProviderStrategy": [
{
"CapacityProvider": "FARGATE_SPOT",
"Weight": 1
}
],
},
};
}
}
const task = new sfnTasks.EcsRunTask(this, 'RunTask', {
...
launchTarget: new EcsFargateSpotLaunchTarget(ecsCluster)
});
I like @danw-mpl solution. I went for a slightly different approach and removed the LaunchType property entirely, allowing the ECS task to take the default capacity provider settings from the cluster.
C# example:
public class EcsRunTaskWithoutLaunchType : EcsRunTask
{
public EcsRunTaskWithoutLaunchType(Construct scope, string id, IEcsRunTaskProps props) : base(scope, id, props)
{
}
protected EcsRunTaskWithoutLaunchType(ByRefValue reference) : base(reference)
{
}
protected EcsRunTaskWithoutLaunchType(DeputyProps props) : base(props)
{
}
public override JObject ToStateJson()
{
var stateJson = base.ToStateJson();
((JObject)stateJson["Parameters"]!).Remove("LaunchType");
return stateJson;
}
}
@Muppets I tried that initially and ran into an error from Step Functions - something like 'no container instances found in your cluster'.
I'd assume that's a problem with my cluster config though. Even though I had fargate capacity providers enabled.
I've also run into this issue. Here is an implementation for Python that can be used, and is compatible with clusters that contain a default capacity provider and those that do not:
class EcsRunTakWithoutLaunchType(EcsRunTask):
"""Workaround for https://github.com/aws/aws-cdk/issues/7967"""
def __init__(self, *args, **kwargs):
self.cluster = kwargs.get('cluster')
if not self.cluster:
raise ValueError('cluster is required')
super().__init__(*args, **kwargs)
def to_state_json(self) -> typing.Mapping[typing.Any, typing.Any]:
# if the cluster has a default capacity provider, we pop the LaunchType parameter
# to use it by default (rather than attempting to use EC2 directly)
state_json = super().to_state_json()
if self.cluster.default_capacity_provider_strategy:
state_json['Parameters'].pop('LaunchType')
return state_json
return state_json
The advantage here is that we can keep the EcsEc2LaunchTarget if the cluster has no default capacity provider defined.
capacity providers allow using ECS with dynamic capacity. https://aws.amazon.com/tw/about-aws/whats-new/2019/12/amazon-ecs-capacity-providers-now-available/ , yet CloudFormation and CDK do not yet support this feature https://github.com/aws/aws-cdk/issues/5471.
I propose as a stop gap solution to allow users to submit EC2 tasks without
launchType
: https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RunTask.html#ECS-RunTask-request-launchTypeif
launchType
is provided, there is no way to get the task to use the capacity providers that are created outside CDK.Use Case
I want to create capacity providers using boto3 and use them when launching EC2 tasks using step functions.
Proposed Solution
Provider a way to make
launchType
optional here: https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-stepfunctions-tasks/lib/ecs/run-ecs-ec2-task.ts#L62Or to change it after construction.
Other
This is a :rocket: Feature Request