aws-cloudformation / aws-cloudformation-resource-providers-stepfunctions

The CloudFormation Resource Provider Package For AWS Step Functions
https://aws.amazon.com/step-functions/
Apache License 2.0
6 stars 4 forks source link

State machine definition substitution does not work for non-string entries. #14

Closed benkehoe closed 1 year ago

benkehoe commented 3 years ago

Cross-posting https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/591

Substitutions are happening after JSON loading, which means all substitutions must be strings already.

I've been trying to move some AWS::StepFunctions::StateMachine definitions into external files (using DefinitionS3Location and DefinitionSubstitutions) but I've run into a hiccup. To start, the following resource is valid and deploys correctly:

StateMachine:
  Type: AWS::StepFunctions::StateMachine
  Properties:
    RoleArn: !Sub ${Role.Arn}
    DefinitionString: !Sub >
      {
        "StartAt": "DummyState",
        "TimeoutSeconds": ${TimeoutSeconds},
        "States": {
          "DummyState": {
            "Type": "Pass",
            "End": true
          }
        }
      }

Importantly, the TimeoutSeconds top-level field is required (by the Step Functions API) to be a JSON number. If I put quotes around ${TimeoutSeconds} the template will no longer deploy. That's fine as the behaviour without quotes works as I expect.

To move the DefinitionString into a separate file I first copy the value (minus the !Sub and indentation) into a file called definition.json:

{
  "StartAt": "DummyState",
  "TimeoutSeconds": ${TimeoutSeconds},
  "States": {
    "DummyState": {
      "Type": "Pass",
      "End": true
    }
  }
}

and change the resource like this:

StateMachine:
  Type: AWS::StepFunctions::StateMachine
  Properties:
    RoleArn: !Sub ${Role.Arn}
    DefinitionS3Location: definition.json
    DefinitionSubstitutions:
      TimeoutSeconds: !Ref TimeoutSeconds

When I try to deploy this template, however, I get an error from CloudFormation that says Invalid StateMachine definition file. It looks like CloudFormation is validating that the object at DefinitionS3Location contains valid JSON before doing substitution of values in DefinitionSubstitutions.

To work around that, I tried to make definition.json contain valid JSON by quoting ${TimeoutSeconds}:

{
  "StartAt": "DummyState",
  "TimeoutSeconds": "${TimeoutSeconds}",
  "States": {
    "DummyState": {
      "Type": "Pass",
      "End": true
    }
  }
}

Now when I try to deploy, though, I get an error from Step Functions that says _SCHEMA_VALIDATIONFAILED: Expected value of type Integer at /TimeoutSeconds. This is the same error I get if I try quoting ${TimeoutSeconds} in the working template's inline definition.

tl;dr - JSON validation of DefinitionS3Location happening before DefinitionSubstitutions substitutions makes it impossible to include non-string JSON values in external Step Function definitions.

jormello commented 3 years ago

Thanks for cross-posting this issue here. This is definitely something on our radar and we understand the friction that exists for customers looking to move their state machine definitions outside of their CloudFormation templates.

jabalsad commented 1 year ago

Is this still an issue? Does DefinitionSubstitutions not work when using DefinitionS3Location?

martinth commented 1 year ago

Is this still an issue? Does DefinitionSubstitutions not work when using DefinitionS3Location?

The issue here is not that DefinitionSubstitutions does not work. The issue is that you can not use a number (e.g. a timeout value) from DefinitionSubstitutions in the ASL file that defines the state machine since, to pass JSON validation, you have to put it in as "${TimeoutSeconds}" but that is then not the correct data type for the machine definition validation.

And yes, that issue still exists...I have just now run into it and came here to tell that it is kind of annoying 😞

DavidHospital commented 1 year ago

Any updates on this? Does anyone have a workaround?

wong-a commented 1 year ago

Should be fixed by https://github.com/aws-cloudformation/aws-cloudformation-resource-providers-stepfunctions/pull/34

wong-a commented 1 year ago

Confirming you can use numbers in DefinitionSubstitutions and DefinitionS3Location. Working example:

Template

AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  S3Bucket:
    Type: String
  S3Key:
    Type: String
Resources:
  MyStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      DefinitionS3Location:
        Bucket: !Ref S3Bucket
        Key: !Ref S3Key
      DefinitionSubstitutions:
        TimeoutSeconds: 60
      RoleArn: !Sub arn:${AWS::Partition}:iam::${AWS::AccountId}:role/myIamRole

ASL definition JSON file at S3Bucket/S3Key

{
  "StartAt": "HelloWorld",
  "TimeoutSeconds": ${TimeoutSeconds},
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "HelloWorld",
        "Payload": "${String}"
      },
      "End": true
    }
  }
}

Creates this state machine successfully:

{
  "StartAt": "HelloWorld",
  "TimeoutSeconds": 60,
  "States": {
    "HelloWorld": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "HelloWorld",
        "Payload": "${Payload}"
      },
      "End": true
    }
  }
}
wong-a commented 1 year ago

Closing this issue to prevent confusion. I've shared a working example of using an integer substitution with a JSON definition file in S3.

Per @martinth's comment, the JSON is technically invalid (as JSON's syntax highlighting shows) but it can be created in CloudFormation successfully. The resource handlers read the file from S3, applies the substitutions, then passes the definition string through to the Step Functions API.

https://github.com/aws-cloudformation/aws-cloudformation-resource-providers-stepfunctions/blob/main/statemachine/src/main/java/com/amazonaws/stepfunctions/cloudformation/statemachine/DefinitionProcessor.java#L81-L99

A workaround is to use YAML instead of JSON. The equivalent state machine definition in YAML is valid syntax:

StartAt: HelloWorld
TimeoutSeconds: ${TimeoutSeconds}
States:
  HelloWorld:
    Type: Task
    Resource: "arn:aws:states:::lambda:invoke"
    Parameters:
      FunctionName: HelloWorld
      Payload: ${Payload}
    Next: Complete
  Complete:
    Type: Succeed
paul-michalik commented 1 year ago

The behavior of state machine code substitutions is somewhat unexpected anyway. Consider the case, when an attribute of the DefinitionSubstitutions is a stringified (properly escaped) JSON. For example, let this be the state machine definition:

{
    "Comment": "NavVis data post-processing V2. Uses Elastic File System as shared storage, the job is not accessing Amazon S3 anymore.",
    "StartAt": "CreateStandardPayload",
    "States": {
        "CreateStandardPayload": {
            "Type": "Pass",
            "Parameters": {
                "ParS3TransferJobContainerOverrides": "${ParS3TransferJobContainerOverrides}"
            },
            "End": true
        }
    }
}

The goal is to pass a string which represents stringified JSON or JSON array. Let this be the relevant portion of the CloudFormation template:

      DefinitionSubstitutions:
        # This works:
        ParS3TransferJobContainerOverrides: "[1,2,3,4]"
        # This does not work: 'Resource handler returned message: "Invalid StateMachine definition." (RequestToken: 0078aa3e-ed83-6997-c6c0-972869b09610, HandlerErrorCode: InternalFailure)'
        ParS3TransferJobContainerOverrides: "{\"vcpus\": 2,\"memory\": 3868}",
        # This works (double escaping the JSON string) 
        ParS3TransferJobContainerOverrides: "{\\\"vcpus\\\": 2,\\\"memory\\\": 3868}",
        # This works (double escaping the JSON string) 
        ParS3TransferJobContainerOverrides: "[{\\\"vcpus\\\": 2,\\\"memory\\\": 3868}]"

Preferably, single escaped JSON should be supported.