Currently, Step Functions retry policies match only on the error name.
If the error name is too generic and we need to check the error's cause instead, there is no way to do that.
Use Case
Below you can see an example with a SageMaker training job.
In this example, I want to retry only when the cause is a "ThrottlingException". I cannot, because the retry policy only looks at the error name, which in this case is "SageMaker.AmazonSageMakerException".
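To make the limitation concrete, here is a minimal sketch of such a Task state, written as a Python dict and printed as JSON. The interval, attempt count, and backoff values are assumptions chosen for illustration, and the training job parameters are omitted; only the error name comes from the case described above.

```python
import json

# Illustrative Task state; retry values are assumptions, and the
# "Parameters" block for the training job is left out for brevity.
training_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
    "Retry": [
        {
            # The retrier matches on the error name only, so this retries
            # every SageMaker exception, not just throttling.
            "ErrorEquals": ["SageMaker.AmazonSageMakerException"],
            "IntervalSeconds": 30,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ],
    "End": True,
}

print(json.dumps(training_state, indent=2))
```

Nothing in the retrier can refer to the cause, so a throttled request and a genuinely invalid request are retried the same way.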
Proposed Solution
Extend retry and catch policies so that they can also match on the error cause.
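One possible shape for this, sketched in Python to show the intended matching semantics. The `CauseContains` field and the `retrier_matches` helper are hypothetical: they are not part of Amazon States Language today and are only meant to illustrate the request.

```python
from typing import Optional


def retrier_matches(retrier: dict, error: str, cause: Optional[str]) -> bool:
    """Return True if the retrier applies to this error name and cause."""
    names = retrier.get("ErrorEquals", [])
    # "States.ALL" is the existing wildcard that matches any error name.
    name_ok = "States.ALL" in names or error in names

    # "CauseContains" is a HYPOTHETICAL field for this proposal.
    # When absent, behaviour is unchanged: only the name is checked.
    needle = retrier.get("CauseContains")
    cause_ok = needle is None or (cause is not None and needle in cause)

    return name_ok and cause_ok


retrier = {
    "ErrorEquals": ["SageMaker.AmazonSageMakerException"],
    "CauseContains": "ThrottlingException",
}

error_name = "SageMaker.AmazonSageMakerException"
throttled = retrier_matches(retrier, error_name, "ThrottlingException: Rate exceeded")
invalid = retrier_matches(retrier, error_name, "ValidationException: bad input")
```

With this rule, `throttled` is true and `invalid` is false, so only the throttling failure would be retried. A substring match is just one option; a regular expression or an exact match on a parsed cause would serve the same purpose.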
This is a :rocket: Feature Request