apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

StepFunctionStartExecutionOperator in MWAA does not throw error if IAM Role does not have DescribeExecution permission #41918

Closed. sarkch closed this 2 months ago

sarkch commented 2 months ago

Apache Airflow version

Other Airflow 2 version (please specify below)

If "Other Airflow 2 version" selected, which one?

2.8.1

What happened?

I am using StepFunctionStartExecutionOperator to execute a StateMachine.

StepFunctionStartExecutionOperator(
    ...
    task_id="load_data",
    deferrable=True,
    waiter_delay=30,  # poll every 30 seconds
    waiter_max_attempts=10,  # maximum number of polling attempts
    do_xcom_push=True,
)

If I look at the task log, the status is empty:

{{waiter_with_logging.py:129}} INFO - Status of step function execution is:
{{waiter_with_logging.py:129}} INFO - Status of step function execution is:
{{waiter_with_logging.py:129}} INFO - Status of step function execution is:

As you can see, Airflow is not getting the current status (RUNNING, FAILED, etc.) of the state machine execution.

What you think should happen instead?

Expected Output when the State Machine is RUNNING

[2024-08-28, 17:01:06 UTC] {{waiter_with_logging.py:129}} INFO - Status of step function execution is: RUNNING
[2024-08-28, 17:02:06 UTC] {{waiter_with_logging.py:129}} INFO - Status of step function execution is: RUNNING

How to reproduce

In my case I found the root cause of the problem. The IAM role associated with the Airflow environment did not have the following permission:

'states:DescribeExecution'

on the state machine execution ARN:

arn:aws:states:<Region>:<accountId>:execution:<stateMachineName>:*
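
For reference, the missing permission could be expressed as an IAM policy statement. The following is a minimal sketch that builds such a policy document in Python. The function name describe_execution_policy and the region, account ID and state machine name passed to it are placeholders for illustration, not values from this environment.

```python
import json


def describe_execution_policy(region, account_id, state_machine_name):
    """Build an IAM policy allowing states:DescribeExecution on all
    executions of one state machine (hypothetical helper)."""
    resource = (
        f"arn:aws:states:{region}:{account_id}:"
        f"execution:{state_machine_name}:*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "states:DescribeExecution",
                "Resource": resource,
            }
        ],
    }


# Placeholder values, replace with your own region, account and state machine.
policy = describe_execution_policy("us-east-1", "123456789012", "my-state-machine")
print(json.dumps(policy, indent=2))
```

The printed JSON can be attached to the execution role of the Airflow environment. The trailing * in the resource covers every execution of the state machine.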

Before granting the permission

[2024-08-28, 01:41:53 UTC] {{waiter_with_logging.py:129}} INFO - Status of step function execution is:

After granting the permission

[2024-08-28, 17:01:06 UTC] {{waiter_with_logging.py:129}} INFO - Status of step function execution is: RUNNING

Operating System

Managed Airflow

Versions of Apache Airflow Providers

No response

Deployment

Amazon (AWS) MWAA

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

Code of Conduct

gopidesupavan commented 2 months ago

Thanks for opening this. It looks like a valid issue, and I am able to reproduce it. Validation for this kind of error is missing in async_wait. I will push a change for this.
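
To illustrate the kind of validation that is missing, here is a simplified sketch of a polling loop that fails fast on an access-denied response instead of logging an empty status. This is not the provider's actual async_wait code: wait_for_execution, PermissionDenied and the describe callable are hypothetical names, and the response shapes are assumed to follow the usual botocore layout (a "status" key on success, an "Error" dict with "Code" and "Message" on failure).

```python
import asyncio


class PermissionDenied(Exception):
    """Hypothetical error for an unauthorized status call."""


# Step Functions statuses treated as terminal failures in this sketch.
TERMINAL_FAILURES = {"FAILED", "TIMED_OUT", "ABORTED"}


async def wait_for_execution(describe, delay=30.0, max_attempts=10):
    """Poll `describe()` until the execution succeeds or fails.

    `describe` is an assumed async callable returning a dict shaped like
    a DescribeExecution response, or a botocore-style error response.
    """
    for _ in range(max_attempts):
        response = await describe()
        error = response.get("Error") or {}
        if error.get("Code") == "AccessDeniedException":
            # Fail the task instead of retrying with an empty status.
            raise PermissionDenied(error.get("Message", "access denied"))
        status = response.get("status")
        print(f"Status of step function execution is: {status}")
        if status == "SUCCEEDED":
            return status
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"Execution ended with status {status}")
        await asyncio.sleep(delay)
    raise TimeoutError(f"Execution not finished after {max_attempts} attempts")


async def fake_denied():
    # Stand-in for a DescribeExecution call made without the permission.
    return {"Error": {"Code": "AccessDeniedException", "Message": "not authorized"}}


try:
    asyncio.run(wait_for_execution(fake_denied, delay=0))
except PermissionDenied as exc:
    print(f"Task would fail here: {exc}")
```

With a check like this, the missing states:DescribeExecution permission would surface on the first poll rather than after all attempts are used up.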

The AccessDeniedException log in the image below is one I added to show the exact error the waiter is throwing :)

cc: @eladkal

image