apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
37.12k stars 14.3k forks source link

Vault Internal Client `aws_iam` auth type missing `role_id` and bad deprecation message in UI #42125

Open richard-iovanisci opened 2 months ago

richard-iovanisci commented 2 months ago

Apache Airflow Provider(s)

hashicorp

Versions of Apache Airflow Providers

3.7.1

Apache Airflow version

2.9.3

Operating System

Linux/UNIX

Deployment

Official Apache Airflow Helm Chart

Deployment details

EKS 1.28

What happened

With the above version of the provider, the role_id parameter is not correctly passed to the iam_login function of the hvac client when an IAM role is used to dynamically fetch temporary credentials. This causes a relative path not supported error as it ultimately causes a required parameter (role_id) to be missing from the login POST.

as seen here:

    def _auth_aws_iam(self, _client: hvac.Client) -> None:
        if self.key_id and self.secret_id:
            auth_args = {
                "access_key": self.key_id,
                "secret_key": self.secret_id,
                "role": self.role_id,
            }
        else:
            import boto3

            if self.assume_role_kwargs:
                sts_client = boto3.client("sts")
                credentials = sts_client.assume_role(**self.assume_role_kwargs)
                auth_args = {
                    "access_key": credentials["Credentials"]["AccessKeyId"],
                    "secret_key": credentials["Credentials"]["SecretAccessKey"],
                    "session_token": credentials["Credentials"]["SessionToken"],
                }
            else:
                session = boto3.Session()
                credentials = session.get_credentials()
                auth_args = {
                    "access_key": credentials.access_key,
                    "secret_key": credentials.secret_key,
                    "session_token": credentials.token,
                }

        if self.auth_mount_point:
            auth_args["mount_point"] = self.auth_mount_point

        _client.auth.aws.iam_login(**auth_args)

The role_id parameter makes it into the auth_args dict ONLY if a static key and secret access key are provided. Otherwise, temporary credentials are fetched using sts or get_credentials() and added to the auth_args dict, and later the mount_point is added, but role_id is ultimately missing.

This will always cause an issue when trying to auth to vault with 1aws_iam` using dynamic credentials since BOTH mount point and role id are required: see here if interested.

This was introduced by the following PR when support for this sort of dynamic credential usage was implemented (though probably never actually tested w/o a static key + access key): https://github.com/apache/airflow/pull/38536/files


Also, there is a bad message in the UI that the role id parameter for the vault connection is deprecated, which is only true for the approle auth method... it is REQUIRED for the aws_iam auth method. Since a deprecation warning will be thrown when this parameter is used for the approle auth method anyway, I suggect removing that text from the UI entirely.

All of these are very simple changes and I am willing to submit a PR... the fix has already been tested in a hotfix environment.

What you think should happen instead

There should be no relative path error thrown when dynamic credentials are used. The role_id parameter should be added to the auth_args dict and login should succeed.

How to reproduce

Try to instantiate a VaultHook or Vault Secrets Backend using aws_iam auth and do not provide static access credentials. If all of the config is correct, you will see a relative path error in the logs instead of successful auth to vault.

This requires both and airflow setup and a vault namespace configured with access provisioned through iam.

Anything else

This problem occurs every time. Again, we have the fix in out hotfix environment and are willing to submit the fix.

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 2 months ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

richard-iovanisci commented 2 months ago

PR opened for fix: https://github.com/apache/airflow/pull/42134

richard-iovanisci commented 2 months ago

Perhaps this example in the documentation for the VaultBackend should be updated to include role_id when building auth dict: https://airflow.apache.org/docs/apache-airflow-providers-hashicorp/stable/secrets-backends/hashicorp-vault.html#vault-authentication-with-aws-assume-role-sts