astronomer / astronomer-cosmos

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
https://astronomer.github.io/astronomer-cosmos/
Apache License 2.0
612 stars 154 forks source link

[Bug] Invalid aws session token #962

Closed ssulca closed 3 months ago

ssulca commented 4 months ago

Cosmos in AthenaAccessKeyProfileMapping always tries to use a session token, even when it is not provided by the Airflow connection

Context

When setting up a new aws connection using only aws_access_key_id and aws_secret_access_key, it generates a profile like this:

profile_name:
    outputs:
        athena:
            aws_access_key_id: KEY_ID
            aws_secret_access_key: '{{ env_var(''COSMOS_CONN_AWS_AWS_SECRET_ACCESS_KEY'')}}'
            aws_session_token: '{{ env_var(''COSMOS_CONN_AWS_AWS_SESSION_TOKEN'')}}'
            database: db_name
            ...
    target: athena

But AthenaAccessKeyProfileMapping sets the aws_session_token as an empty string, and dbt takes this value, resulting in.

'botocore.exceptions.ClientError: An error occurred (UnrecognizedClientException)
 when calling the GetDatabases operation: 
The security token included in the request is invalid', ''

Propose solutions In profile function

Add a condition that check if there is a token.

profile = {
    **self.mapped_params,
    **self.profile_args,
    "aws_access_key_id": self.temporary_credentials.access_key,
    "aws_secret_access_key": self.get_env_var_format("aws_secret_access_key"),
    "aws_session_token": self.get_env_var_format("aws_session_token") if self.temporary_credentials.token else None,
}
dosubot[bot] commented 4 months ago

To address the issue with the AthenaAccessKeyProfileMapping class in the astronomer-cosmos package where an invalid AWS session token error occurs due to the inclusion of an empty aws_session_token field, your proposed solution is on point. You can modify the class to conditionally include the aws_session_token in the profile only if the token is present. Here's a refined approach to implement this:

@property
def profile(self) -> dict[str, Any | None]:
    self.temporary_credentials = self._get_temporary_credentials()  # type: ignore
    profile = {
        **self.mapped_params,
        **self.profile_args,
        "aws_access_key_id": self.temporary_credentials.access_key,
        "aws_secret_access_key": self.get_env_var_format("aws_secret_access_key"),
    }

    if self.temporary_credentials.token:
        profile["aws_session_token"] = self.get_env_var_format("aws_session_token")

    return self.filter_null(profile)

This code ensures that aws_session_token is only added to the profile if it exists, thus avoiding the error you encountered. It's important to test this change thoroughly to ensure it doesn't affect other functionalities within the astronomer-cosmos package, especially those related to AWS authentication and dbt profile generation for Athena.

#### Details **Sources** - [cosmos/profiles/athena/access_key.py](https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/profiles/athena/access_key.py) - [cosmos/profiles/__init__.py](https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/profiles/__init__.py)
**About Dosu** This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

To continue the conversation, mention @dosu.

tatiana commented 4 months ago

HI @ssulca , thanks for reporting the issue and proposing a solution! Would you like to contribute a PR with this fix? It would be highly appreciated.