hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: reading Lake Formation permissions: listing permissions: operation error LakeFormation: ListPermissions #38982

Open bharathkgit opened 2 months ago

bharathkgit commented 2 months ago

Terraform Core Version

greater than or equal to 1.2

AWS Provider Version

5.60

Affected Resource(s)

aws_lakeformation_permissions

Expected Behavior

Terraform should apply the SELECT permissions and should mark the job as Success.

Actual Behavior

When applying SELECT permissions to a table resource, the permissions appear to be applied, but the job then fails with an access-denied error on listing permissions.

Relevant Error/Panic Output Snippet

Error: reading Lake Formation permissions: listing permissions: operation error LakeFormation: ListPermissions, https response error StatusCode: 400, RequestID: e54e492c-d46e-4f39-a041-099187b7e57f, api error AccessDeniedException: Resource does not exist or requester is not authorized to access requested permissions.

Terraform Configuration Files

resource "aws_lakeformation_permissions" "datalake_tables" {
  for_each = toset(local.datalake_tables)

  principal   = aws_iam_role.step_function_execution_role.arn
  permissions = ["SELECT"]

  table {
    database_name = local.database_name
    name          = each.value
    catalog_id    = local.datalake_catalog_id
  }
}

Steps to Reproduce

  1. Account A shared datalake tables with SELECT permissions to Account B.
  2. In Account B, assigned the SELECT permissions on the tables in Account A to the step function execution role ARN.

Terraform applied the SELECT permissions, but the job then failed with a listing-permissions error. Listing permissions was not requested anywhere in the code, and the source Account A did not share listing permissions with Account B.

Debug Output

Plan for one of the tables in the database:

# aws_lakeformation_permissions.connect_datalake_tables["table_name"] will be created
  + resource "aws_lakeformation_permissions" "connect_datalake_tables" {
      + catalog_resource              = false
      + id                            = (known after apply)
      + permissions                   = [
          + "SELECT",
        ]
      + permissions_with_grant_option = (known after apply)
      + principal                     = "arn:aws:iam::<AccountB#>:role/step_function_execution_role"

      + data_location (known after apply)

      + database (known after apply)

      + lf_tag (known after apply)

      + lf_tag_policy (known after apply)

      + table {
          + catalog_id    = <AccountA#>
          + database_name = "datalake_db"
          + name          = "table_name"
          + wildcard      = false
        }

      + table_with_columns (known after apply)
    }

Panic Output

Error: reading Lake Formation permissions: listing permissions: operation error LakeFormation: ListPermissions, https response error StatusCode: 400, RequestID: e54e492c-d46e-4f39-a041-099187b7e57f, api error AccessDeniedException: Resource does not exist or requester is not authorized to access requested permissions.

Important Factoids

No response

References

No response

Would you like to implement a fix?

No

justinretzolk commented 2 months ago

Hey @bharathkgit 👋 Thank you for taking the time to raise this! It looks like this error is coming from the upstream AWS API. Breaking the error down into its most relevant parts:

Error: reading Lake Formation permissions:
operation error LakeFormation: ListPermissions
StatusCode: 400, 
AccessDeniedException: Resource does not exist or requester is not authorized to access requested permissions.

The first line of this is from Terraform. When resources are created, a read operation is immediately performed so that Terraform can validate that the create operation had the intended effect, and that the configuration matches reality. This error indicates that Terraform encountered an error when performing this read operation. The operation is then listed (LakeFormation: ListPermissions), as well as the error code and message (400 and AccessDeniedException), which come from the upstream API.

With that in mind, if I've understood correctly, the resolution here should be to update the scope of the permissions for the credentials being used for provider authentication. Can you review and confirm?

bharathkgit commented 2 months ago

Hello @justinretzolk ,

Thank you very much for the response on this issue. You are absolutely right: Terraform is trying to perform "LakeFormation: ListPermissions" after granting SELECT to the target role in the AWS account. But the account was not granted "LakeFormation:ListPermissions" on the source AWS Account A. Please note that these are cross-account permissions, and the source account limits the permissions it gives to target accounts.

When we performed this step manually from the AWS Console in Account B, there was no "Listing Permissions" error, since there is no separate call to it, and the "SELECT" permissions were granted to the target role, in this case the step function execution role.

Indeed, the SELECT permissions were granted when applied from Terraform as well, but the apply failed on ListPermissions because the account does not have that permission.

The scope of the permissions cannot be increased, as that would have a major impact across multiple accounts and regions. Because of this job failure, Terraform keeps applying the same "SELECT" permissions again in each deployment, even when the changes are unrelated to Lake Formation, because the Terraform state could not capture the "SELECT" permissions applied in the earlier deployment.

Why does Terraform need to invoke "LakeFormation:ListPermissions" to validate whether the permissions were granted? Isn't it possible to decide from the status of the AWS API call that grants the "SELECT" permissions?

justinretzolk commented 2 months ago

Hey @bharathkgit,

The reason that Terraform invokes ListPermissions lies within Terraform's Provider Design Principles, which dictate that a resource should represent a single API component (in this case, the permissions configuration), so that it can handle the entire lifecycle of that resource -- from creation to destruction.

For aws_lakeformation_permissions, this would mean being able to create it (GrantPermissions), read it for any apply after the initial creation (ListPermissions) to verify that nothing has changed since the previous apply, and ultimately destroy it (RevokePermissions).

As an aside, the response from GrantPermissions doesn't include the information that would have been needed to validate consistency either. This is another part of the reason for the read after creation.

bharathkgit commented 2 months ago

Hey @justinretzolk ,

So is the only solution to include ListPermissions in the scope granted to the target accounts, or is there another workaround we can apply to avoid this error while still keeping track of the fact that SELECT has been granted?

justinretzolk commented 2 months ago

@bharathkgit,

If I've understood the API documentation from AWS, that is correct. This resource requires that action in order to read the aws_lakeformation_permissions configuration and ensure consistency, and the linked document mentions:

Returns a list of the principal permissions on the resource, filtered by the permissions of the caller. For example, if you are granted an ALTER permission, you are able to see only the principal permissions for ALTER.

I do want to make sure I've been clear enough in my previous responses though. You'd be granting ListPermissions for the credentials provided to the AWS provider itself, not changing the configuration for the aws_lakeformation_permissions resource in any way. I'm fairly confident that's already come across, but given that we're also discussing a resource with "permissions" in the name, I figured it was worth being explicit 😅.
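For illustration, the distinction above could look roughly like the sketch below. It assumes the missing permission is an IAM-level one on the identity that runs Terraform; nothing in this thread confirms that this alone covers the cross-account case. The role name `terraform-deployer` and the resource labels are hypothetical placeholders, not names from the reporter's configuration.

```hcl
# Sketch only: IAM permissions for the identity that runs Terraform
# (the AWS provider's credentials), NOT for the role that receives SELECT.
data "aws_iam_policy_document" "lakeformation_permissions_lifecycle" {
  statement {
    sid    = "LakeFormationPermissionsLifecycle"
    effect = "Allow"

    actions = [
      "lakeformation:GrantPermissions",  # used on create
      "lakeformation:ListPermissions",   # used on read (after create and on refresh)
      "lakeformation:RevokePermissions", # used on destroy
    ]

    # Wildcard resource used here for simplicity; tighten as your policy allows.
    resources = ["*"]
  }
}

resource "aws_iam_role_policy" "terraform_deployer_lakeformation" {
  name   = "lakeformation-permissions-lifecycle"
  role   = "terraform-deployer" # hypothetical: the role the provider authenticates as
  policy = data.aws_iam_policy_document.lakeformation_permissions_lifecycle.json
}
```

The aws_lakeformation_permissions resource itself stays unchanged under this approach; only the deploying identity's policy is widened.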

bharathkgit commented 2 months ago

@justinretzolk, I worked with the AWS Lake Formation team and the source account team to include the ListPermissions or Describe permissions, and in the end it is not allowed due to some limitations in Lake Formation.

The design principles should not apply to cross-account sharing or to tables that are shared with filter criteria. The current case is similar: the source account shares a table with row and column filters. In such cases, there is no option for sharing describe/listing permissions in Lake Formation.

The verification logic should either rely on the GrantPermissions API status (an HTTP 200 response) or retrieve the permissions for the requested role and validate that the permissions were applied.

It is a kind of deadlock: ListPermissions cannot be granted to the target account in Lake Formation, yet HashiCorp/Terraform still requests ListPermissions for validation.

justinretzolk commented 1 week ago

Thanks for following up here @bharathkgit! That's very helpful information. With that in mind, I'm going to leave this marked as a bug so that someone from the team or community can pick it up to investigate further.