hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

aws_lakeformation_permissions (data_location) timeout #21539

Open simonB2020 opened 2 years ago

simonB2020 commented 2 years ago

Terraform CLI and Terraform AWS Provider Version

Terraform: 0.14.6 AWS: 3.63.0

Affected Resource(s)

aws_lakeformation_permissions (with data_location argument)

Terraform Configuration Files

data "aws_lakeformation_permissions" "my_location_perms" {
    principal = data.aws_iam_role.glueservice.arn 
    data_location {
        arn = module.bucket_lakeraw.arn
    }
}

Debug Output

Error: error reading Lake Formation permissions: timeout while waiting for state to become 'AVAILABLE' (last state: 'NOT FOUND', timeout: 1m0s)

Panic Output

Expected Behavior

Permissions should be applied successfully.

Actual Behavior

As above, the action times out.

Steps to Reproduce

  1. Create a role
  2. Create a Bucket
  3. Grant the role lakeformation permissions to the bucket.
  4. Terraform Plan

Important Factoids

Executing role is set as a datalake admin

References

kasleet commented 2 years ago

This just happened to me.

After some brief debugging, it seems to be an issue in the filter logic (filter.go). I'm creating a table_with_columns permission here (it could also be table or data_location; it doesn't really matter):

resource "aws_lakeformation_permissions" "access" {
  principal   = local.principal_arn
  permissions = ["SELECT"]

  table_with_columns {
    catalog_id    = local.catalog_id
    database_name = local.db_name
    name          = local.table_name
    wildcard      = true
  }
}

Debug output

2022-02-07T16:59:45.068+0100 [INFO]  provider.terraform-provider-aws_v3.74.0_x5: 2022/02/07 16:59:45 [DEBUG] Reading Lake Formation permissions: {
  Principal: {
    DataLakePrincipalIdentifier: "arn:aws:iam::***:role/***"
  },
  Resource: {
    Table: {
      CatalogId: "***",
      DatabaseName: "db_name",
      Name: "table_name"
    }
  }
}

The provider plans the following:

  # aws_lakeformation_permissions.access will be created
  + resource "aws_lakeformation_permissions" "access" {
      + catalog_resource              = false
      + id                            = (known after apply)
      + permissions                   = [
          + "SELECT",
        ]
      + permissions_with_grant_option = (known after apply)
      + principal                     = "arn:aws:iam::***:role/***"

      + data_location {
          + arn        = (known after apply)
          + catalog_id = (known after apply)
        }

      + database {
          + catalog_id = (known after apply)
          + name       = (known after apply)
        }

      + table {
          + catalog_id    = (known after apply)
          + database_name = (known after apply)
          + name          = (known after apply)
          + wildcard      = (known after apply)
        }

      + table_with_columns {
          + catalog_id    = "***"
          + database_name = "db_name"
          + name          = "table_name"
          + wildcard      = true
        }
    }

Applying

/GrantPermission
...
{"Permissions":["SELECT"],"Principal":{"DataLakePrincipalIdentifier":"arn:aws:iam::***:role/***"},"Resource":{"TableWithColumns":{"CatalogId":"***","ColumnWildcard":{},"DatabaseName":"db_name","Name":"table_name"}}}

The ColumnWildcard object is empty; I would have expected something else.
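For context on the empty object: as far as I can tell, ColumnWildcard in the Lake Formation API only carries an optional list of excluded column names, so an empty object would be the natural encoding of "all columns, none excluded". The sketch below uses local stand-in structs (not the AWS SDK types, and the names columnWildcard, tableWithColumns and encode are made up for illustration) to show how such a value serializes to the same shape as the request body above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins that mirror the shape of the request body in the log above.
// They are illustrative only and are NOT the AWS SDK types.
type columnWildcard struct {
	ExcludedColumnNames []string `json:"ExcludedColumnNames,omitempty"`
}

type tableWithColumns struct {
	CatalogId      string          `json:"CatalogId"`
	ColumnNames    []string        `json:"ColumnNames,omitempty"`
	ColumnWildcard *columnWildcard `json:"ColumnWildcard,omitempty"`
	DatabaseName   string          `json:"DatabaseName"`
	Name           string          `json:"Name"`
}

// encode returns the JSON form of a resource and panics on a marshaling error.
func encode(t tableWithColumns) string {
	b, err := json.Marshal(t)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// wildcard = true with nothing excluded: a non-nil but empty wildcard.
	all := tableWithColumns{
		CatalogId:      "***",
		ColumnWildcard: &columnWildcard{},
		DatabaseName:   "db_name",
		Name:           "table_name",
	}
	fmt.Println(encode(all))
	// prints {"CatalogId":"***","ColumnWildcard":{},"DatabaseName":"db_name","Name":"table_name"}
}
```

If that reading is right, the empty object in the request is expected, and the interesting part is what the service sends back on read.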

After creating it, the provider checks whether it's there and gets the following response:

{
  "NextToken": null,
  "PrincipalResourcePermissions": [
    {
      "AdditionalDetails": null,
      "Permissions": [
        "SELECT"
      ],
      "PermissionsWithGrantOption": [],
      "Principal": {
        "DataLakePrincipalIdentifier": "arn:aws:iam::***:role/***"
      },
      "Resource": {
        "Catalog": null,
        "DataCellsFilter": null,
        "DataLocation": null,
        "Database": null,
        "LFTag": null,
        "LFTagPolicy": null,
        "Table": null,
        "TableWithColumns": {
          "CatalogId": "***",
          "ColumnNames": [
            "col1",
            "col2"
          ],
          "ColumnWildcard": null,
          "DatabaseName": "db_name",
          "Name": "table_name"
        }
      }
    }
  ]
}

So the provider receives the correct information back, but I think the post-filtering is why the 'NOT FOUND' state is returned:

// clean permissions = filter out permissions that do not pertain to this specific resource
cleanPermissions := FilterPermissions(input, tableType, columnNames, excludedColumnNames, columnWildcard, permissions)

if len(cleanPermissions) == 0 {
    return nil, statusNotFound, nil
}

return permissions, statusAvailable, nil

This filters out the returned permission (see func FilterTableWithColumnsPermissions): I specified the wildcard in Terraform, but the returned permission has ColumnWildcard set to null, even though it was created using an empty object. Instead, the response lists all columns in the ColumnNames array, which also causes the permission to be filtered out. This all seems rather strange.
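To make the suspected mismatch concrete, here is a minimal sketch. It is an assumed model of the behavior described above, not the provider's actual filter: the types and the strictWildcardFilter function are hypothetical and heavily simplified.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the API response shape.
// These are NOT the provider's or the SDK's types.
type columnWildcard struct {
	ExcludedColumnNames []string
}

type tableWithColumns struct {
	Name           string
	ColumnNames    []string
	ColumnWildcard *columnWildcard
}

// strictWildcardFilter keeps an entry only when the API echoes a wildcard back.
// It models the suspected behavior; the real filter has more cases.
func strictWildcardFilter(wantWildcard bool, got []tableWithColumns) []tableWithColumns {
	var kept []tableWithColumns
	for _, t := range got {
		if wantWildcard && t.ColumnWildcard == nil {
			// Dropped: the response lists explicit columns instead of a wildcard.
			continue
		}
		kept = append(kept, t)
	}
	return kept
}

func main() {
	// Shape of the ListPermissions response quoted above:
	// explicit column names and a null wildcard.
	resp := []tableWithColumns{
		{Name: "table_name", ColumnNames: []string{"col1", "col2"}},
	}

	kept := strictWildcardFilter(true, resp)
	fmt.Println(len(kept)) // prints 0, which a caller would report as NOT FOUND
}
```

Under this model the permission exists in Lake Formation but never survives the filter, so every read ends up empty regardless of how long the provider waits.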

I know from your comments that LakeFormation does a lot of implicit stuff (https://github.com/hashicorp/terraform-provider-aws/blob/187f1659a4fef11ac314567273b5470afe6b662f/internal/service/lakeformation/permissions.go#L258), but is there a timeline on when this issue will be addressed? Or is there a hack / workaround for this issue?

Thanks!

YakDriver commented 2 years ago

Unfortunately, I am unable to reproduce this problem. We have working tests with nearly the exact configurations you provided. Do you see how these might be different?

The s3 configuration from @simonB2020 above seems a lot like this test:

data "aws_partition" "current" {}

resource "aws_iam_role" "test" {
  name = "terraform-test"
  path = "/"

  assume_role_policy = jsonencode({
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "glue.${data.aws_partition.current.dns_suffix}"
      }
      }, {
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "s3.${data.aws_partition.current.dns_suffix}"
      }
    }]
    Version = "2012-10-17"
  })
}

resource "aws_s3_bucket" "test" {
  bucket        = "terraform-test"
  force_destroy = true
}

resource "aws_s3_bucket_acl" "test" {
  bucket = aws_s3_bucket.test.id
  acl    = "private"
}

resource "aws_lakeformation_resource" "test" {
  arn      = aws_s3_bucket.test.arn
  role_arn = aws_iam_role.test.arn
}

data "aws_caller_identity" "current" {}

data "aws_iam_session_context" "current" {
  arn = data.aws_caller_identity.current.arn
}

resource "aws_lakeformation_data_lake_settings" "test" {
  admins = [data.aws_iam_session_context.current.issuer_arn]
}

resource "aws_lakeformation_permissions" "test" {
  principal   = aws_iam_role.test.arn
  permissions = ["DATA_LOCATION_ACCESS"]

  data_location {
    arn = aws_s3_bucket.test.arn
  }

  # for consistency, ensure that admins are setup before testing
  depends_on = [aws_lakeformation_data_lake_settings.test]
}

The table with columns config from @kasleet above seems a lot like this test:

data "aws_partition" "current" {}

resource "aws_iam_role" "test" {
  name = "terraform-test"
  path = "/"

  assume_role_policy = jsonencode({
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "glue.${data.aws_partition.current.dns_suffix}"
      }
    }]
    Version = "2012-10-17"
  })
}

resource "aws_glue_catalog_database" "test" {
  name = "terraform-test"
}

resource "aws_glue_catalog_table" "test" {
  name          = "terraform-test"
  database_name = aws_glue_catalog_database.test.name
}

data "aws_caller_identity" "current" {}

data "aws_iam_session_context" "current" {
  arn = data.aws_caller_identity.current.arn
}

resource "aws_lakeformation_data_lake_settings" "test" {
  admins = [data.aws_iam_session_context.current.issuer_arn]
}

resource "aws_lakeformation_permissions" "test" {
  permissions = ["SELECT"]
  principal   = aws_iam_role.test.arn

  table_with_columns {
    database_name = aws_glue_catalog_table.test.database_name
    name          = aws_glue_catalog_table.test.name
    wildcard      = true
  }

  # for consistency, ensure that admins are setup before testing
  depends_on = [aws_lakeformation_data_lake_settings.test]
}
YakDriver commented 1 year ago

I haven't heard back on anything, so I'm going to close this as complete. If this is not the case, please let us know!

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

ammar-nizami commented 1 year ago

We ran into this issue while granting cross-account Lake Formation permissions using Terraform. The setup is as follows:

  1. From account A, share a few selected columns (C1, C2, C3) in table (TBL1) with external account B.
  2. In account B, use wild card to share all columns of target table TBL1 with principal role (R1).

Steps 1 and 2 are applied from separate Terraform configurations, in separate accounts. The Terraform run for step 2 fails with the error below:

Error: error reading Lake Formation permissions: timeout while waiting for state to become 'AVAILABLE' (last state: 'NOT FOUND', timeout: 1m0s)

kiyer1 commented 1 year ago

To add more context to the above: although this looks like a timeout caused by a race condition, retrying terraform apply still fails with the same error.
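That retries do not help is consistent with a read that deterministically reports NOT FOUND rather than a slow one. The sketch below is a hypothetical poll loop (waitFor is a made-up helper that only loosely models a wait-for-state routine, not the provider's implementation) showing that a longer timeout changes the number of attempts but not the outcome.

```go
package main

import (
	"fmt"
	"time"
)

// waitFor polls refresh until it returns target or the timeout elapses.
// Hypothetical helper; it only loosely models a wait-for-state loop.
// It returns the number of refresh calls made and an error on timeout.
func waitFor(target string, timeout, interval time.Duration, refresh func() string) (int, error) {
	deadline := time.Now().Add(timeout)
	attempts := 0
	for {
		attempts++
		last := refresh()
		if last == target {
			return attempts, nil
		}
		if time.Now().After(deadline) {
			return attempts, fmt.Errorf(
				"timeout while waiting for state to become '%s' (last state: '%s', timeout: %s)",
				target, last, timeout)
		}
		time.Sleep(interval)
	}
}

func main() {
	// A refresh function whose result never changes, as when a filter
	// always discards the permission that the API returns.
	alwaysNotFound := func() string { return "NOT FOUND" }

	for _, timeout := range []time.Duration{50 * time.Millisecond, 200 * time.Millisecond} {
		attempts, err := waitFor("AVAILABLE", timeout, 10*time.Millisecond, alwaysNotFound)
		fmt.Printf("attempts=%d err=%v\n", attempts, err)
	}
}
```

Both runs end in the same timeout error, which matches what we see when re-running the apply.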

Terraform version

terraform version
Terraform v1.0.1
on linux_amd64

Our provider versions:

Initializing provider plugins...
- Finding latest version of hashicorp/template...
- Finding hashicorp/aws versions matching ">= 3.36.0, < 4.0.0"...
- Installing hashicorp/template v2.2.0...
- Installed hashicorp/template v2.2.0 (signed by HashiCorp)
- Installing hashicorp/aws v3.75.2...
- Installed hashicorp/aws v3.75.2 (signed by HashiCorp)
kiyer1 commented 1 year ago

It would also be good to add support for custom timeouts: https://www.terraform.io/language/resources/syntax#operation-timeouts

yanniouamara commented 1 year ago

An issue with full templates was also raised here: #26602

wuetz commented 1 year ago

Hello everyone, is anyone working on this? I can still reproduce this with provider version 5.5.0.

fidelove commented 9 months ago

Hi!!

I have experienced the same issue, and I completely agree with @kasleet's analysis.

What I have identified in my case is that this happens when the aws_lakeformation_permissions resource is created with a table_with_columns block that includes a column_names element. If it's configured with a wildcard it doesn't fail, but it doesn't make sense to use this resource when you're not filtering by column.

The logging messages are:

provider.terraform-provider-aws_v4.56.0_x5: HTTP Request Sent: http.request.body={"Permissions":["SELECT"],"Principal":{"DataLakePrincipalIdentifier":"valid_aws_role"},"Resource":{"TableWithColumns":{"ColumnNames":["column1","column2","column3","column4","column5"],"DatabaseName":"valid_aws_database","Name":"valid_aws_table"}}} http.request.header.authorization="AWS4-HMAC-SHA256 Credential=************/****/us-west-2/lakeformation/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-security-token, Signature=*****" http.request.header.content_type=application/json http.request.header.x_amz_date=valid_x_amz_date tf_rpc=ApplyResourceChange aws.service=LakeFormation http.flavor=1.1 http.url=https://lakeformation.us-west-2.amazonaws.com/GrantPermissions http.user_agent="APN/1.0 HashiCorp/1.0 Terraform/1.3.9 (+https://www.terraform.io/) terraform-provider-aws/4.56.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.208 (go1.19.3; linux; amd64)" net.peer.name=lakeformation.us-west-2.amazonaws.com tf_mux_provider=*schema.GRPCProviderServer aws.operation=GrantPermissions aws.region=us-west-2 aws.sdk=aws-sdk-go http.method=POST http.request.header.x_amz_security_token=***** http.request_content_length=389 @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.25/logger.go:90 @module=aws tf_resource_type=aws_lakeformation_permissions tf_provider_addr=registry.terraform.io/hashicorp/aws tf_req_id=valid_tf_req_id timestamp=valid_timestamp

[DEBUG] provider.terraform-provider-aws_v4.56.0_x5: HTTP Response Received: http.response.header.content_type=application/json http.response.header.x_amzn_requestid=valid_amzn_req_id tf_mux_provider=*schema.GRPCProviderServer tf_provider_addr=registry.terraform.io/hashicorp/aws aws.operation=GrantPermissions http.duration=437 http.response.header.cache_control=no-cache http.status_code=200 aws.sdk=aws-sdk-go http.response.body={} http.response_content_length=2 @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.25/logger.go:138 aws.region=us-west-2 http.response.header.date="valid_date" aws.service=LakeFormation tf_req_id=valid_tf_req_id tf_resource_type=aws_lakeformation_permissions tf_rpc=ApplyResourceChange @module=aws timestamp=valid_timestamp

[DEBUG] provider.terraform-provider-aws_v4.56.0_x5: [DEBUG] Waiting for state to become: [AVAILABLE]

[DEBUG] provider.terraform-provider-aws_v4.56.0_x5: HTTP Request Sent: http.request.header.authorization="AWS4-HMAC-SHA256 Credential=************/***/us-west-2/lakeformation/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-security-token, Signature=*****" net.peer.name=lakeformation.us-west-2.amazonaws.com tf_resource_type=aws_lakeformation_permissions aws.region=us-west-2 aws.service=LakeFormation aws.sdk=aws-sdk-go http.request.header.content_type=application/json http.url=https://lakeformation.us-west-2.amazonaws.com/ListPermissions http.user_agent="APN/1.0 HashiCorp/1.0 Terraform/1.3.9 (+https://www.terraform.io/) terraform-provider-aws/4.56.0 (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.44.208 (go1.19.3; linux; amd64)" @module=aws http.request.body={"Principal":{"DataLakePrincipalIdentifier":"valid_DataLakePrincipalIdentifier"},"Resource":{"Table":{"DatabaseName":"valid_DatabaseName","Name":"valid_tablename"}}} http.request.header.x_amz_security_token=***** http.request_content_length=268 tf_mux_provider=*schema.GRPCProviderServer tf_provider_addr=registry.terraform.io/hashicorp/aws @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.25/logger.go:90 aws.operation=ListPermissions http.method=POST tf_req_id=valid_tf_req_id tf_rpc=ApplyResourceChange http.flavor=1.1 http.request.header.x_amz_date=valid_x_amz_date timestamp=valid_timestamp

[DEBUG] provider.terraform-provider-aws_v4.56.0_x5: HTTP Response Received: http.response.header.cache_control=no-cache http.status_code=200 tf_req_id=valid_tf_req_id aws.operation=ListPermissions aws.sdk=aws-sdk-go aws.service=LakeFormation http.duration=134 tf_provider_addr=registry.terraform.io/hashicorp/aws tf_rpc=ApplyResourceChange http.response.body={"NextToken":null,"PrincipalResourcePermissions":[{"AdditionalDetails":null,"LastUpdated":null,"LastUpdatedBy":null,"Permissions":["SELECT"],"PermissionsWithGrantOption":[],"Principal":{"DataLakePrincipalIdentifier":"valid_DataLakePrincipalIdentifier"},"Resource":{"Catalog":null,"DataCellsFilter":null,"DataLocation":null,"Database":null,"LFTag":null,"LFTagPolicy":null,"Table":null,"TableWithColumns":{"CatalogId":"valid_CatalogId","ColumnNames":["column1","column2","column3","column4","column5"],"ColumnWildcard":null,"DatabaseName":"valid_DatabaseName","Name":"valid_tablename"}}}]} http.response.header.content_type=application/json http.response_content_length=706 tf_mux_provider=*schema.GRPCProviderServer http.response.header.date="valid_date" tf_resource_type=aws_lakeformation_permissions @module=aws @caller=github.com/hashicorp/aws-sdk-go-base/v2/awsv1shim/v2@v2.0.0-beta.25/logger.go:138 aws.region=us-west-2 http.response.header.x_amzn_requestid=valid_x_amzn_requestid timestamp=valid_timestamp

aws_lakeformation_permissions.dp_datalake_permissions_for_env: Still creating...0s elapsed]

[WARN] provider.terraform-provider-aws_v4.56.0_x5: [WARN] WaitForState timeout after 1m0s

[WARN] provider.terraform-provider-aws_v4.56.0_x5: [WARN] WaitForState starting 30s refresh grace period

[ERROR] provider.terraform-provider-aws_v4.56.0_x5: Response contains error diagnostic: @module=sdk.proto diagnostic_detail= diagnostic_summary="reading Lake Formation permissions: timeout while waiting for state to become 'AVAILABLE' (last state: 'NOT FOUND', timeout: 1m0s)" tf_proto_version=5.3 tf_resource_type=aws_lakeformation_permissions @caller=github.com/hashicorp/terraform-plugin-go@v0.14.3/tfprotov5/internal/diag/diagnostics.go:55 tf_provider_addr=registry.terraform.io/hashicorp/aws tf_req_id=valid_tf_req_id tf_rpc=ApplyResourceChange diagnostic_severity=ERROR timestamp=valid_timestamp

[ERROR] vertex "aws_lakeformation_permissions.datalake_permissions_resource_name" error: reading Lake Formation permissions: timeout while waiting for state to become 'AVAILABLE' (last state: 'NOT FOUND', timeout: 1m0s)