hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Support BigQuery Omni external connection for AWS s3 #11459

Open ismailsimsek opened 2 years ago

ismailsimsek commented 2 years ago

Description

Since BigQuery external connections can already be created for Cloud SQL federated queries, the provider should support connections to AWS as well. Similar request: https://github.com/hashicorp/terraform-provider-google/issues/11053

New or Affected Resource(s)

Potential Terraform Configuration

resource "google_bigquery_connection" "connection" {
  provider      = google-beta
  connection_id = "connection_to_aws_s3"
  location      = "US"
  description   = "A BigQuery external connection for AWS"
  aws {
    AWS_ACCOUNT_ID  = "the ID number of the connection's AWS IAM user."
    ROLE_NAME       = "the role policy name you chose."
    AWS_LOCATION    = "an AWS location in Google Cloud. Must be set to aws-us-east-1."
    CONNECTION_NAME = "the name you give this connection resource."
  }
}

References

voycey commented 2 years ago

Are there any updates on how this works in practice? With the release of BQ Omni and BigLake, this is more important than ever for setting up cross-cloud external tables.

Specifically, I can't see a way to define both the connection role and the google_bigquery_connection without creating a cycle: the connection requires an IAM role, and that IAM role requires the identity from the google_bigquery_connection.

Example:

resource "google_bigquery_connection" "connection" {
    provider      = google-beta
    connection_id = "bq-connection"
    location      = "aws-ap-southeast-1"
    friendly_name = "👋"
    description   = "BQ Omni Connection"
    aws {
      access_role {
        iam_role_id = aws_iam_role.bigquery-omni-connection-role.arn
      }
    }
}

resource "aws_iam_role" "bigquery-omni-connection-role" {
    name                 = "bigquery-omni-connection"
    max_session_duration = 43200

    assume_role_policy = <<-EOF
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": "accounts.google.com"
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
            "StringEquals": {
              "accounts.google.com:sub": "${google_bigquery_connection.connection.id}"
            }
          }
        }
      ]
    }
    EOF
}

DrFaust92 commented 1 year ago

@rileykarson I think this can be closed.

@voycey I think you will need to extract the role name and "manually" construct the IAM role ARN when passing it to the connection resource.

chrisst commented 2 months ago

I can confirm there's really no way around the cycle. Because Omni uses web identity under the hood to authenticate between the two clouds, it requires a bi-directional configuration, which by definition is always going to be a cycle. Unfortunately, the best you can do is use string building in one direction and a resource reference in the other. Here's how I've modeled an e2e AWS Omni connection: https://gist.github.com/chrisst/314d131aa42db685938dee24dea0f912. Setting up the references this way, and throwing a sleep in there for IAM propagation, allows Terraform to succeed in a single apply.
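
For readers who want the shape of that workaround without opening the gist, here is a minimal sketch. It is not the gist's contents. It assumes the standard `aws` ARN partition, a hypothetical local `omni_role_name`, the `aws_caller_identity` data source for the account ID, and a 30-second `time_sleep` (from the hashicorp/time provider) as the propagation delay; none of those specifics come from this thread. The `provider = google-beta` line from the earlier example is left out here; add it back if your provider version requires it.

```hcl
# Sketch only: names, location, and sleep duration are illustrative assumptions.
locals {
  # Hypothetical role name shared by both sides, so the connection
  # never has to reference the aws_iam_role resource.
  omni_role_name = "bigquery-omni-connection"
}

# Looks up the AWS account ID so the role ARN can be built as a string.
data "aws_caller_identity" "current" {}

resource "google_bigquery_connection" "connection" {
  connection_id = "bq-connection"
  location      = "aws-ap-southeast-1"
  description   = "BQ Omni Connection"

  aws {
    access_role {
      # String-built ARN (assumes the "aws" partition): no reference to
      # aws_iam_role here, which is what breaks the cycle.
      iam_role_id = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${local.omni_role_name}"
    }
  }
}

resource "aws_iam_role" "bigquery_omni" {
  name                 = local.omni_role_name
  max_session_duration = 43200

  # Resource reference in this direction only: the trust policy uses the
  # Google-generated identity exported by the connection.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = "accounts.google.com" }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "accounts.google.com:sub" = google_bigquery_connection.connection.aws[0].access_role[0].identity
        }
      }
    }]
  })
}

# Assumed pause for IAM propagation before anything uses the connection.
resource "time_sleep" "wait_for_iam" {
  depends_on      = [aws_iam_role.bigquery_omni]
  create_duration = "30s"
}
```

Resources that use the connection (for example, an external table) would then declare `depends_on = [time_sleep.wait_for_iam]`, so they are created only after the role has had time to propagate.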

voycey commented 2 months ago

Thanks @chrisst - if I ever get back to Omni then I will definitely give this a go :) I did get this applying, as you can see in https://github.com/hashicorp/terraform-provider-google/issues/12018, but I don't remember how far I got with it afterwards!

chrisst commented 2 months ago

@voycey If you pick Omni back up, let me know and I'm happy to help get it working. I think it's even picked up a few new features in the 2 years since you last tackled it.