hashicorp / terraform-provider-google

Terraform Provider for Google Cloud Platform
https://registry.terraform.io/providers/hashicorp/google/latest/docs
Mozilla Public License 2.0

Data Catalog - Add ability to link to existing entries (or obtain the entryId) via linkedResource #6766

Open Ryan-Lintern opened 3 years ago

Ryan-Lintern commented 3 years ago

Description

When creating any BigQuery table or dataset on GCP, it is automatically assigned an entry, with a path of the form

projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}

Currently, the google_data_catalog_entry resource can only create new entries; it cannot link to an existing entry. As a result, any tags created are attached to the new entry rather than to the existing entry for the table. Please add the ability to obtain the entryId for an existing BigQuery linkedResource (https://cloud.google.com/data-catalog/docs/reference/rest/v1/entries/lookup).

tylerwengerd-cr commented 3 years ago

For anyone who needs this functionality now for BQ resources - the entryId field is a base64-encoded string generated from the resource id, minus any padding.

For example, projects/my-project/datasets/my-dataset/tables/my-table has the following base64 encoding:

> echo -n "projects/my-project/datasets/my-dataset/tables/my-table" | base64
cHJvamVjdHMvbXktcHJvamVjdC9kYXRhc2V0cy9teS1kYXRhc2V0L3RhYmxlcy9teS10YWJsZQ==

So that table in the US location has the following full entry name (the entryId is the final path segment):

projects/my-project/locations/us/entryGroups/@bigquery/entries/cHJvamVjdHMvbXktcHJvamVjdC9kYXRhc2V0cy9teS1kYXRhc2V0L3RhYmxlcy9teS10YWJsZQ

My current workaround to generate the entryId based on the above info is as follows:

locals {
  table_location       = lower(google_bigquery_table.default.location)
  base64_encoded_id    = trim(base64encode(google_bigquery_table.default.id), "=")
  datacatalog_entry_id = "projects/${google_bigquery_table.default.project}/locations/${local.table_location}/entryGroups/@bigquery/entries/${local.base64_encoded_id}"
}

This is for a table but is basically the same for a dataset.
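
For anyone who wants to sanity-check the value outside Terraform, here is a minimal shell sketch of the same derivation. It assumes the scheme described above (standard base64 of the table's resource ID with the padding removed, inside the @bigquery entry group); bq_entry_name is a hypothetical helper name, not part of gcloud or Terraform.

```shell
#!/bin/sh
# Sketch only: derive the Data Catalog entry name for a BigQuery table using
# the scheme described in this thread (unpadded base64 of the resource ID).
# bq_entry_name is a hypothetical helper, not a gcloud or Terraform command.
bq_entry_name() {
  project="$1"; location="$2"; dataset="$3"; table="$4"
  resource_id="projects/${project}/datasets/${dataset}/tables/${table}"
  # Encode, then strip "=" padding and any line breaks that base64 inserts.
  encoded=$(printf '%s' "$resource_id" | base64 | tr -d '=\n')
  # Lowercase the location, mirroring lower() in the Terraform locals.
  location=$(printf '%s' "$location" | tr '[:upper:]' '[:lower:]')
  printf 'projects/%s/locations/%s/entryGroups/@bigquery/entries/%s\n' \
    "$project" "$location" "$encoded"
}

bq_entry_name my-project US my-dataset my-table
```

Comparing this output with local.datacatalog_entry_id from the locals block is a quick way to confirm the two agree for a given table.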

burnzy commented 3 years ago

That's great to know! Thanks for the info @tylerwengerd-cr

burnzy commented 3 years ago

@tylerwengerd-cr Any chance you know whether this workaround also works for column-level resources? I can't work out what the proper resource IDs are for columns; only datasets and tables seem to be supported, but maybe I'm missing something? (I can't find a full list of all GCP resource names/IDs.)

tylerwengerd-cr commented 3 years ago

@burnzy unfortunately no, I never got into column-level resources when I was working with this. Here is the best relevant documentation I can find on resource names (that you've probably already read 😄) so it might take some poking around with the API to see if it's even supported. Good luck!

27Bslash6 commented 3 years ago

Can anyone confirm this workaround still works?

When trying to attach to an existing dataset/entry I'm seeing:

Error: Error creating Entry: googleapi: Error 400: "projects/my-project/locations/eu/entryG..." is an invalid value for CreateEntryRequest.entry_id. It must contain only English letters, numbers and underscores; and be at most 64 characters.

AlexT-Ki commented 2 years ago

This is what I've done as a workaround (requires the gcloud CLI to be installed):

data "external" "data_catalog_lookup" {
  program = ["gcloud", "data-catalog", "--format", "json(name)", "entries", "lookup", "bigquery.table.`${google_bigquery_table.table.project}`.${google_bigquery_table.table.dataset_id}.${google_bigquery_table.table.table_id}"]
}

The entry name (its full path) can be retrieved using data.external.data_catalog_lookup.result.name.
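
To try the lookup by hand before wiring it into Terraform, the argument string can be built in the shell first. This is a rough sketch: bq_table_sql_name is a hypothetical helper, and the project, dataset, and table names are placeholders. The gcloud call itself is left as a comment because it needs the gcloud CLI installed and authenticated.

```shell
#!/bin/sh
# Sketch only: build the SQL-style resource name that the external data
# source above passes to the lookup. bq_table_sql_name is a hypothetical
# helper; the arguments are project ID, dataset ID, and table ID.
bq_table_sql_name() {
  # The project ID is wrapped in backticks, as in the data source above.
  printf 'bigquery.table.`%s`.%s.%s\n' "$1" "$2" "$3"
}

name=$(bq_table_sql_name my-project my_dataset my_table)
echo "$name"

# With the gcloud CLI installed and authenticated, the same lookup would be:
#   gcloud data-catalog entries lookup "$name" --format 'json(name)'
```

The single quotes around the printf format keep the shell from treating the backticks as command substitution.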