open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

BigQuery Multiple Project IDs not working with ADC authentication #18346

Open olof-nn opened 2 weeks ago

olof-nn commented 2 weeks ago

Is your feature request related to a problem? Please describe.

We are running OpenMetadata on GCP GKE, using WIF and ADC from the Pod's service account. We want to ingest metadata from BigQuery (among other sources) using MultipleProjectIds so that we can have cross-project lineage.

This is currently possible only if you authenticate without ADC, by downloading a service account key and providing it in the auth configuration in OpenMetadata. Google considers this bad practice (as it is in general).

Describe the solution you'd like

This happens because of the following code:

@staticmethod
def set_project_id() -> List[str]:
    _, project_ids = auth.default()
    return project_ids if isinstance(project_ids, list) else [project_ids]

auth.default() will only return the project ID associated with the active ADC and not a list of project_ids.
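
To illustrate the behaviour described above, here is a minimal sketch that does not call Google at all. `fake_auth_default` is a hypothetical stand-in that mimics the shape of what `auth.default()` returns (a credentials object plus one project ID string); the project names are made up.

```python
from typing import Callable, List, Optional, Tuple


def fake_auth_default() -> Tuple[object, Optional[str]]:
    """Hypothetical stand-in for auth.default().

    Returns a (credentials, project_id) pair where project_id is a
    single string, regardless of what the user configured.
    """
    return object(), "project-a"


def set_project_id(
    auth_default: Callable[[], Tuple[object, Optional[str]]] = fake_auth_default,
) -> List[str]:
    # Same logic as the snippet quoted in this issue, with the auth call
    # injected so the sketch can run without GCP credentials.
    _, project_ids = auth_default()
    return project_ids if isinstance(project_ids, list) else [project_ids]


# The user asked for three projects, but that input never reaches the function.
user_configured = ["project-a", "project-b", "project-c"]
result = set_project_id()
print(result)
```

Because the ADC call yields a single string, the `isinstance(..., list)` branch is never taken and the result always has exactly one element.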

This works when a service account key is provided instead because you populate the project ID from the user-provided configuration in this function:

"project_id": gcp_values.projectId.root

Thus the returned project ID will be a list if that is what the user entered.

A better solution might be for set_project_id() to list the project IDs that the Google credentials are entitled to access and match them against the user-provided project IDs.
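
The matching step proposed above could look roughly like the sketch below. It is only an illustration under stated assumptions: `resolve_project_ids` is a hypothetical helper, not existing OpenMetadata code, and the list of accessible projects is passed in as plain data, since how to fetch it with ADC (for example via a GCP project-listing API) is left open here.

```python
from typing import Iterable, List, Optional


def resolve_project_ids(
    configured: Optional[Iterable[str]],
    accessible: Iterable[str],
    default_project: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: pick the project IDs to ingest.

    configured      -- project IDs entered by the user (may be empty or None)
    accessible      -- project IDs the credentials can access (assumed to be
                       fetched elsewhere)
    default_project -- the single project ID reported by ADC, if any
    """
    configured_list = list(configured or [])
    if not configured_list:
        # Nothing configured: fall back to the single ADC project.
        return [default_project] if default_project else []

    accessible_set = set(accessible)
    matched: List[str] = []
    for project_id in configured_list:
        # Keep the user's order, skip duplicates and inaccessible projects.
        if project_id in accessible_set and project_id not in matched:
            matched.append(project_id)
    return matched


print(resolve_project_ids(["proj-a", "proj-b", "proj-x"], ["proj-b", "proj-a", "proj-c"]))
```

Keeping the user's ordering and dropping inaccessible IDs silently is one possible design choice; raising an error for IDs the credentials cannot access would be an equally reasonable alternative.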

Describe alternatives you've considered

Downloading a service account key, which we don't want to do.

Additional context

None

DovileKr commented 1 week ago

Bare Metal with Ingestion Docker Container 1.5.4. To confirm: only the first project is used here as well, where GCP Credential Values is selected and Credential Type is service_account. I am now manually switching between the first, second, and third project in the UI to capture everything under the same "service":

image

harshach commented 1 week ago

@ayush-shah check this out