GoogleCloudPlatform / professional-services-data-validator

Utility to compare data between homogeneous or heterogeneous environments to ensure source and target tables match
Apache License 2.0
399 stars 114 forks source link

Error while using Secret Manager for GCP Bigquery Connection. #1159

Closed moizarafat closed 4 months ago

moizarafat commented 4 months ago

Hello,

I defined a BQ Connection as below:

BQ_CONN = {"source_type": "BigQuery", "project_id": <project_id>, "secret_id": "<id>", "secret_manager_type": "gcp", "secret_manager_project_id": <project_id>}

getting an error as below

data_validation.exceptions.DataClientConnectionFailure: Connection Type "BigQuery" could not connect: get_bigquery_client() got an unexpected keyword argument 'secret_id'

Please advise on how to pass the secret_id to look for the credentials for the job.

Thanks

nehanene15 commented 4 months ago

Here is the documentation.

You don't need to specify 'secret_id'. By specifying the secret_manager_type and secret_manager_project_id, DVT will grab the secret for each of the remaining flags by the name provided.

For example, given this connection:

BQ_CONN = {"source_type": "BigQuery", "project_id": "dvt-project", "secret_manager_type": "gcp", "secret_manager_project_id": "my-project"}

DVT will search in "my-project" for a secret named "dvt-project" and will use that value as the project_id. Let me know if you have any questions.

moizarafat commented 4 months ago

Hi @nehanene15 Thanks for the reply. I tried your recommendation but it gave me error that its unable to submit bigquery jobs. The account whose credentials i included in the json file definitely has the BQ permissions so I believe it did not take my credentials from secret manager but instead use the service account for composer/airflow. Based on this logic https://github.com/GoogleCloudPlatform/professional-services-data-validator/blob/e3fe3d184ddb4ffecddf83f5a87e681d787ea9db/data_validation/clients.py#L226 should i be passing this configuration

BQ_CONN = {"source_type": "BigQuery", "project_id": "dvt-project", "secret_manager_type": "gcp", "secret_manager_project_id": "my-project", "credentials": "<credential name>"} ?

nehanene15 commented 4 months ago

The same logic applies to the credentials. The BQ connection flags are listed here.

BQ_CONN = {"source_type": "BigQuery", "project_id": "dvt-project", "secret_manager_type": "gcp", "secret_manager_project_id": "my-project", "google_service_account_key_path": "dvt-sa-key-path"}

For the above to work, you should have two secrets in secret manager within project 'my-project':

moizarafat commented 4 months ago

SA keypath is not an option for us. How can i get the secret manager to work with BQ? Like i mentioned it gave me an error which means it did not pass the credentials from secret manager to BQ connection.

Thanks

moizarafat commented 4 months ago

Solved by passing credentials as recommended in https://github.com/GoogleCloudPlatform/professional-services-data-validator/issues/768

Thanks