elementary-data / elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
https://www.elementary-data.com/
Apache License 2.0

Updated gcs client to use `client.bucket` instead of `client.get_bucket` method #1739

Closed jcarpenter12 closed 3 days ago

jcarpenter12 commented 2 weeks ago

This pull request updates the GCS client to use the `bucket` method instead of the `get_bucket` method.

If the `get_bucket` method is used, the account that makes the request must have `storage.buckets.get` access on the project itself, which means two roles must be applied to the account for the IAM setup to work. It also means the account ends up with more access to the project than it technically needs.

This Stack Overflow post outlines the issue.

This was spotted when building and pushing elementary files through a CI pipeline. The `get_bucket` method can be used to check that a bucket exists, but since the elementary code never does this, it doesn't make sense to use it. It should be up to the user to make sure the bucket is available.

More details on this here

After switching to this method, the account only needs the `storage.objectAdmin` role on the bucket to write the files, rather than `storage.objectAdmin` plus another role that carries the `storage.buckets.get` permission.
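For illustration, here is a minimal sketch of the upload pattern this change relies on. It is not the actual elementary code: `upload_report` and its parameters are hypothetical names. The `client` argument is assumed to be a `google.cloud.storage.Client` (or anything with the same `bucket` / `blob` / `upload_from_filename` interface).

```python
from typing import Any


def upload_report(client: Any, bucket_name: str, blob_name: str, local_path: str) -> None:
    """Upload a local file to GCS without fetching bucket metadata first.

    Hypothetical helper for illustration; `client` is assumed to behave like
    google.cloud.storage.Client.
    """
    # client.bucket() only builds a local Bucket handle and sends no request,
    # so the storage.buckets.get permission is not exercised here.
    # client.get_bucket() would instead fetch the bucket's metadata first.
    bucket = client.bucket(bucket_name)

    # The upload is the only call in this helper that reaches the API.
    bucket.blob(blob_name).upload_from_filename(local_path)
```

With the real library this would be called as `upload_report(storage.Client(), "<bucket>", "<object name>", "<local file>")` after `from google.cloud import storage`. If the bucket does not exist, the error surfaces at upload time rather than at bucket lookup.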

I have not raised a bug for this as it is a one-line change, but I am happy to if required.