nicholasyager / dbt-loom

A dbt-core plugin to weave together multi-project dbt-core deployments
The Unlicense
104 stars 19 forks source link

support short lived credentials for GCP #33

Closed geoHeil closed 6 months ago

geoHeil commented 6 months ago

support log-in with short lived credentials i.e. if google cloud cli is already authenticated or workload identity is enabled

This is a gpt4 contribution

https://chat.openai.com/share/d2788537-909e-45a1-bd4e-1ac082dd0523

we are still testing the corner cases.

nicholasyager commented 6 months ago

Hi @geoHeil 👋🏻 I'm about 90% sure that this functionality already exists as defined.

Rationale:

So, I'm not quite sure what I'm missing here. Can you please provide more information about what drove this PR to be created? I have a suspicion that I may need to update docs and add more useful exceptions.

geoHeil commented 6 months ago

I might be overlooking something - but as far as I was aware:

The google.cloud.storage.Client.from_service_account_json method in the Google Cloud client library for Python is specifically designed to create a client object for interacting with Google Cloud Storage using credentials provided in a JSON file. This method does not inherently provide a fallback to short-lived credentials if the specified service account JSON file is not available or fails; it expects a valid path to a service account JSON file containing the credentials.

Normally short-lived credentials most of the time would be instantiated like:

from google.cloud import storage
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_info(credentials_dict)
client = storage.Client(project='project_id', credentials=credentials)

when running

>>> import inspect
>>> from google.cloud import storage
>>> print(inspect.getsource(storage.Client.from_service_account_json))
    @classmethod
    def from_service_account_json(cls, json_credentials_path, *args, **kwargs):
        """Factory to retrieve JSON credentials while creating client.

        :type json_credentials_path: str
        :param json_credentials_path: The path to a private key file (this file
                                      was given to you when you created the
                                      service account). This file must contain
                                      a JSON object with a private key and
                                      other credentials information (downloaded
                                      from the Google APIs console).

        :type args: tuple
        :param args: Remaining positional arguments to pass to constructor.

        :param kwargs: Remaining keyword arguments to pass to constructor.

        :rtype: :class:`_ClientFactoryMixin`
        :returns: The client created with the retrieved JSON credentials.
        :raises TypeError: if there is a conflict with the kwargs
                 and the credentials created by the factory.
        """
        with io.open(json_credentials_path, "r", encoding="utf-8") as json_fi:
            credentials_info = json.load(json_fi)

        return cls.from_service_account_info(credentials_info, *args, **kwargs)
nicholasyager commented 6 months ago

Thanks for providing this context, @geoHeil.

As of this evening, I've confirmed that the GCS client is performing authentication as expected:

I've also been able to confirm that the existing auth behavior is functionally identical to the new behavior under the hood in Google's libraries: both approaches deserialize JSON from a path without a fallback. That's specifically why dbt-loom has a built-in fallback, so If a credential is not provided, dbt-loom uses ADC. So, I'm at a little bit of a loss whether this PR has actually solved the issue you were looking to resolve 😅

Without a traceback or logs of an explicit failure, I'm inclined to think that GPT-4 is hallucinating a feature gap that may not exist.

geoHeil commented 6 months ago

I think then this is an issue of limited documentation - and let`s close the PR for now. It might be great if this feature could be more documented.