cloudyr / googleCloudStorageR

Google Cloud Storage API to R
https://code.markedmondson.me/googleCloudStorageR

Authentication with Tidyverse default app instead of your own client.id #150

Closed abalter closed 2 years ago

abalter commented 3 years ago

I use bigrquery and can successfully authenticate with bq_auth(email="user@domain.com"). If it has been a while, or if I run bq_deauth(), a browser page opens where I log into my Google account.

When I try the same with gcs_auth(email="user@domain.com") the page that opens looks like this:

image

UPDATE: I tried running directly in a Windows cmd terminal, and this happens:

> library(googleCloudStorageR)
Set default bucket name to 'psjh-eacri'
> gcs_auth(email="ariel.balter@domain.org")
i 2021-08-02 20:06:49 > Setting client.id from GAR_CLIENT_JSON
Error: Can't find 'client_id' and 'client_secret' in the JSON
Run `rlang::last_error()` to see where the error occurred.

MarkEdmondson1234 commented 3 years ago

You need to set up a client ID, preferably via a client JSON file downloaded from your GCP project (see the website docs on auth setup).

I think bigrquery uses its own client ID, supplied with the library. It may be possible to reuse that, since both packages authenticate through the same library (gargle), but you would need to make sure it is using Cloud Storage scopes.
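
As a rough illustration of the client JSON route, here is a minimal sketch. It assumes the downloaded file has an "installed" (Desktop app) or "web" entry containing client_id and client_secret, which is what the error message above appears to be looking for. The helper name has_client_fields and the file path are hypothetical; GAR_CLIENT_JSON is the environment variable named in the log output above.

```r
# Returns TRUE if the file looks like an OAuth client JSON, i.e. it has an
# "installed" (Desktop app) or "web" entry holding client_id and client_secret.
# Assumption: this layout matches the file downloaded from the GCP console.
has_client_fields <- function(path) {
  if (!nzchar(path) || !file.exists(path)) return(FALSE)
  json <- jsonlite::fromJSON(path)
  info <- if (!is.null(json[["installed"]])) json[["installed"]] else json[["web"]]
  all(c("client_id", "client_secret") %in% names(info))
}

# Hypothetical path; replace with wherever the client JSON was saved
client_json <- path.expand("~/gcp/oauth-client.json")

if (has_client_fields(client_json)) {
  # Point the package at the client file, then authenticate in the browser
  Sys.setenv(GAR_CLIENT_JSON = client_json)
  googleCloudStorageR::gcs_auth(email = "user@domain.com")
} else {
  message("No usable OAuth client JSON at ", client_json)
}
```

A service account key file would fail this check, since it has no "installed" or "web" entry. For regular use the variable would more likely be set once in .Renviron rather than with Sys.setenv() in each session.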

abalter commented 3 years ago

I do not have permissions to create the json file given our company policy.

Is there a technical reason why it is not possible to use the same type of auth system that bigrquery does, or is it a philosophical decision?

MarkEdmondson1234 commented 3 years ago

It's more that it's a GCP project I don't administer, so this may not be its intended purpose. @jennybc may have an opinion on using the Tidyverse app outside of its intended use.

Any GCP project would do; you can make one with your own email outside of company policy if necessary.

But you can also use the token argument to reuse tokens, and bigrquery does authenticate with scopes that work with Cloud Storage, so this should work for you too:

library(bigrquery)
library(googleCloudStorageR)

# get an instance of the tidyverse app token
token <- bq_token(email="your@email.com")

# use tidyverse token for GCS
gcs_auth(token=token$auth_token)

# test it listing your GCS buckets
gcs_list_buckets("your-project-id")

abalter commented 3 years ago

I'm not sure you are on the right track (or that your name change is appropriate).

Consider this table

| Package | Purpose | Authentication Function | Authentication Methods | Notes |
|---|---|---|---|---|
| cloudStorageR | R Interface to Google Cloud Storage | gcs_auth, example: gcs_auth(email="a@b.c") | OAuth2, Token, JSON, Web Authentication (authentication page opens) | Web auth does NOT work |
| bigrquery | R Interface to Google BigQuery | bq_auth, example: bq_auth(email="a@b.c") | OAuth2, Token, JSON, Web Authentication (authentication page opens) | Web auth works |

MarkEdmondson1234 commented 3 years ago

Web auth works when you supply a client.id. If you haven't already, review the setup documentation.

If you run the code above, what happens?

P.S. It's googleCloudStorageR, not cloudStorageR.

abalter commented 3 years ago

> gcs_setup()
i ==Welcome to googleCloudStorageR v0.6.0 setup==
This wizard will scan your system for setup options and help you with any that are missing. 
Hit 0 or ESC to cancel. 

1: Create and download JSON service account key
2: Setup auto-authentication (JSON service account key)
3: Setup default bucket

Selection: 1
-----------------------------------------------------------------------------------------------
Do you want to configure for all R sessions or just this project? 

1: All R sessions (Recommended)
2: Project only

Selection: 1
-----------------------------------------------------------------------------------------------
i Could not find a OAuth 2.0 Client ID via GAR_CLIENT_JSON
Error in loadNamespace(x) : there is no package called ‘usethis’
> gcs_setup()
i ==Welcome to googleCloudStorageR v0.6.0 setup==
This wizard will scan your system for setup options and help you with any that are missing. 
Hit 0 or ESC to cancel. 

1: Create and download JSON service account key
2: Setup auto-authentication (JSON service account key)
3: Setup default bucket

Selection: 2
-----------------------------------------------------------------------------------------------
Do you want to configure for all R sessions or just this project? 

1: All R sessions (Recommended)
2: Project only

Selection: 1
-----------------------------------------------------------------------------------------------
x No environment argument detected: GCS_AUTH_FILE
i Could not find a OAuth 2.0 Client ID via GAR_CLIENT_JSON
Error in loadNamespace(x) : there is no package called ‘usethis’

After installing usethis:

> gcs_setup()
i ==Welcome to googleCloudStorageR v0.6.0 setup==
This wizard will scan your system for setup options and help you with any that are missing. 
Hit 0 or ESC to cancel. 

1: Create and download JSON service account key
2: Setup auto-authentication (JSON service account key)
3: Setup default bucket

Selection: 1
-----------------------------------------------------------------------------------------------
Do you want to configure for all R sessions or just this project? 

1: All R sessions (Recommended)
2: Project only

Selection: 1
-----------------------------------------------------------------------------------------------
i Could not find a OAuth 2.0 Client ID via GAR_CLIENT_JSON
Have you downloaded a Client ID file?

1: Yes
2: No

Selection: 2
! You must have a client ID file to proceed.
i Download via https://console.cloud.google.com/apis/credentials/oauthclient :
* Desktop app > Name > Create >
* OAuth 2.0 Client IDs >
* Click Download Arrow to the right >
* Download to your computer
-----------------------------------------------------------------------------------------------
i Rerun this wizard once you have your Client ID file
Open up service credentials URL?

1: No way
2: Yup
3: Nope

Selection: 3
x Need a clientId to be set before configuring further
[1] FALSE
> gcs_setup()
i ==Welcome to googleCloudStorageR v0.6.0 setup==
This wizard will scan your system for setup options and help you with any that are missing. 
Hit 0 or ESC to cancel. 

1: Create and download JSON service account key
2: Setup auto-authentication (JSON service account key)
3: Setup default bucket

Selection: 2
-----------------------------------------------------------------------------------------------
Do you want to configure for all R sessions or just this project? 

1: All R sessions (Recommended)
2: Project only

Selection: 1
-----------------------------------------------------------------------------------------------
x No environment argument detected: GCS_AUTH_FILE
i Could not find a OAuth 2.0 Client ID via GAR_CLIENT_JSON
Have you downloaded a Client ID file?

1: No
2: Yes

Selection: 1
! You must have a client ID file to proceed.
i Download via https://console.cloud.google.com/apis/credentials/oauthclient :
* Desktop app > Name > Create >
* OAuth 2.0 Client IDs >
* Click Download Arrow to the right >
* Download to your computer
-----------------------------------------------------------------------------------------------
i Rerun this wizard once you have your Client ID file
Open up service credentials URL?

1: I agree
2: No
3: Not now

Selection: 3
x Need a clientId to be set before configuring further
[1] FALSE

For comparison:

> library(bigrquery)
> bq_deauth()
> bq_auth()
Is it OK to cache OAuth access credentials in the folder
C:/Users/ariel/AppData/Local/gargle/gargle/Cache between R sessions?

1: Yes
2: No

Selection: 1
Waiting for authentication in browser...
Press Esc/Ctrl + C to abort

image

abalter commented 3 years ago

Oh, I see now. Is that what you were talking about, that Tidyverse has a Google authentication API? It sounds like this package doesn't use it.

So maybe this is actually a feature request?

MarkEdmondson1234 commented 3 years ago

The issue is being confused here :)

Traditionally you use your own client ID with your own GCP project. That is described on the website, and it is the route you are having issues with above.

But the original issue was opened to ask about using the tidyverse's pre-created client ID that comes with the bigrquery package. That lets you authenticate once and use the same token for both bigrquery and googleCloudStorageR.

To do that, run the code I put in the reply above, which takes the bigrquery token and puts it into the Cloud Storage auth environment as well. E.g. this code:

library(bigrquery)
library(googleCloudStorageR)

# get an instance of the tidyverse app token
token <- bq_token(email="your@email.com")

# use tidyverse token for GCS
gcs_auth(token=token$auth_token)

# test it listing your GCS buckets
gcs_list_buckets("your-project-id")

jennybc commented 3 years ago

The appropriate use of the tidyverse OAuth client is described in our privacy policy, which is also linked whenever one does auth in the browser via this client:

https://www.tidyverse.org/google_privacy_policy/

So, yes, it is only meant to be used directly by bigrquery. What @MarkEdmondson1234 suggests above re: using a token you obtained for bigrquery with googleCloudStorageR looks OK, and it's certainly not anything we'd be able to regulate anyway.

> I do not have permissions to create the json file given our company policy.

👆 is pretty surprising to me and seems the exact opposite of most IT policies I've ever encountered, so I wonder if maybe there was some misunderstanding along the way? Usually a company is uncomfortable with an employee authorizing a third-party OAuth client to act on their behalf and, in fact, insists on using a client created within a company-controlled GCP project. It might be worth revisiting this point, because it does seem like the most logical thing is for @abalter to get their own OAuth client.

MarkEdmondson1234 commented 3 years ago

Thanks Jenny. Perhaps an unanswered question here is why I do not include a client.id with the package that you can use, as bigrquery does. I do this for other packages such as searchConsoleR and googleAnalyticsR to make the auth process a bit easier. But for APIs that can cost you money, and given the extra certification Google requires for those APIs, I don't have the resources to include a client.id for APIs such as Cloud Storage. I also think it's better to use your own anyway, and since you must have a GCP project for your bucket, it's not a big step. I agree that, as a company, I would prefer users to make their own client.id rather than go through another company's.

jennybc commented 3 years ago

Yeah, a client that other people can use is very difficult to obtain and is a state I regard as permanently tenuous, i.e. it could simply become untenable in the long run. So I understand why you don't do so here and agree that it's natural to expect the user to obtain a client ID in this context.