cloudyr / googleCloudStorageR

Google Cloud Storage API to R
https://code.markedmondson.me/googleCloudStorageR

authentication with shinyapps.io #168

Closed rafaxalv closed 2 years ago

rafaxalv commented 2 years ago

Hi,

I have an app hosted with shinyapps.io. I would like to use it to update the data I have in my Google Storage bucket. I have an OAuth file with type 'computer'. Offline everything works great; here is what I have:

    options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")
    client_secret <- paste0(basedir_aux, 'client_secret2.json')
    googleAuthR::gar_set_client(client_secret)
    googleCloudStorageR::gcs_auth(email = 'mail.com')
    gcs_global_bucket("my-bucket")

I have passed my email to the gcs_auth() function to prevent prompts, but it seems this does not solve my problem. I get the following error: Could not authenticate via any gargle cred function

I traced it, and the error comes from this line:

    googleCloudStorageR::gcs_auth(email = 'mail.com')

What should I do here?

MarkEdmondson1234 commented 2 years ago

The email parameter isn't appropriate here, and it's not a valid email address anyhow. You need to authenticate via a json service key that is uploaded with your app.
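For example, bundle the key file in the app directory and point gcs_auth() at it (the file name here is a placeholder):

    # shinyapps.io deploys everything in the app folder, so a key placed
    # next to app.R is available at a relative path on the server
    googleCloudStorageR::gcs_auth("service_key.json")

Alternatively, set the GCS_AUTH_FILE environment variable to the key's path (e.g. in an .Renviron deployed with the app) and the package will authenticate automatically when it loads.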

rafaxalv commented 2 years ago

Hi Mark,

Yes, I figured that out. I'm using the service key now, and I also had to allow all users to access the bucket.

    service_secret <- paste0(basedir_aux, 'service_secret.json')
    googleAuthR::gar_set_client(client_secret)
    options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/devstorage.full_control")
    googleCloudStorageR::gcs_auth(json_file = service_secret)

This seems to work! However, I now notice the file update does not always work. I need to figure that out.

MarkEdmondson1234 commented 2 years ago

Great. You also shouldn't need to set the client, as it is implied in the service key; see the sketch below. Is the file update related to listing GCS objects?
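For reference, something like this should be all that's needed (a sketch keeping your variable names):

    # the service key JSON already carries the client and project details
    service_secret <- paste0(basedir_aux, 'service_secret.json')
    googleCloudStorageR::gcs_auth(json_file = service_secret)
    googleCloudStorageR::gcs_global_bucket("my-bucket")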

rafaxalv commented 2 years ago

I'm basically saving RData files.

I save the RData to disk and then pass the path to gcs_upload.

    path <- paste0(....,".RData")
    save(...,  file = path)
    gcs_upload(file = path, name =...)

It throws no errors. However, whenever I load the file again I don't see the updates. Sometimes the updates appear a few minutes later... I'm not sure, maybe Google is taking long to update the data in the bucket?
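The load side is roughly this (simplified, with placeholder names):

    # download the .RData file to disk, then load() it into the session
    path <- "local-copy.RData"
    googleCloudStorageR::gcs_get_object("my-file.RData",
                                        saveToDisk = path, overwrite = TRUE)
    load(path)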

MarkEdmondson1234 commented 2 years ago

See also gcs_save() for your use case.
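A minimal sketch (object and bucket names are placeholders):

    library(googleCloudStorageR)
    # writes the objects straight to an .RData file in the bucket,
    # skipping the manual save() + gcs_upload() round trip
    gcs_save(mydata, file = "mydata.RData", bucket = "my-bucket")
    # downloads the file and load()s it back into the global environment
    gcs_load("mydata.RData", bucket = "my-bucket")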

If the upload completes it should be consistent for download again, so I'm not sure what's going on there. Save/load can take a while for big objects (in the GB/TB range).

You could try checking the object metadata to see what generation it's on, and only download once it has incremented. But in my use it's always updated as soon as it's uploaded; it's not an eventually consistent data store, so I guess something else is going on.
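Something like this (untested):

    # meta = TRUE returns the object's metadata instead of its content
    meta <- googleCloudStorageR::gcs_get_object("my-file.RData", meta = TRUE)
    meta$generation  # changes each time the object is overwritten
    meta$updated     # last modification timestamp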

rafaxalv commented 2 years ago

I just tried with gcs_save(). No help.

I'm not sure what is happening; everything seems to upload and load as expected. I even added some sanity checks to always remove the old RData as soon as it's updated in the bucket or after it's loaded into the environment. I also added logic to check that the file exists before saving, along the lines of while(!exist) { Sys.sleep() } (see the sketch below).
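Roughly like this (simplified, with a placeholder object name):

    # poll the bucket listing until the uploaded file appears
    while (!"my-file.RData" %in% googleCloudStorageR::gcs_list_objects()$name) {
      Sys.sleep(1)
    }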

My files are very small, the largest being about 60 KB. But it really seems like I'm only able to load the changes about 5-10 minutes later.

rafaxalv commented 2 years ago

Maybe it's connected to this? https://stackoverflow.com/questions/62897641/google-cloud-storage-public-object-url-e-super-slow-updating or this: https://stackoverflow.com/questions/41403673/google-cloud-storage-file-stuck-in-time-after-multiple-updates-deletions

Is there any place I can set the Cache-Control header to "max-age=0"?

MarkEdmondson1234 commented 2 years ago

Caching usually applies to public web resources, so I don't think it's related here.
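That said, if you did want to set Cache-Control on an upload, I believe you can pass it via the object metadata (an untested sketch):

    meta <- googleCloudStorageR::gcs_metadata_object(
      "my-file.RData",
      cacheControl = "no-cache, max-age=0"
    )
    googleCloudStorageR::gcs_upload(file = "my-file.RData",
                                    name = "my-file.RData",
                                    object_metadata = meta)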

I suggest making a new issue with some reproducible code showing the problem, which will help narrow down what's happening.

rafaxalv commented 2 years ago

Sure, we can close the thread. One thing I fixed: the service credential comes with an email address, and we need to grant that email permission to read and write in the bucket. Once this was done, I managed to bring the bucket back to non-public status.