Closed · mta614 closed this 2 years ago
There is an example in the polyglot demo that downloads GA data with a Go library, which may help you see the structure. Since each step can run in its own environment, you can mix and match.
You need a Docker environment with googleAnalyticsR installed that the build can use. I would recommend creating a dedicated service account with only read-only access to the GA account you want to download from, then uploading its key to Secret Manager. In a build step before your script, download that service key file and point the GA_AUTH environment argument at it. Then you can run your script using the key just as you do locally.
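A rough sketch of what that build could look like with googleCloudRunner. The secret name (`ga-service-key`), script name (`etl.R`), and container image are placeholders for your own setup, and the exact arguments may vary by package version:

```r
library(googleCloudRunner)

steps <- c(
  # Pull the decrypted service account key from Secret Manager
  # into the shared /workspace of the build
  cr_buildstep_secret("ga-service-key", decrypted = "auth.json"),
  # Run the R script in an image with googleAnalyticsR installed,
  # pointing GA_AUTH at the downloaded key
  cr_buildstep_r(
    "etl.R",
    name = "gcr.io/gcer-public/googleanalyticsr:master",  # assumed image
    env  = "GA_AUTH=/workspace/auth.json"
  )
)

build <- cr_build_yaml(steps = steps)
# cr_build(build)  # submit the build to Cloud Build
```

Inside `etl.R`, googleAnalyticsR picks up GA_AUTH and authenticates non-interactively.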
Hmm, this may be a naive question (I'm pretty new to Cloud), but my issue is that I'm not able to create a service account with read-only access to GA (or any kind of access, actually). It simply doesn't appear to be an option in Google Cloud when adjusting roles (unlike, say, setting a Storage Admin role, which allows for writing/reading to/from GCS buckets).
You don't need to assign it any roles. You will be doing that in effect when you add its email as a user to GA.
So I managed this (as you said, it was a matter of adding the service account to GA) and using Arben Kqiku's guide you recommended to get my pipeline running on Google Cloud.
One thing that I am currently doing that is terrible practice: I have my secret JSON in my Docker container, although I think I should be referencing it via Secret Manager. Or possibly it should be referenced in my script via Secret Manager AND accounted for when I make my Cloud Build; I'm definitely unclear on exactly what's best to do.
No chance there's an example or guide to illustrate how to do this?
The polyglot use case makes use of Secret Manager for the auth key, downloading it to the build workspace. That is better but not perfect; ideal would be a call from R itself, which I'll work on in the future.
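For reference, the download-to-workspace approach boils down to a build step fragment like this (the secret name `ga-service-key` is a placeholder for whatever you created in Secret Manager):

```shell
# Run inside a build step before the R script, so the key lands
# in the /workspace directory shared across Cloud Build steps
gcloud secrets versions access latest \
  --secret=ga-service-key > /workspace/auth.json
```

The key then never lives in the Docker image itself, only in the ephemeral build workspace.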
I'm not totally sure if this is the right place for this question, but I'm in the process of taking an ETL I built in R and making it so it can run automatically on Google Cloud. So far, googleCloudRunner seems very promising for achieving this goal, but I have hit a bit of a snag:
My ETL needs to use both GCS via googleCloudStorageR and GA via googleAnalyticsR. Using the setup tutorial, granting permissions to the service account I created via cr_setup() was pretty trivial, although I am getting this minor issue:
Error: API returned: Cannot insert legacy ACL for an object when uniform bucket-level access is enabled. Read more at https://cloud.google.com/storage/docs/uniform-bucket-level-access
I think newer versions of the above packages on GitHub may solve that problem, but I haven't looked too deeply into it yet.
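For what it's worth, more recent googleCloudStorageR exposes a `predefinedAcl` argument that skips per-object ACLs, which is what the uniform bucket-level access error is complaining about. A sketch, with `"my-bucket"` as a placeholder:

```r
library(googleCloudStorageR)

# With uniform bucket-level access enabled, don't send a legacy
# per-object ACL; permissions come from the bucket's IAM policy
gcs_upload(mtcars,
           bucket = "my-bucket",
           name = "mtcars.csv",
           predefinedAcl = "bucketLevel")
```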
A much larger problem is attempting to query data in GA. Previously, I was using a method like this to set scopes:
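(The original snippet didn't survive here, but the common local pattern for setting scopes with googleAnalyticsR looks something like this; a sketch, not necessarily the exact code that was posted:)

```r
library(googleAnalyticsR)

# Request only the read-only Analytics scope before interactive auth
options(googleAuthR.scopes.selected =
          "https://www.googleapis.com/auth/analytics.readonly")
ga_auth()  # opens a browser for the OAuth consent flow
```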
which was working great. However, I'm a bit at a loss as to how to do the same with the service account I've set up. As far as I can tell, there isn't a way to set scopes with the functions provided. I also went into IAM to see if there was a role that mapped to https://www.googleapis.com/auth/analytics.readonly, but I haven't been able to find anything.
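As far as I understand, with a service account key the scope is passed directly at authentication time via googleAuthR rather than through an IAM role. A sketch, with `"auth.json"` as a placeholder path to the downloaded key:

```r
library(googleAuthR)

# Authenticate with the service account key, requesting the
# Analytics read-only scope (no IAM role needed; GA access comes
# from adding the service account's email as a GA user)
gar_auth_service(
  json_file = "auth.json",
  scope = "https://www.googleapis.com/auth/analytics.readonly"
)

library(googleAnalyticsR)
# ga_account_list()  # verify the account is visible
```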
I feel like I'm missing something fundamental here because I imagine querying data via GA is a super common use case. What am I missing?