brucemcpherson / desktopliberation

hosting for desktop liberation google plus community

Workload identity with Kubernetes cronjobs to synch Mongo to Bigquery #161

Open brucemcpherson opened 7 months ago

brucemcpherson commented 7 months ago

Here's the article: https://ramblings.mcpher.com/gcp/workload-identity-bigquery-mongo/

Kubernetes workload identity looks pretty scary when you read about it in the docs, but it really is a better (and simpler) way to give specific permissions to Kubernetes workloads than less secure methods such as service account keys. I had a specific use case in mind: getting a set of collections from MongoDB to BigQuery on a regular schedule. Since I’m running Kube in that project anyway, a Kube cronjob seemed a reasonable solution.

Maybe you’re not using Kubernetes at all and just want to transfer data from MongoDB to BigQuery. I’ll show you how to run those parts of the article locally too.

Even if that doesn’t match your exact end-to-end use case, there should be something here for anyone who wants to work with any of the topics covered in the (long) journey this article takes.

Here’s a summary of the main topics:

- Cloud Build to create images to run on Kubernetes
- Cloud builder container images: using Google-maintained prebuilt images
- Artifact Registry for serving your build images
- GCP service accounts versus Kubernetes service accounts
- IAM policy binding
- Kubernetes workload identity federation
- Kubernetes jobs and cronjobs
- bq for BigQuery
- gsutil to move data to Cloud Storage
- yq and jq for manipulating YAML and JSON files
- mongoexport to get data out of MongoDB
- Doppler and Kubernetes Secrets to manage credentials
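
To give a feel for the service account and IAM policy binding topics, here is a minimal sketch of how a Kubernetes service account is typically linked to a GCP service account on GKE. It is not taken from the article: the project, namespace, and account names are hypothetical placeholders, and the script defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only: link a Kubernetes service account (KSA) to a GCP service
# account (GSA) through workload identity. All names are hypothetical
# placeholders; override them through the environment.
set -eu

PROJECT_ID="${PROJECT_ID:-my-project}"
NAMESPACE="${NAMESPACE:-default}"
KSA="${KSA:-mongo-sync-ksa}"
GSA="${GSA:-mongo-sync-gsa}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands, anything else = run them

GSA_EMAIL="${GSA}@${PROJECT_ID}.iam.gserviceaccount.com"
# Workload identity members take the form PROJECT.svc.id.goog[NAMESPACE/KSA]
WI_MEMBER="serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${KSA}]"

# Print the command in dry-run mode, otherwise execute it.
# Note: the printed form drops shell quoting, so it is for reading, not pasting.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# 1. Create the Kubernetes service account the cronjob will run as.
run kubectl create serviceaccount "$KSA" --namespace "$NAMESPACE"

# 2. Allow that KSA to impersonate the GCP service account.
run gcloud iam service-accounts add-iam-policy-binding "$GSA_EMAIL" \
  --role roles/iam.workloadIdentityUser \
  --member "$WI_MEMBER"

# 3. Annotate the KSA so GKE knows which GSA it maps to.
run kubectl annotate serviceaccount "$KSA" \
  --namespace "$NAMESPACE" \
  "iam.gke.io/gcp-service-account=${GSA_EMAIL}"
```

Running it with `DRY_RUN=0` would execute the commands against whichever cluster and project your local `kubectl` and `gcloud` are pointed at. The GCP service account itself still needs whatever BigQuery and Cloud Storage roles the job requires, which this sketch leaves out.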