OriHoch / dppctl

Serverless data pipelines and related infrastructure using Python and Kubernetes
MIT License

shadow Google's "Using Spark on Kubernetes Engine to Process Data in BigQuery" using dppctl #1

Open OriHoch opened 6 years ago

OriHoch commented 6 years ago

This guide shows how to implement an end-to-end pipeline that analyzes GitHub data to find the projects that would benefit most from contributions.

It uses Spark and BigQuery; we should provide a simpler version that does the same using dppctl.

OriHoch commented 6 years ago

[image]