BigFlow is a Python framework for data processing pipelines on GCP.
The main features are:
Start by installing BigFlow on your local machine. Next, go through the BigFlow tutorial.
Prerequisites. Before you start, make sure you have the following software installed:
You can install the bigflow package globally, but we recommend installing it locally with venv, in your project's folder:
python -m venv .bigflow_env
source .bigflow_env/bin/activate
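To confirm that the environment is active before installing anything, one option is to ask the interpreter itself. The snippet below is a minimal sketch and not part of BigFlow; the helper name `in_virtualenv` is hypothetical. It relies on the fact that inside a venv `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtualenv(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Return True when the interpreter runs inside a venv.

    Hypothetical helper for illustration: inside a venv, sys.prefix points
    at the environment folder while sys.base_prefix points at the base
    Python installation.
    """
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment is active: " + sys.prefix)
    else:
        print("No virtual environment is active.")
```

Run it with the same `python` you intend to use for `pip install`; if it reports no active environment, activate `.bigflow_env` again.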
Install the bigflow pip package:
pip install "bigflow[bigquery,dataflow]"
Test it:
bigflow -h
Read more about the BigFlow CLI.
To interact with GCP you need to set a default project and log in:
gcloud config set project <your-gcp-project-id>
gcloud auth application-default login
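If you want to verify that the login step left credentials on disk, a small check like the following may help. It is a sketch under stated assumptions, not part of BigFlow or the Google Cloud SDK: the helper name `adc_path` is hypothetical, and the default location shown is the one used on Linux and macOS (Windows stores the file under `%APPDATA%\gcloud` instead).

```python
import os
from pathlib import Path


def adc_path(environ=os.environ, home=None):
    """Return the path where Application Default Credentials are expected.

    Hypothetical helper for illustration only.
    """
    # An explicit GOOGLE_APPLICATION_CREDENTIALS setting takes precedence.
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)
    # Assumed default location on Linux and macOS.
    home_dir = Path(home) if home is not None else Path.home()
    return home_dir / ".config" / "gcloud" / "application_default_credentials.json"


if __name__ == "__main__":
    path = adc_path()
    status = "found" if path.is_file() else "not found"
    print(f"Application Default Credentials {status}: {path}")
```

A missing file usually means `gcloud auth application-default login` has not completed for the current user.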
Finally, check that Docker is running:
docker info
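The commands used in this guide can also be checked in one pass. The script below is a minimal sketch using only the Python standard library; the function name `check_tools` and the tool list are assumptions made for illustration and are not part of BigFlow. It only confirms that each command is on your PATH, so `docker info` remains the way to tell whether the Docker daemon is actually running. On some systems the interpreter is installed as `python3` rather than `python`; adjust the list accordingly.

```python
import shutil

# Hypothetical list: the commands this guide invokes.
REQUIRED_TOOLS = ("python", "pip", "gcloud", "docker", "bigflow")


def check_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = check_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required commands were found on PATH.")
```

Run it inside the activated virtual environment so that `bigflow` and `pip` resolve to the ones installed there.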
You can ask questions on our Gitter channel or on Stack Overflow.