Open mimming opened 8 years ago
It looks like Dataflow for Python is still in alpha.
I'm going to wait until it's in beta before I move to it.
In the meantime, use this model to load the data in batches:
Panoptes unit -> Cloud Storage bucket unit_sensors -> Python script on a Compute Engine VM -> BigQuery
What we have now
Sensor data is dropped into Google Cloud Storage daily. A Python script picks it up and loads it into BigQuery.
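For reference, the batch step could look roughly like the sketch below. It is not the script that is actually running. It assumes the files are newline-delimited JSON stored under a per-day prefix in the unit_sensors bucket, and that the google-cloud-bigquery client library is installed. The object layout, the table ID, and the helper names (`daily_uri`, `load_day`, `main`) are all hypothetical.

```python
import datetime


def daily_uri(bucket: str, day: datetime.date) -> str:
    """Build a wildcard URI for one day's sensor files.

    The <bucket>/<YYYY-MM-DD>/*.json layout is an assumption, not the real one.
    """
    return f"gs://{bucket}/{day.isoformat()}/*.json"


def load_day(client, job_config, bucket: str, day: datetime.date, table_id: str) -> str:
    """Start a BigQuery load job for one day's files and wait for it to finish."""
    uri = daily_uri(bucket, day)
    job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    job.result()  # blocks until the load job completes; raises if it failed
    return uri


def main() -> None:
    # Imported here so the helpers above can be exercised without the library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    yesterday = datetime.date.today() - datetime.timedelta(days=1)
    # "my-project.panoptes.sensors" is a placeholder table ID.
    load_day(client, job_config, "unit_sensors", yesterday, "my-project.panoptes.sensors")
```

Calling `main()` once a day (for example from cron on the VM) would append the previous day's files to the table. Passing the client and job config in as arguments keeps `load_day` easy to test with a stub.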
Desired state
Create a Pub/Sub pipeline that accepts sensor data and follows this path: