Open chapmanjacobd opened 1 year ago
After reading through more code I think I get it now
https://github.com/GoogleCloudPlatform/covid-19-open-data/blob/e2f6c1c0840fa1dc301ed798f6a624781b453c19/src/pipelines/mobility/google_mobility.py
https://github.com/GoogleCloudPlatform/covid-19-open-data/blob/15e2bdd4b1c7a523a74f42b3ada89f3686dbc882/src/pipelines/mobility/config.yaml
"Global_Mobility_Report.csv" is a source dataset which joins with other data, via knowledge_graph.csv, to create "mobility.csv" and "aggregates.csv"
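As a rough illustration of what "joins via knowledge_graph.csv" could look like, here is a minimal sketch. It is not the pipeline's actual code: the tiny inline tables, the column names (place_id, key, date, mobility_retail), and the choice to drop unmatched rows are all assumptions made for the example.

```python
import csv
import io

# Hypothetical miniature stand-ins for the real files. The column names
# and values are assumptions for illustration only.
SOURCE_CSV = """place_id,date,mobility_retail
ChIJ_A,2020-03-01,-5
ChIJ_B,2020-03-01,-12
ChIJ_X,2020-03-01,-3
"""

GRAPH_CSV = """key,place_id
US_CA,ChIJ_A
US_NY,ChIJ_B
"""


def join_on_place_id(source_text, graph_text):
    """Inner-join source rows to knowledge-graph keys via place_id."""
    # Build a lookup from place_id to the graph's location key.
    graph = {
        row["place_id"]: row["key"]
        for row in csv.DictReader(io.StringIO(graph_text))
    }
    joined = []
    for row in csv.DictReader(io.StringIO(source_text)):
        key = graph.get(row["place_id"])
        if key is None:
            # Assumption: rows without a match in the graph are dropped.
            continue
        joined.append(
            {
                "key": key,
                "date": row["date"],
                "mobility_retail": row["mobility_retail"],
            }
        )
    return joined


rows = join_on_place_id(SOURCE_CSV, GRAPH_CSV)
for row in rows:
    print(row)
```

If the real join behaves like this inner join, any place_id in the source report that is missing from the knowledge graph would not appear in the output, which is one way the two files could end up with different sets of place_id values.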
Good day,
I'm trying to understand the context of place_id in various files. I know that place_id is just an identifier, but I have encountered some puzzling things. Before I dive deep into my questions, I will start light by asserting my beliefs about the data and how it is joined together. If any of these beliefs are incorrect, please correct them:
place_id values in mobility.csv are expected to be found in aggregated.csv
place_id values in aggregated.csv are expected to be found in mobility.csv
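One way to test both beliefs is to compare the sets of place_id values in the two files. The sketch below uses small inline tables in place of the real downloads; the sample rows and the columns other than place_id are made up for illustration.

```python
import csv
import io


def place_ids(csv_text, column="place_id"):
    """Collect the distinct non-empty values of one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader if row.get(column)}


# Hypothetical samples standing in for mobility.csv and aggregated.csv.
MOBILITY_CSV = """place_id,date,mobility_retail
ChIJ_A,2020-03-01,-5
ChIJ_B,2020-03-01,-12
"""

AGGREGATED_CSV = """place_id,date,new_confirmed
ChIJ_A,2020-03-01,10
ChIJ_C,2020-03-01,4
"""

mobility_ids = place_ids(MOBILITY_CSV)
aggregated_ids = place_ids(AGGREGATED_CSV)

# Each belief holds only if the corresponding difference is empty.
only_in_mobility = mobility_ids - aggregated_ids
only_in_aggregated = aggregated_ids - mobility_ids

print("only in mobility:", sorted(only_in_mobility))
print("only in aggregated:", sorted(only_in_aggregated))
```

For the real files, the inline strings would be replaced by the file contents read from disk. A non-empty result on either side points at the exact identifiers to investigate.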
How does mobility.csv relate to Global_Mobility_Report.csv? They seem to be talking about exactly the same thing...
But it seems like they are different data products entirely:
as well as with aggregated.csv: