mradamcox closed this 9 months ago
I went ahead and wrote a Python script, make_tables.py, to automate the generation of the tables given a nicely formatted data dictionary. This gives rise to a few next steps. The first (and fastest) is to double-check that the output adheres to the Frictionless Data Table Schema; any modifications should be small. Beyond that, the data dictionaries themselves are not formatted in a way that translates nicely to the table schema, so if I have time I'll try to clean those up. Likely the biggest thing that needs to be done is the creation of a C_Dict.xlsx from which the county-level CSV files can be made.
(I forgot to tag this issue in the commit, but the change can be found at 51d87c1)
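For anyone following along, here is a rough sketch of the kind of conversion involved. This is not the actual make_tables.py; the column names (`Variable`, `Type`, `Description`), the type labels in `TYPE_MAP`, and the example rows are all made up for illustration. Only the output shape (a `fields` list of `name`/`type`/`description` entries) comes from the Table Schema spec.

```python
import json

# Hypothetical mapping from data-dictionary type labels to Table Schema types.
# The labels on the left are assumptions, not the real dictionary vocabulary.
TYPE_MAP = {
    "text": "string",
    "integer": "integer",
    "float": "number",
    "date": "date",
}


def rows_to_table_schema(rows):
    """Build a Table Schema descriptor from data-dictionary rows.

    Each row is a dict with assumed keys "Variable", "Type" and
    (optionally) "Description".
    """
    fields = []
    for row in rows:
        field = {
            "name": row["Variable"],
            # Fall back to "any" when the type label is not recognized.
            "type": TYPE_MAP.get(row["Type"].strip().lower(), "any"),
        }
        description = row.get("Description", "").strip()
        if description:
            field["description"] = description
        fields.append(field)
    return {"fields": fields}


# Made-up rows standing in for a parsed data dictionary.
example_rows = [
    {"Variable": "GEOID", "Type": "Text", "Description": "County FIPS code"},
    {"Variable": "TotPop", "Type": "Integer", "Description": ""},
]

schema = rows_to_table_schema(example_rows)
print(json.dumps(schema, indent=2))
```

Falling back to `any` for unrecognized labels keeps the output valid against the spec while making the rows that still need cleanup easy to spot.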
Splendid. I have received a C_dict.xlsx from @JuHe0120 so I will add that this afternoon and we can begin the final steps of bringing all the data into the same directory, etc.
Now completed.
This is the schema that we'll use: https://specs.frictionlessdata.io/table-schema
The CSVs we build these JSON files for should be the latest OEPS release. The structure of these files is not 100% defined yet, but perhaps these JSON files could be made as a preparatory step to guide the conversion.
Update
We'll actually be creating "table definition" files, a construct I've made based on the frictionless data standard linked above, but with some extra properties that we need for loading data into BigQuery (including a path to the dataset itself). See this README for more info.
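As a rough illustration of the idea (not the actual format, which is described in the README), a table definition could wrap the Table Schema fields with a couple of extra properties. The key names `bq_table_name` and `data_source`, the example path, and the type mapping below are assumptions made for this sketch.

```python
# Assumed mapping from Table Schema types to BigQuery column types.
BQ_TYPES = {
    "string": "STRING",
    "integer": "INTEGER",
    "number": "FLOAT",
    "boolean": "BOOLEAN",
    "date": "DATE",
}


def make_table_definition(schema, bq_table_name, data_path):
    """Wrap a Table Schema descriptor with hypothetical BigQuery properties."""
    return {
        "bq_table_name": bq_table_name,  # hypothetical property name
        "data_source": data_path,        # hypothetical: path to the CSV itself
        "fields": list(schema["fields"]),
    }


def to_bigquery_schema(table_def):
    """Translate the fields into BigQuery-style column descriptions."""
    return [
        {
            "name": field["name"],
            # Unknown types are loaded as STRING in this sketch.
            "type": BQ_TYPES.get(field["type"], "STRING"),
            "mode": "NULLABLE",
        }
        for field in table_def["fields"]
    ]


# Made-up schema and path, for illustration only.
example_schema = {
    "fields": [
        {"name": "GEOID", "type": "string"},
        {"name": "TotPop", "type": "integer"},
    ]
}
table_def = make_table_definition(example_schema, "county_example", "csv/county_example.csv")
bq_schema = to_bigquery_schema(table_def)
print(bq_schema)
```

Keeping the data path inside the definition means a loader only needs the one JSON file to find both the columns and the CSV they describe.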