healthyregions / oeps

Opioid Environment Policy Scan - data explorer and backend management
https://oeps.healthyregions.org
GNU General Public License v3.0

Create Table Definitions for all OEPS CSVs #4

Closed by mradamcox 9 months ago

mradamcox commented 1 year ago

This is the schema that we'll use: https://specs.frictionlessdata.io/table-schema

The CSVs we build these JSON files for should be the ones in the latest OEPS release. The structure of these files is not 100% defined yet, but perhaps these JSON files could be made as a preparatory step to guide the conversion.

Update

We'll actually be creating "table definition" files, a construct I've made based on the frictionless data standard linked above, but with some extra properties that we need for loading data into BigQuery (including a path to the dataset itself). See this README for more info.
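
As a rough illustration of the idea, here is a minimal sketch of what building one of these table definitions might look like. The `fields` list and the set of allowed types follow the Frictionless Table Schema spec; the extra properties (`name`, `data_source`), the helper `make_table_definition`, and the example column names and path are all hypothetical placeholders, not the actual property names used in this repo.

```python
import json

# Field types defined by the Frictionless Table Schema specification.
FRICTIONLESS_TYPES = {
    "string", "number", "integer", "boolean", "object", "array",
    "date", "time", "datetime", "year", "yearmonth", "duration",
    "geopoint", "geojson", "any",
}


def make_table_definition(name, data_path, fields):
    """Build a table definition: Table Schema fields plus extra properties.

    `name` and `data_source` are hypothetical extra properties standing in
    for whatever the BigQuery loader actually needs.
    """
    for field in fields:
        if field["type"] not in FRICTIONLESS_TYPES:
            raise ValueError(
                f"Unsupported type {field['type']!r} for field {field['name']!r}"
            )
    return {
        "name": name,              # hypothetical: table name for loading
        "data_source": data_path,  # hypothetical: path to the dataset itself
        "fields": fields,          # standard Table Schema "fields" list
    }


# Placeholder columns and path, for illustration only.
example = make_table_definition(
    "example_table",
    "path/to/example.csv",
    [
        {"name": "example_id", "type": "string", "title": "Example identifier"},
        {"name": "example_count", "type": "integer", "title": "Example count"},
    ],
)

print(json.dumps(example, indent=2))
```

Checking the field types up front means a typo in a definition fails early rather than at load time.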

bucketteOfIvy commented 10 months ago

I went ahead and wrote a Python script -- make_tables.py -- to automate the generation of the tables, provided a nicely formatted data dictionary is present. This does give rise to a few next steps. The first [and fastest] is that I need to double-check that the output adheres to the Frictionless Data table-schema, but any modifications should be small. Beyond that, the data dictionaries themselves are not formatted in a way that translates nicely to the table schema, so if I have time I'll try to clean those up. Likely the biggest thing that needs to be done is the creation of a C_Dict.xlsx from which the county-level CSV files can be made.

(I forgot to tag this issue in the commit, but the change can be found at 51d87c1)
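
For anyone following along, the general approach can be sketched roughly as below. This is not the contents of make_tables.py; it is a simplified illustration that assumes a data dictionary with hypothetical columns `Variable`, `Description`, and `Type`, reads it as CSV text rather than .xlsx, and uses a made-up `TYPE_MAP` to translate loose type labels into Table Schema types.

```python
import csv
import io

# Hypothetical mapping from data-dictionary type labels to Table Schema types.
TYPE_MAP = {
    "character": "string",
    "text": "string",
    "integer": "integer",
    "numeric": "number",
    "float": "number",
    "date": "date",
}


def dictionary_to_fields(rows):
    """Convert data dictionary rows into a Table Schema "fields" list."""
    fields = []
    for row in rows:
        raw_type = row["Type"].strip().lower()
        fields.append({
            "name": row["Variable"].strip(),
            "title": row["Description"].strip(),
            # Fall back to "any" when the label is not recognized.
            "type": TYPE_MAP.get(raw_type, "any"),
        })
    return fields


# Placeholder data dictionary, for illustration only.
SAMPLE_DICTIONARY = """Variable,Description,Type
example_id,Example identifier,Character
example_rate,Example rate,Numeric
example_misc,Example unclassified column,Unknown
"""

fields = dictionary_to_fields(csv.DictReader(io.StringIO(SAMPLE_DICTIONARY)))

for field in fields:
    print(field)
```

The fallback to `any` is one reason the dictionaries need cleaning up: any type label the mapping does not recognize loses its type information.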

mradamcox commented 10 months ago

Splendid. I have received a C_dict.xlsx from @JuHe0120 so I will add that this afternoon and we can begin the final steps of bringing all the data into the same directory, etc.

mradamcox commented 9 months ago

Now completed.