catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Create a FERC Form 1 Super DB Schema #288

Closed zaneselvans closed 5 years ago

zaneselvans commented 5 years ago

Currently, the multi-annual FERC Form 1 DB that we build is based on a schema that's an awkward hybrid of two things: the mappings between unreadable DBF filenames and actual table names that we found online (which pertain to the 2015 data), and our parsing of the strings in the DBC file for whatever reference year we're trying to build a database for -- typically the most recent data release year.

This means that there are some kinds of data that we could be missing -- if tables and/or columns existed in the past but don't exist now, we'll probably miss them when reading in earlier years of data. This doesn't appear to be common, but it could trip someone up.

The "right way" to fix this seems like it's probably to create a super-schema that contains all of the tables that have ever existed and all of the columns that have ever existed, by parsing the DBC files for all of the data years... and crossing our fingers that the data types didn't change over the years, or there's one data type that can be imposed on each column which all the data can be forced into. We would also need to figure out the historical mappings between database table names and F1_whatevs.DBF filenames.

zaneselvans commented 5 years ago

I have verified that the 2015 & 2017 FERC Form 1 database schemas are identical, and that they include all the tables and fields which exist in all previous years of data going back to 1994, so the super-schema would be superfluous. An ETL test has been integrated (test/etl_test.py::test_ferc1_lost_data()) which verifies that this situation continues to be true (and checks whether any new tables have been added in future years, relative to our 2015 mapping of DBF files to database table names, so we can update that mapping if need be).
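
The gist of that check is a superset comparison along these lines (the real test lives in test/etl_test.py; `get_ferc1_schema()` here is a hypothetical helper returning `{table: set(columns)}` for a given year):

```python
def test_ferc1_lost_data():
    """Check that the reference year's schema contains every table and column
    that appears in any earlier year of FERC Form 1 data."""
    reference = get_ferc1_schema(2017)  # hypothetical helper
    for year in range(1994, 2017):
        old = get_ferc1_schema(year)
        for table, columns in old.items():
            assert table in reference, f"{table} ({year}) missing from reference schema"
            missing = columns - reference[table]
            assert not missing, f"{table} ({year}) lost columns: {missing}"
```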