Resurrecting this long-dead issue to finally put more explicit data validation tests in place. The following is a list of candidate tests:
[ ] test that county aggregates match the corresponding aggregates from the source tables for all of existing plants, proposed plants, and infrastructure
  [X] capacity
  [ ] CO2
  [ ] facility counts
[ ] test the most complex ETL transforms:
  [ ] LBNL ISO queue
  [ ] NREL ordinances
[ ] test capacity and CO2 allocation across multi-location splits
[ ] test that multi-location splitting creates the correct number of entries
[ ] spot-check geocoding of ambiguous county names (e.g. "Houston" vs "Houston County", which does not contain the city!). This is mostly relevant to the LBNL queues because they strip the word "county" from all names.
[ ] test that exactly one calendar year of fuel data is being aggregated to create our CO2 estimates
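
A rough sketch of what the aggregate-matching and split-allocation checks could look like, using only the standard library. Everything here is hypothetical: the column names (`county_id_fips`, `capacity_mw`), the helper names, and the tolerances are placeholders for illustration, not the names used in the actual tables or ETL code.

```python
import math
from collections import defaultdict


def aggregate_by_county(rows, value_key):
    """Sum `value_key` per county FIPS code, skipping rows with missing values."""
    totals = defaultdict(float)
    for row in rows:
        value = row.get(value_key)
        if value is not None:
            totals[row["county_id_fips"]] += value
    return dict(totals)


def find_mismatches(source_rows, county_totals, value_key, rel_tol=1e-6):
    """Return {fips: (expected, actual)} where the county table disagrees with the source.

    Counties present on only one side are compared against 0.0, so dropped
    or spurious counties also show up as mismatches.
    """
    expected = aggregate_by_county(source_rows, value_key)
    mismatches = {}
    for fips in expected.keys() | county_totals.keys():
        exp = expected.get(fips, 0.0)
        act = county_totals.get(fips, 0.0)
        if not math.isclose(exp, act, rel_tol=rel_tol, abs_tol=1e-9):
            mismatches[fips] = (exp, act)
    return mismatches


def split_conserves_total(original_value, split_values, rel_tol=1e-6):
    """Check that values allocated across locations sum back to the original."""
    return math.isclose(sum(split_values), original_value, rel_tol=rel_tol)


# Hypothetical source rows and county-level output, for illustration only.
source_rows = [
    {"county_id_fips": "48201", "capacity_mw": 100.0},
    {"county_id_fips": "48201", "capacity_mw": 50.0},
    {"county_id_fips": "13153", "capacity_mw": 20.0},
]
county_totals = {"48201": 150.0, "13153": 25.0}

mismatches = find_mismatches(source_rows, county_totals, "capacity_mw")
```

In a real test, `assert not mismatches` would fail with the offending counties in the message. The same `find_mismatches` shape should work for CO2 and facility counts by swapping the value key, and `split_conserves_total` is meant to catch the case where a multi-location project's capacity or CO2 is duplicated into each location instead of being divided among them.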