catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Create more complete / diverse fast tests #203

Closed zaneselvans closed 4 years ago

zaneselvans commented 5 years ago

A fair amount of the code behind the MCOE calculations (heat rate, fuel cost, etc.) isn't currently exercised by the Travis CI tests. We all seem to prefer running the quicker tests most of the time, and the other continuous integration tools (like test coverage) depend on whichever tests Travis runs, so it's probably worth setting up a fuller version of the fast tests.

We might be able to do this by specifying the test environment using an input file, as we're now doing with the init_pudl.py script. There could be a quickie test settings file (only a couple of years or states for each data source) and a full test settings file.
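A minimal sketch of how a test settings file could be loaded, assuming the settings are stored as JSON. The file layout, the dataset keys, the years and state shown, and the `load_test_settings` helper are all hypothetical illustrations, not part of PUDL or `init_pudl.py`:

```python
import json
from pathlib import Path

# Hypothetical contents of a "quickie" settings file: a couple of years
# (and one state for CEMS) per data source. A full settings file would
# have the same shape with every available year and state listed.
EXAMPLE_FAST_SETTINGS = {
    "eia923": {"years": [2016, 2017]},
    "ferc1": {"years": [2016, 2017]},
    "epacems": {"years": [2016, 2017], "states": ["ID"]},
}


def load_test_settings(path):
    """Read a test settings file and check each source lists some years."""
    settings = json.loads(Path(path).read_text())
    for source, params in settings.items():
        if not params.get("years"):
            raise ValueError(f"No years given for data source {source!r}")
    return settings
```

The test run would then pick one file or the other (for example via a command line option or environment variable), so the fast and full tests share the same code path and differ only in their inputs.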

Another option that might be interesting is randomized testing. In aggregate we run the tests many times, and we don't want to always skip large chunks of the datasets, so the quick tests could instead choose 2 years at random, and one state (for CEMS) at random, and use that data for testing. For this to work with the output testing, we would need to choose two contiguous years, or change the outputs to take a list of years rather than a start and end date.
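The random selection could be sketched roughly as follows, taking the "two contiguous years" route. The function name, its parameters, and the returned keys are assumptions made for illustration; nothing here reflects an existing PUDL interface:

```python
import random


def pick_random_test_scope(available_years, available_states, n_years=2, seed=None):
    """Pick a random run of contiguous years and one random state.

    Hypothetical helper: returns a start/end year pair so it fits outputs
    that take a date range rather than a list of years.
    """
    rng = random.Random(seed)
    years = sorted(set(available_years))
    # Keep only windows of n_years with no gaps between consecutive years.
    windows = [
        years[i:i + n_years]
        for i in range(len(years) - n_years + 1)
        if years[i + n_years - 1] - years[i] == n_years - 1
    ]
    if not windows:
        raise ValueError(f"No run of {n_years} contiguous years available")
    window = rng.choice(windows)
    state = rng.choice(sorted(available_states))
    return {"start_year": window[0], "end_year": window[-1], "state": state}
```

Passing an explicit `seed` and recording it in the test log would let a failing random selection be reproduced later.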

cmgosnell commented 4 years ago

I'm closing this issue. Feels duplicative with Issue #347 at this point.