Closed by karldw 5 years ago
Some of the tests take a ton of memory, but I'm guessing they won't be in the fast subset.
I have TravisCI set up and working. Currently, it runs any test marked ci, and the only test so marked at the moment is a Hello World. Now we need to start creating a suite of tests that can actually be run on Travis, and at other times when we just want to test our code without loading the entire universe of data.
What do we think that looks like? Could we do one year's worth of data, or is that still too much? Maybe one year of FERC & EIA, plus one year of one state's worth of CEMS? It looks like the Travis platform has network access -- would we be allowed to just have the test download fresh data directly from the federal sources, rather than having it checked in to the repository? Not sure what constitutes reasonable use on Travis. That would have the advantage of also testing the download / datastore process.
The Travis CI builds are working. Rather than categorize some tests as fast, I created (at least for now) a separate travis_ci_test.py module, which runs the ETL on a small subset of the data, downloaded to a local datastore on the Travis VM. Right now it does FERC Form 1 & EIA860 for 2012 and 2016, and EIA923 & EPA CEMS for just 2016. In addition, the CEMS step only pulls Colorado's data, to minimize run time, disk use, etc.
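For anyone following along, here is a rough sketch of how a subset like that could be written down and applied. This is not the contents of travis_ci_test.py; the CI_SUBSET layout, the source keys, and select_partitions are all made up for illustration, and it assumes EIA923 and EPA CEMS are both limited to 2016. Only the years and the Colorado restriction come from the comment above.

```python
# Hypothetical subset definition. The years and the state come from the
# comment above; the dictionary layout and key names are assumptions.
CI_SUBSET = {
    "ferc1": {"years": [2012, 2016]},
    "eia860": {"years": [2012, 2016]},
    "eia923": {"years": [2016]},
    "epacems": {"years": [2016], "states": ["CO"]},
}


def select_partitions(available, subset):
    """Keep only the (source, year, state) partitions named in the subset.

    `state` is None for sources that are not split by state.
    """
    selected = []
    for source, year, state in available:
        spec = subset.get(source)
        # Skip sources or years that the subset does not mention.
        if spec is None or year not in spec["years"]:
            continue
        # Only apply the state filter when the subset lists states.
        states = spec.get("states")
        if states is not None and state not in states:
            continue
        selected.append((source, year, state))
    return selected


# Example: a made-up list of partitions that a datastore might offer.
available = [
    ("ferc1", 2012, None),
    ("ferc1", 2013, None),
    ("eia923", 2016, None),
    ("epacems", 2016, "CO"),
    ("epacems", 2016, "TX"),
]
ci_partitions = select_partitions(available, CI_SUBSET)
```

Keeping the subset in one dictionary would make it easy to grow or shrink the CI run without touching the test logic.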
Great! The only advantage of adding AppVeyor would be also testing on Windows.
For the simpler functions, we could add tests with dummy datasets.
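To make that concrete, a dummy-dataset test might look something like the sketch below. total_by_plant and the record fields (plant_id, net_generation_mwh) are hypothetical stand-ins, not functions or columns from the codebase; the point is only that a few hand-written rows are enough to check a simple function without downloading anything.

```python
from collections import defaultdict


def total_by_plant(records):
    """Hypothetical simple function: sum net generation per plant."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["plant_id"]] += rec["net_generation_mwh"]
    return dict(totals)


def test_total_by_plant():
    # Tiny hand-written dataset with a known answer.
    dummy = [
        {"plant_id": 1, "net_generation_mwh": 10.0},
        {"plant_id": 1, "net_generation_mwh": 5.0},
        {"plant_id": 2, "net_generation_mwh": 2.5},
    ]
    assert total_by_plant(dummy) == {1: 15.0, 2: 2.5}
```

A test written this way needs no network access or datastore, so it would run in well under a second on Travis or locally.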