For Travis CI, I'm envisioning running `compute_cols.py` with 1 instance, 2 instances, and 5 instances. After #3 is done, the three different history files will be compared to each other with `--strict=exact` (changing instance count should not change output). However, we will also want to compare the latest code to an existing baseline -- where should this baseline live? As in #6, we could keep the baseline in this repository or store it elsewhere and then have Travis download it before doing the comparison.
Bonus food for thought: the baseline comparison should be a totally separate Travis test so it's easy to separate commits that fail `run_test_suite.sh` from those that are merely answer-changing and fail the baseline comparison.
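
To make the cross-instance check concrete, here is a rough sketch of what "the history files must match exactly" could look like. It is only a stand-in for the comparison tool from #3: it assumes that exact means byte-for-byte identical, and the helper names and file names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def mismatched_files(paths):
    """Return the paths whose contents differ from the first path's contents.

    An empty result means every file is byte-for-byte identical to the
    first one (assumed here to be what an exact comparison requires).
    """
    paths = list(paths)
    if not paths:
        return []
    reference = file_digest(paths[0])
    return [p for p in paths[1:] if file_digest(p) != reference]
```

Travis could call `mismatched_files(["history_1.txt", "history_2.txt", "history_5.txt"])` (hypothetical file names, one per instance count) and fail the build if the result is non-empty. The same helper would work for the baseline comparison by putting the baseline file first, which keeps the two checks easy to run as separate Travis tests.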