Tests
For reproducibility, SuStaIn now takes a seed as input and uses it throughout, so that results are consistent for a given seed (and parameters).
This enables a full functional test of SuStaIn. I've added two scripts (in the "tests" subfolder) to check the output from SuStaIn: create_validation.py creates new validation benchmarks, and validation.py checks that results are consistent with these benchmarks.
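To illustrate the idea (this is only a rough sketch, not the actual code in the two scripts), here is how a seeded run can be checked against a stored benchmark. The names run_toy_model, create_benchmark and validate are hypothetical stand-ins, and I'm assuming NumPy's Generator API for the seeded randomness:

```python
import numpy as np


def run_toy_model(seed, n_samples=100):
    """Hypothetical stand-in for a SuStaIn run.

    All randomness is drawn from a single generator built from `seed`,
    so the same seed and parameters always give the same output.
    """
    rng = np.random.default_rng(seed)
    data = rng.normal(size=(n_samples, 3))
    # A second random step, drawn from the same generator.
    noise = rng.uniform(-0.1, 0.1, size=3)
    return data.mean(axis=0) + noise


def create_benchmark(seed):
    """Plays the role of create_validation.py: store a reference result."""
    return run_toy_model(seed)


def validate(seed, benchmark):
    """Plays the role of validation.py: re-run and compare to the reference."""
    return bool(np.allclose(run_toy_model(seed), benchmark))


benchmark = create_benchmark(seed=42)
same_seed_ok = validate(42, benchmark)       # re-run with the benchmark's seed
different_seed_ok = validate(43, benchmark)  # re-run with another seed
```

The comparison only means something because every random draw goes through the one seeded generator; any unseeded call would make the check flaky.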
A full test (the -f command line flag for validation.py) runs every class that inherits from AbstractSustain. The relevant test functions are decorated with @abstractmethod, so every subclass has to implement them and the full test should scale with future additions/subclasses. This also means that the simulator functions are now part of the class they pertain to (though I haven't removed the "sim" subfolder for now).
Note that the test for ZscoreSustainMissingData is currently a copy of the ZscoreSustain one, and in the near future will be modified to better test the missing data handling.
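The subclass-driven full test described above can be sketched roughly as follows. This is a minimal illustration with made-up names (AbstractModel, ZscoreModel, MixtureModel, test_sustain, run_full_test), not the real class hierarchy or method signatures:

```python
from abc import ABC, abstractmethod


class AbstractModel(ABC):
    """Hypothetical stand-in for AbstractSustain."""

    @classmethod
    @abstractmethod
    def test_sustain(cls):
        """Run this subclass's functional test and return True on success."""


class ZscoreModel(AbstractModel):
    @classmethod
    def test_sustain(cls):
        # A real test would simulate data, run the model, and compare
        # the output against a stored benchmark.
        return True


class MixtureModel(AbstractModel):
    @classmethod
    def test_sustain(cls):
        return True


def run_full_test(base=AbstractModel):
    """Run the test of every direct subclass of `base`."""
    # __subclasses__() lists direct subclasses only; deeper hierarchies
    # would need a recursive walk.
    return {sub.__name__: sub.test_sustain() for sub in base.__subclasses__()}


results = run_full_test()
```

A new subclass gets picked up without touching the test runner, and one that forgets to define the test method can't be instantiated.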
Optimizations
Part of the motivation for the tests was to make a couple of optimizations, namely vectorizing a couple of the main loops.
In a test experiment, the running time of the z-score version was reduced from ~15.5 hours to ~10 hours. The speedup is smaller for the mixture model version (~2.1 hours to ~1.68 hours).
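As a generic illustration of what vectorizing a loop looks like (this is not one of the actual SuStaIn loops; the function names and the squared-distance computation are made up for the example), a nested Python loop can be replaced by a single broadcast NumPy expression:

```python
import numpy as np


def sq_dist_loop(data, centers):
    """Squared distance from each row of `data` to each row of `centers`, with loops."""
    n, k = data.shape[0], centers.shape[0]
    out = np.empty((n, k))
    for i in range(n):
        for j in range(k):
            diff = data[i] - centers[j]
            out[i, j] = np.dot(diff, diff)
    return out


def sq_dist_vectorized(data, centers):
    """Same computation using broadcasting instead of Python loops."""
    # Shapes: (n, 1, d) - (1, k, d) -> (n, k, d)
    diff = data[:, None, :] - centers[None, :, :]
    return np.sum(diff ** 2, axis=2)


rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))
centers = rng.normal(size=(3, 4))

loop_result = sq_dist_loop(data, centers)
vec_result = sq_dist_vectorized(data, centers)
```

Checking that both versions agree on the same seeded input is the kind of guarantee the functional tests provide for the real changes.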
Feel free to ask if you want details, but it's a lot to add here and ultimately not interesting.
Other Changes
Replaced the print statements for iterations with tqdm, as it's more informative and compact.
Added a plot argument to AbstractSustain.run_sustain_algorithm to avoid always plotting.