ncsu-landscape-dynamics / r.pops.spread

r.pops.spread - PoPS model implemented as a GRASS GIS module
https://grass.osgeo.org/grass7/manuals/addons/r.pops.spread.html
GNU General Public License v2.0

update after syncing to pops-core v1.1.0 #41

Closed: petrasovaa closed this 3 years ago

wenzeslaus commented 3 years ago

Now the differences in the existing tests are:

======================================================================
FAIL: test_outputs_mortality (__main__.TestSpread)
----------------------------------------------------------------------
AssertionError: r.univar map=average percentile=90.0 separator== -g difference:
mismatch values (key, reference, actual): [('mean', 0.629, 0.605603201829631)]
======================================================================
FAIL: test_outputs_mortality_treatment (__main__.TestSpread)
----------------------------------------------------------------------
AssertionError: r.univar map=average percentile=90.0 separator== -g difference:
mismatch values (key, reference, actual): [('mean', 0.509, 0.492502858776459)]
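
For context, failures like these come from comparing r.univar statistics of an output raster against fixed reference values within a given precision. A minimal sketch of such a check using the GRASS gunittest framework is below; the raster name, reference values, and precision are illustrative, not the actual ones from the test suite.

# Minimal sketch of a univariate-statistics check in grass.gunittest.
# The raster name, reference values, and precision are illustrative only.
from grass.gunittest.case import TestCase
from grass.gunittest.main import test


class TestAverageOutput(TestCase):
    def test_average_mean(self):
        # Compares selected r.univar statistics of the raster "average"
        # against reference values within the given precision (tolerance).
        self.assertRasterFitsUnivar(
            raster="average",
            reference={"mean": 0.629, "null_cells": 0},
            precision=0.05,
        )


if __name__ == "__main__":
    test()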

The new test was created from an average of 100 runs, with the tolerance set so that each of another 100 runs of the old code fits it separately. The new test itself currently performs 100 runs. The new code passes the new test, but running 100 individual runs (as opposed to their average) fails the test.
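
One possible structure for such an averaging test is sketched below: run the stochastic module many times with different seeds, average the runs, and check statistics of the average against reference values within a tolerance. The r.pops.spread options used here (random_seed, single_series), the omitted required inputs, and the tolerance are placeholders for illustration; the actual test in the repository may be organized differently.

# Sketch: average many stochastic runs and test the average.
# Option names marked as hypothetical are placeholders, not the real test.
from grass.gunittest.case import TestCase
from grass.gunittest.main import test


class TestSpreadAverage(TestCase):
    num_runs = 100  # the issue mentions 100 runs

    def test_average_of_runs(self):
        outputs = []
        for i in range(self.num_runs):
            output = f"single_run_{i}"
            # Each run gets a different seed, so individual results differ.
            self.assertModule(
                "r.pops.spread",
                random_seed=i,
                single_series=output,  # hypothetical output option
                # required inputs (host, infected, dates, ...) omitted here
            )
            outputs.append(output)
        # Average the individual stochastic runs into one raster.
        self.runModule(
            "r.series", input=",".join(outputs), output="average", method="average"
        )
        # The precision (tolerance) is derived from how much individual
        # runs deviate from the reference mean.
        self.assertRasterFitsUnivar(
            raster="average", reference={"mean": 0.629}, precision=0.03
        )


if __name__ == "__main__":
    test()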

The conclusion is that there are likely differences beyond those caused by different seeds, but they are small enough to be unnoticeable in practice.

wenzeslaus commented 3 years ago

The tests now have new reference values which let the current code pass. The new values are well within the tolerance determined for the many-runs test, i.e., the change in values is within the differences between stochastic runs.

As described above, manually running the tests written for many stochastic runs against individual runs (instead of an average over many runs) fails. Adding more runs to the determination of the tolerance would likely increase it and might make the tests pass, but checking that many individual runs would then likely fail again, given that this is what happened when going from 10 to 100 runs.
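
For intuition on why a tolerance fitted to N individual runs tends to be exceeded when many more runs are checked: the tolerance has to cover the largest deviation of any single run from the reference mean, and the maximum over a larger sample only grows. A small standalone simulation with made-up numbers (not PoPS output) illustrates the effect:

# Standalone illustration (made-up distribution, unrelated to the model):
# the maximum deviation of individual stochastic runs from the mean grows
# as more runs are included, so a tolerance fitted to N runs tends to be
# exceeded when many more individual runs are checked.
import numpy as np

rng = np.random.default_rng(42)
true_mean, spread = 0.6, 0.02  # arbitrary numbers for the illustration

for n_runs in (10, 100, 1000):
    runs = rng.normal(true_mean, spread, size=n_runs)
    max_deviation = np.abs(runs - true_mean).max()
    print(f"{n_runs:5d} runs -> max deviation from mean: {max_deviation:.4f}")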

I was not able to determine the source of the differences. I didn't try to narrow it down further by splitting the changes into minimal change sets because that would require much more time. Given that the changes are small in comparison to the differences between stochastic runs, that the test for the average of many stochastic runs passes, that the major code changes are in PoPS Core rather than in r.pops.spread (and thus in sync with rpops), and that we generally don't expect to keep exactly the same results between different versions, I intend to merge it now as is, i.e., with the different results.