Double mutation rate - Githubissues

ms609 commented 1 month ago

It struck me that test five would be a stronger test if it also evaluated whether rates other than unity produced expected output. I thus tinkered (quickly) with the source as below, to see whether a rate of 2 would lead to double the change. (I'm not sure that this is necessarily a linear relationship, but it's probably a good enough approximation for these purposes.)
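On the linearity question, a minimal sketch under a simplified model may help. The assumption is that each mutation flips one uniformly chosen bit of a binary genome, so a later mutation can undo an earlier one. This model and the function name `expectedHammingDistance` are illustrative only and are not taken from the TREvoSim source.

```cpp
#include <cmath>

// Hypothetical helper, not part of TREvoSim.
// Assumed model: each mutation flips one uniformly random bit of a
// genome with `genomeBits` bits; repeat hits on a bit cancel out.
// Returns the expected number of bits that differ from the starting
// genome after `mutations` flips.
double expectedHammingDistance(int genomeBits, int mutations)
{
    // A given bit differs with probability (1 - (1 - 2/L)^m) / 2.
    const double keep = 1.0 - 2.0 / genomeBits;
    return 0.5 * genomeBits * (1.0 - std::pow(keep, mutations));
}
```

Under this assumption, doubling the number of mutations gives slightly less than double the change when mutations are few relative to genome length (for 100 bits, one flip changes 1 bit on average and two flips change 1.98), and the shortfall grows as the genome saturates.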

I saw a few things I didn't understand:

The test passed when I ran it within Qt.
If I changed the pass conditions to guarantee a failure, test five passed, whilst test eighteen failed (?) – output below
The test failed when I built and deployed the software (which seems to contradict the assertion I think I recall seeing elsewhere that the software couldn't be built if tests were failing).


17:00:48: Starting C:\Users\pjjg18\GitHub\trevosim\build\Desktop_Qt_6_7_1_MSVC2019_64bit-Release\TREvoSimTest.exe...
Running main() from C:\Users\pjjg18\GitHub\trevosim\build\Desktop_Qt_6_7_1_MSVC2019_64bit-Release\googletest-src\googletest\src\gtest_main.cc
[==========] Running 21 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 21 tests from testsuite
[ RUN      ] testsuite.TREvoSimTestZero
[       OK ] testsuite.TREvoSimTestZero (1 ms)
[ RUN      ] testsuite.TREvoSimTestOne
[       OK ] testsuite.TREvoSimTestOne (0 ms)
[ RUN      ] testsuite.TREvoSimTestTwo
[       OK ] testsuite.TREvoSimTestTwo (18 ms)
[ RUN      ] testsuite.TREvoSimTestThree
[       OK ] testsuite.TREvoSimTestThree (8 ms)
[ RUN      ] testsuite.TREvoSimTestFour
[       OK ] testsuite.TREvoSimTestFour (0 ms)
[ RUN      ] testsuite.TREvoSimTestFive
[       OK ] testsuite.TREvoSimTestFive (109 ms)
[ RUN      ] testsuite.TREvoSimTestSix
[       OK ] testsuite.TREvoSimTestSix (185 ms)
[ RUN      ] testsuite.TREvoSimTestSeven
[       OK ] testsuite.TREvoSimTestSeven (0 ms)
[ RUN      ] testsuite.TREvoSimTestEight
[       OK ] testsuite.TREvoSimTestEight (80 ms)
[ RUN      ] testsuite.TREvoSimTestNine
[       OK ] testsuite.TREvoSimTestNine (2 ms)
[ RUN      ] testsuite.TREvoSimTestTen
[       OK ] testsuite.TREvoSimTestTen (0 ms)
[ RUN      ] testsuite.TREvoSimTestEleven
[       OK ] testsuite.TREvoSimTestEleven (0 ms)
[ RUN      ] testsuite.TREvoSimTestTwelve
[       OK ] testsuite.TREvoSimTestTwelve (0 ms)
[ RUN      ] testsuite.TREvoSimTestThirteen
[       OK ] testsuite.TREvoSimTestThirteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestForteen
[       OK ] testsuite.TREvoSimTestForteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestFifteen
[       OK ] testsuite.TREvoSimTestFifteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestSixteen
[       OK ] testsuite.TREvoSimTestSixteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestSeventeen
[       OK ] testsuite.TREvoSimTestSeventeen (13 ms)
[ RUN      ] testsuite.TREvoSimTestEighteen
C:\Users\pjjg18\GitHub\trevosim\testsuite.cpp(154): error: Value of: result
  Actual: false
Expected: true
Check playing field mixing. The mixing mechanism relies on random numbers to provide a probability of mixing, and as such, on occasions, this test will fail due to the stochastic nature of the process. If this happens, you may want to repeat the test again and see if the warnings dissappear.

Playing field zero was originally 100 genomes, all zero. mixingProbabilityOneToZero was set to 20 and mixing applied 100 times, so ~20 should have been overwritten, and the count of all zero genomes should be ~80. It is 83
Playing field one was originally 100 genomes, all ones. This should still be the same as mixingProbabilityZeroToOne is zero, and so we should count no all zero genomes. Count is 0

Playing field zero was originally 100 genomes, all zero. This should still be the same as mixingProbabilityOneToZero is zero, so none should have been overwritten, and the count of all zero genomes should be 100. It is 100
Playing field one was originally 100 genomes, all ones. mixingProbabilityZeroToOne was set to 20 and and mixing applied 100 times, and so we should count ~20 all zero genomes. Count is 14

This number seems off what we should expect, although since we're dealing with random numbers, there may be nothing untoward - try repating test

Now testing three playing fields. Playing field mixing was set to fifty, then repeated 100 times, and PF 0 and 3 were all zeros, PF1 was all ones. As such, we should have ~25 mixed individuals in PF0 & PF3 (though slightly fewer in PF0 as they have been overwritten by those from PF3). Playing field zero count of organisms that are all zero should be ~75-85. It is 85
PF1 was originally 100 genomes, all ones. Some of these will have been overwritten from PF0 and PF3 - around 35 all zeros would be sensible. Count is 44

PF3 is similar to PF1, and and thus there should be ~25 all ones. Count of all zeros is 85

C:\Users\pjjg18\GitHub\trevosim\testsuite.cpp(154): error: Value of: result
  Actual: false
Expected: true
Check playing field mixing. The mixing mechanism relies on random numbers to provide a probability of mixing, and as such, on occasions, this test will fail due to the stochastic nature of the process. If this happens, you may want to repeat the test again and see if the warnings dissappear.

Playing field zero was originally 100 genomes, all zero. mixingProbabilityOneToZero was set to 20 and mixing applied 100 times, so ~20 should have been overwritten, and the count of all zero genomes should be ~80. It is 83
Playing field one was originally 100 genomes, all ones. This should still be the same as mixingProbabilityZeroToOne is zero, and so we should count no all zero genomes. Count is 0

Playing field zero was originally 100 genomes, all zero. This should still be the same as mixingProbabilityOneToZero is zero, so none should have been overwritten, and the count of all zero genomes should be 100. It is 100
Playing field one was originally 100 genomes, all ones. mixingProbabilityZeroToOne was set to 20 and and mixing applied 100 times, and so we should count ~20 all zero genomes. Count is 14

This number seems off what we should expect, although since we're dealing with random numbers, there may be nothing untoward - try repating test

Now testing three playing fields. Playing field mixing was set to fifty, then repeated 100 times, and PF 0 and 3 were all zeros, PF1 was all ones. As such, we should have ~25 mixed individuals in PF0 & PF3 (though slightly fewer in PF0 as they have been overwritten by those from PF3). Playing field zero count of organisms that are all zero should be ~75-85. It is 85
PF1 was originally 100 genomes, all ones. Some of these will have been overwritten from PF0 and PF3 - around 35 all zeros would be sensible. Count is 44

PF3 is similar to PF1, and and thus there should be ~25 all ones. Count of all zeros is 85

[  FAILED  ] testsuite.TREvoSimTestEighteen (23 ms)
[ RUN      ] testsuite.TREvoSimTestNineteen
[       OK ] testsuite.TREvoSimTestNineteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestTwenty
[       OK ] testsuite.TREvoSimTestTwenty (0 ms)
[----------] 21 tests from testsuite (447 ms total)

[----------] Global test environment tear-down
[==========] 21 tests from 1 test suite ran. (447 ms total)
[  PASSED  ] 20 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] testsuite.TREvoSimTestEighteen

 1 FAILED TEST
17:00:49: C:\Users\pjjg18\GitHub\trevosim\build\Desktop_Qt_6_7_1_MSVC2019_64bit-Release\TREvoSimTest.exe exited with code 1

17:01:20: Starting C:\Users\pjjg18\GitHub\trevosim\build\Desktop_Qt_6_7_1_MSVC2019_64bit-Release\TREvoSimTest.exe...
Running main() from C:\Users\pjjg18\GitHub\trevosim\build\Desktop_Qt_6_7_1_MSVC2019_64bit-Release\googletest-src\googletest\src\gtest_main.cc
[==========] Running 21 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 21 tests from testsuite
[ RUN      ] testsuite.TREvoSimTestZero
[       OK ] testsuite.TREvoSimTestZero (1 ms)
[ RUN      ] testsuite.TREvoSimTestOne
[       OK ] testsuite.TREvoSimTestOne (0 ms)
[ RUN      ] testsuite.TREvoSimTestTwo
[       OK ] testsuite.TREvoSimTestTwo (18 ms)
[ RUN      ] testsuite.TREvoSimTestThree
[       OK ] testsuite.TREvoSimTestThree (9 ms)
[ RUN      ] testsuite.TREvoSimTestFour
[       OK ] testsuite.TREvoSimTestFour (0 ms)
[ RUN      ] testsuite.TREvoSimTestFive
[       OK ] testsuite.TREvoSimTestFive (109 ms)
[ RUN      ] testsuite.TREvoSimTestSix
[       OK ] testsuite.TREvoSimTestSix (171 ms)
[ RUN      ] testsuite.TREvoSimTestSeven
[       OK ] testsuite.TREvoSimTestSeven (0 ms)
[ RUN      ] testsuite.TREvoSimTestEight
[       OK ] testsuite.TREvoSimTestEight (73 ms)
[ RUN      ] testsuite.TREvoSimTestNine
[       OK ] testsuite.TREvoSimTestNine (2 ms)
[ RUN      ] testsuite.TREvoSimTestTen
[       OK ] testsuite.TREvoSimTestTen (0 ms)
[ RUN      ] testsuite.TREvoSimTestEleven
[       OK ] testsuite.TREvoSimTestEleven (0 ms)
[ RUN      ] testsuite.TREvoSimTestTwelve
[       OK ] testsuite.TREvoSimTestTwelve (0 ms)
[ RUN      ] testsuite.TREvoSimTestThirteen
[       OK ] testsuite.TREvoSimTestThirteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestForteen
[       OK ] testsuite.TREvoSimTestForteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestFifteen
[       OK ] testsuite.TREvoSimTestFifteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestSixteen
[       OK ] testsuite.TREvoSimTestSixteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestSeventeen
[       OK ] testsuite.TREvoSimTestSeventeen (13 ms)
[ RUN      ] testsuite.TREvoSimTestEighteen
C:\Users\pjjg18\GitHub\trevosim\testsuite.cpp(154): error: Value of: result
  Actual: false
Expected: true
Check playing field mixing. The mixing mechanism relies on random numbers to provide a probability of mixing, and as such, on occasions, this test will fail due to the stochastic nature of the process. If this happens, you may want to repeat the test again and see if the warnings dissappear.

Playing field zero was originally 100 genomes, all zero. mixingProbabilityOneToZero was set to 20 and mixing applied 100 times, so ~20 should have been overwritten, and the count of all zero genomes should be ~80. It is 85
Playing field one was originally 100 genomes, all ones. This should still be the same as mixingProbabilityZeroToOne is zero, and so we should count no all zero genomes. Count is 0

Playing field zero was originally 100 genomes, all zero. This should still be the same as mixingProbabilityOneToZero is zero, so none should have been overwritten, and the count of all zero genomes should be 100. It is 100
Playing field one was originally 100 genomes, all ones. mixingProbabilityZeroToOne was set to 20 and and mixing applied 100 times, and so we should count ~20 all zero genomes. Count is 13

This number seems off what we should expect, although since we're dealing with random numbers, there may be nothing untoward - try repating test

Now testing three playing fields. Playing field mixing was set to fifty, then repeated 100 times, and PF 0 and 3 were all zeros, PF1 was all ones. As such, we should have ~25 mixed individuals in PF0 & PF3 (though slightly fewer in PF0 as they have been overwritten by those from PF3). Playing field zero count of organisms that are all zero should be ~75-85. It is 83
PF1 was originally 100 genomes, all ones. Some of these will have been overwritten from PF0 and PF3 - around 35 all zeros would be sensible. Count is 34

PF3 is similar to PF1, and and thus there should be ~25 all ones. Count of all zeros is 81

[  FAILED  ] testsuite.TREvoSimTestEighteen (22 ms)
[ RUN      ] testsuite.TREvoSimTestNineteen
[       OK ] testsuite.TREvoSimTestNineteen (0 ms)
[ RUN      ] testsuite.TREvoSimTestTwenty
[       OK ] testsuite.TREvoSimTestTwenty (0 ms)
[----------] 21 tests from testsuite (425 ms total)

[----------] Global test environment tear-down
[==========] 21 tests from 1 test suite ran. (425 ms total)
[  PASSED  ] 20 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] testsuite.TREvoSimTestEighteen

 1 FAILED TEST
C:\Users\pjjg18\GitHub\trevosim\testsuite.cpp(154): error: Value of: result
  Actual: false
Expected: true

RussellGarwood commented 1 month ago

I'll need to look into most points here, but as I recall we had to stop a test failure from failing the build in order to compile this on Mac. This was also preferable so that every build didn't take ~30 seconds, and so that we didn't have to comment out the test command during development to avoid this.

RussellGarwood commented 4 weeks ago

You are correct in that this test could be strengthened - I shall do this shortly. On my end, if I make the test fail by changing return to false, I get the following:

I get the expected failure both when running from Qt Creator and within the software via the GUI. In the code in the pull request, I think the test should pass; I'm not sure whether that is related to this. I've since tweaked test 18 too, so it could be that one was failing because the test was failing.

ms609 commented 2 weeks ago

OK, I'm now seeing consistent behaviour of the test suite; if I set a test to return false, that test (and only that test) fails.

Test 18 is still stochastic; sometimes it passes, sometimes not.
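One way to reason about how often a stochastic check like this should fail is to derive its pass band from the expected spread. The sketch below assumes the overwritten count follows a binomial distribution with 100 trials and probability 0.2, which is my reading of the test's message rather than something confirmed from the source. `Band`, `binomialBand` and `withinBand` are hypothetical names, not TREvoSim functions.

```cpp
#include <cmath>

// Hypothetical helpers, not part of TREvoSim.
// Acceptance band for a count assumed to follow Binomial(trials, p).
struct Band {
    double lower;
    double upper;
};

// Band of `sigmas` standard deviations either side of the mean.
Band binomialBand(int trials, double p, double sigmas)
{
    const double mean = trials * p;
    const double sd = std::sqrt(trials * p * (1.0 - p));
    return {mean - sigmas * sd, mean + sigmas * sd};
}

// True if the observed count lies inside the band (inclusive).
bool withinBand(int observed, const Band &band)
{
    return observed >= band.lower && observed <= band.upper;
}
```

With these assumed parameters the mean is 20 and the standard deviation is 4, so `binomialBand(100, 0.2, 3.0)` spans 8 to 32. The counts of 14 and 13 in the logs above sit 1.5 and 1.75 standard deviations below the mean, which is unremarkable for a single run. A band of around three standard deviations would fail only rarely by chance, and seeding the random number generator in the test would remove the variation altogether.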

palaeoware / trevosim

Double mutation rate #51