Random seed related test failure

tkoskela commented 2 years ago

On 41dae34 test #8 failed on my Mac on the first time I ran it. All following runs passed so I could not reproduce. Could this be caused by the random seed? @MatthijsMars @CosmoMatt

8/39 Test  #8: test_sdmm ............................................***Failed    0.49 sec
[2022-03-18 11:12:38.174] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.175] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.203] [sopt] [critical] Performing SDMM 
[2022-03-18 11:12:38.244] [sopt] [critical] Performing SDMM 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_sdmm is a Catch v2.13.7 host application.
Run with -? for options

Randomness seeded to: 1647601958

-------------------------------------------------------------------------------
SDMM with ||x - x0||_2 functions
  With different L
-------------------------------------------------------------------------------
/Users/tkoskela/git_repos/sopt/cpp/tests/sdmm.cc:211
...............................................................................

/Users/tkoskela/git_repos/sopt/cpp/tests/sdmm.cc:229: FAILED:
  CHECK( func(result) <= func(result + epsilon) )
with expansion:
  3.7534305808 <= 3.7534304842
with messages:
  result.transpose() :=    -1.647  0.311289   0.87681 -0.863347
  epsilon.transpose() :=      0 0.0001      0      0

/Users/tkoskela/git_repos/sopt/cpp/tests/sdmm.cc:230: FAILED:
  CHECK( func(result) <= func(result - epsilon) )
with expansion:
  3.7534305808 <= 3.7534304071
with messages:
  result.transpose() :=    -1.647  0.311289   0.87681 -0.863347
  epsilon.transpose() :=      0      0      0 0.0001

===============================================================================
test cases:  3 |  2 passed | 1 failed
assertions: 85 | 83 passed | 2 failed

Originally posted by @tkoskela in https://github.com/astro-informatics/sopt/issues/265#issuecomment-1072323584

[ ] TODO: Reproduce the failure(s) by inserting the random seed into the tests

tkoskela commented 2 years ago

From Matthjis:

Looking at the test I think that might possibly be caused by a bad random seed. Still, I think it is kind of strange since, this should always converge to a solution for a simple problem like this, but if the fault cannot be reproduced it might have been a particularly bad operator or data vector.

tkoskela commented 2 years ago

Another possible candidate in https://github.com/astro-informatics/sopt/runs/5886037546?check_suite_focus=true

tkoskela commented 2 years ago

Possible fix: Make tolerance of the test lower than the tolerance for convergence

tkoskela commented 2 years ago

One reason for errors is probably that different metrics are used in convergence (l1 norm, normalized against one of the vectors) and the Eigen isApprox method (l2 norm, normalized against the min of the vectors).

The original error in this issue is different. It seems that the iterations either 1) has not converged or 2) is using a too large step size in comparison to epsilon

tkoskela commented 2 years ago

@jasonmcewen thinks we can just fix the random seed to get rid of these issues.

SJaffa commented 2 years ago

So just find a set of actions that did pass, get it's seed and set that as a the seed for future actions? Should we still change the tolerances?

SJaffa commented 2 years ago

List of tests that fail intermittently:

[ ] test_sdmm (this issue, on Tuomas's Mac, and on GH actions)
[ ] test_power_method
[ ] test_primal_dual and earlier
[ ] test_proximal and earlier
[ ] test_credible_region

tkoskela commented 2 years ago

List of tests that fail intermittently:

[ ] test_sdmm (this issue, on Tuomas's Mac, and on GH actions)

[ ] test_power_method

[ ] test_primal_dual and earlier

[ ] test_proximal and earlier

Are there any other tests that are generating random numbers?

I would still first relax the tolerances and confirm that is a solution to our issue, so that at least we've correctly identified what's going on.

As a second step, let's fix the random seed for all the tests using random numbers. It's good practice to do anyways, it will make the tests easier to reproduce. I suppose we could test with a few different random seeds, but I'm not sure that is really worth it.

tkoskela commented 2 years ago

Strangely these all seem to have gone away in #281 and #286. I wonder if these are triggered by something else going wrong in the build. 🤷

tkoskela commented 2 years ago

Nevermind, got another one!

test_primal_dual error is in the 3rd decimal

astro-informatics / sopt

Random seed related test failure #270