implicitly fuzz RNG-dependent doctests with a random random seed

dimpase commented 4 years ago

Our documentation and tests include a variety of examples and tests involving randomness.

Those depend on a randomness seed, and up to Sage 9.1 they always used the same one: 0. Thus every run of sage -t or make [p]test[all][long] would run those "random" doctests deterministically. In many cases, the output of those tests even relied on that. As a result, random examples and tests were actually testing that they were run non-randomly!

This is reminiscent of an xkcd comic illustration of random number generators.

Based on these considerations and the related sage-devel discussion, we propose to:

allow specifying a randomness when running tests (#29962),
adapt tests involving randomness, making sure they test mathematical functionality independent of the randomness seed used to run them (see roadmap),
default to a random randomness seed when none is specified (present ticket),

thus making those tests more robust and more useful, by becoming more likely to reveal bugs in a variety of cases including corner cases.

The first step (see #29962, merged in Sage 9.2) adds a --random-seed flag to sage -t, allowing:

sage -t --long --random-seed=9876543210 src/sage/

Still, when no randomness seed is specified, the default seed 0 is used. This means most testers test with the same randomness seed, making "random" doctests still mostly deterministic in practice.

Here is a way to pick a random randomness seed and run tests with it (can be used to work on the tickets in the roadmap):

$ randseed() {
    sage -c "import sage.misc.randstate as randstate; \
             randstate.set_random_seed(); \
             print(randstate.initial_seed())";
    }
$ SEED=$(randseed)
$ DIR=src/sage
$ echo "$ sage -t --long --random-seed=${SEED} ${DIR}" \
  && ./sage -t --long --random-seed=${SEED} ${DIR}

Once examples and tests involving randomness have been adapted, the present ticket puts the final touch by making it so that when running tests with no seed specified, a random one will be used:

sage -t src/sage/all.py 
Running doctests with ID 2020-06-23-23-19-03-8003eea5.
...
sage -t --warn-long 89.5 --random-seed=273987373473433732996760115183658447263 src/sage/all.py
    [16 tests, 0.73 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 0.8 seconds
    cpu time: 0.7 seconds
    cumulative wall time: 0.7 seconds

Being displayed in the output, the seed used can be used again if needed:

sage -t --warn-long 89.5 --random-seed=273987373473433732996760115183658447263 src/sage/all.py

allowing to explore any problematic case revealed by running the tests with that seed.

Roadmap:

Allow fuzzing: #29962
Make all parts of sage ready for default fuzzing:
- 29945: coding
- 29963: geometry
- 29964: libs
- 29965: graphs
- 32107: groups
- 29967: interfaces
- 29968: algebras
- 29969: misc
- 29970: arith
- 29971: categories
- 29972: stats
- 29973: sets
- 29974: combinat
- 29975: numerical, probability
- 29976: matrix
- 29977: modular
- 29978: modules
- 29979: rings
- 29980: crypto
- 29981: documentation
- 29982: dynamics
- 29983: finance
- 29984: symbolic
- 29985: schemes
- 29986: plot
- 32188: various missed tests along the way
32216: Update the developers guide for implicitly fuzzing doctests
Finally make fuzzing default with this ticket.

Follow-up:

32544: Meta-ticket: Fix unstable doctests detected after #29935
Remove set_random_seed in doctests where possible.

Errors discovered by this ticket:

29936: Hyperbolic geometry bugs revealed by fuzzing
29945: Failing doctest in src/sage/coding/linear_code.py

(just a wrong doctest, will be fixed here)
29954: Unstable plotting
29956: Bug in KlyachkoBundle_class.random_deformation
29957: Bug in ContinuedFraction rounding
29958: Too many strong articulation points
29961: Random symbolic expression is completely unstable
30045: Bug in Reed-Solomon encoder and error-erasure decoder
Index error with random derangement (fixed in #29974)
31890: simplify_hypergeometric is unstable
31891: ZeroDivisonError when creating polynomial system
31892: Conic parametrization broken
32075: Polynomial generic power trunk broken
32083: Various errors with polybori including segmentation fault
32084 _nth_root_naive fails for integer mod
32085: Errors when computing norms of padic elements
32086: apply_homography unstable for continue fraction
32095: DiFUB algorithm fails on some random graph
32108: Fix random tree on one or less vertices
32109: Fix 0/0 in ore function field
32111: Unstable minimal polynomial for element of 2-adic Eisenstein Extension Field in pi defined by x^4 - 2*a
32117: Random relative number field checks only irreducibility over QQ
32118: AlgebraicForm checks invariance with random matrix that can be the identity
32124: SL2Z.random_element unstable
32125: random Ore polynomials do not respect minimum degree bound
32126: padic QpLC.random_element is broken
32127: gosper_iterator of continued fractions is unstable
32129: sage_input is unreliable for elements of ComplexField
32131: Cut width of graph with one edge incorrect
32132: Wrong gyration orbit length
32138: is_groebner fails over fraction fields
32141: Unstable doctest involving permutation groups
32169: Bug in edge disjoint spanning trees
32185: Failing weak order assertion on random symbolic expression
32186: Random bounded tolerance graph
32187: permutation group generated by list perms in L of degree n incorrect when compared to GAP
32657: plot_vector_field unstable

Depends on #32667

CC: @kliem @orlitzky @DaveWitteMorris @slel @mantepse

Component: doctest framework

Keywords: fuzz, random, seed

Author: Jonathan Kliem

Branch/Commit: 047379c

Reviewer: Michael Orlitzky

Issue created by migration from https://trac.sagemath.org/ticket/29935

dimpase commented 4 years ago

Description changed:

--- 
+++ 
@@ -5,3 +5,21 @@

 See  this thread: https://groups.google.com/d/msg/sage-devel/c4UbKSdt3Aw/UQAo1iYoAAAJ
 for more details.
+
+The doctest framework modification is trivial:
+
+```diff
+--- a/src/sage/doctest/forker.py
++++ b/src/sage/doctest/forker.py
+@@ -848,7 +848,7 @@ class SageDocTestRunner(doctest.DocTestRunner, object):
+             TestResults(failed=0, attempted=4)
+         """
+         self.setters = {}
+-        randstate.set_random_seed(0)
++        randstate.set_random_seed()
+         # scipy 1.18 introduced reprecation warnings on a number of things they are moving to
+         # numpy, e.g. DeprecationWarning: scipy.array is deprecated
+         #             and will be removed in SciPy 2.0.0, use numpy.array instead
+```
+
+but many doctests need to be fixed.

kliem commented 4 years ago

comment:3

Just so that we don't forget that we will have to update

https://doc.sagemath.org/html/en/developer/coding_basics.html

kliem commented 4 years ago

Work Issues: modify coding conventions

kliem commented 4 years ago

Dependencies: #29904

kliem commented 4 years ago

comment:5

geometry/hyperbolic_space/hyperbolic_geodesic.py makes some claims on how good approximations work, which doesn't seem to work. I had this thing run 7 times now and not one time did all tests pass. E.g. there is one test that gives me absolute errors of 0.7 when running this in the shell, but the test claims it is below 10**-9.

There is also #29904. Other than that the geometry module appears to ready.

Edit: And there is some place where you need to add set_random_seed(0) now or similar.

dimpase commented 4 years ago

comment:6

Replying to @kliem:

geometry/hyperbolic_space/hyperbolic_geodesic.py makes some claims on how good approximations work, which doesn't seem to work. I had this thing run 7 times now and not one time did all tests pass. E.g. there is one test that gives me absolute errors of 0.7 when running this in the shell, but the test claims it is below 10**-9.

this is a great example of fuzzing catching an apparent error, it seems.

There is also #29904. Other than that the geometry module appears to ready.

Edit: And there is some place where you need to add set_random_seed(0) now or similar.

kliem commented 4 years ago

New commits:

`d5fc5be`	`start from a "random" random seed for doctesting`
`5c7e562`	`fix double description of hypercube`
`6b41bdb`	`Merge branch 'public/29904' of git://trac.sagemath.org/sage into public/29935`
`b2954ce`	`fix random test in geometry/linear_expression`

kliem commented 4 years ago

Branch: public/29935

kliem commented 4 years ago

Commit: b2954ce

kliem commented 4 years ago

comment:8

Even in the developer guide there is an example, where it is confusing:

sage: M = matrix.identity(3) + random_matrix(RR,3,3)/10^3
sage: M^2 # abs tol 1e-2
[1 0 0]
[0 1 0]
[0 0 1]

There is no mentioning there that only one particular matrix is tested.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

dc49dd0 update developer guide

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from b2954ce to dc49dd0

kliem commented 4 years ago

Changed work issues from modify coding conventions to none

dimpase commented 4 years ago

comment:11

Replying to @kliem:

Even in the developer guide there is an example, where it is confusing:
sage: M = matrix.identity(3) + random_matrix(RR,3,3)/10^3
sage: M^2 # abs tol 1e-2
[1 0 0]
[0 1 0]
[0 0 1]
There is no mentioning there that only one particular matrix is tested.

indeed it assumes (?) that the entries of the random matrix are small in abs value

kliem commented 4 years ago

comment:12

Nothing to assume there. random_matrix(RR,3,3) has entries between -1 and 1. After dividing we are good.

kliem commented 4 years ago

Changed dependencies from #29904 to #29904, #29936

mkoeppe commented 4 years ago

comment:14

How about making the random seed an option to sage-runtests. Then people can fuzz the doctests if they want.

kliem commented 4 years ago

comment:15

Will people do that? Especially enough to find incorrect claims. If you look into #29936 we have been claiming a preciseness that is far from true. Even if you consider the relative error instead of the absolute error. fuzzed doctests would have easily detected that. Even worse. Maybe the situation was better at some point and we never caught on about the regression.

dimpase commented 4 years ago

comment:16

it is not a "real" (time-consuming) fuzzing, it is making the main random seed nondeterministic in tests. making it optional is akin to making tests optional.

we already found a couple of examples which show usefulness of this change in detecting bugs in Sage.

mkoeppe commented 4 years ago

comment:17

The option that I propose to add needs to be added in any case, so that when a failure is revealed by a random random seeds, we can reproduce the failure with that seed.

dimpase commented 4 years ago

comment:18

it is a good point - and the non-det. seed needs to be logged - if this is not yet in the branch it should get there.

mkoeppe commented 4 years ago

comment:19

So I would suggest to use this ticket to introduce the option, not change the default.

dimpase commented 4 years ago

comment:20

no, I don't agree, it makes no sense to test less if one can test more, and also not having it default would not force people to make their tests robust.

mkoeppe commented 4 years ago

comment:21

By the way I don't think most doctests have the ambition to demonstrate correctness of mathematical claims. In my opinion, if a doctest intends to do that, it should run an explicit loop of several tests. Still deterministically.

mkoeppe commented 4 years ago

comment:22

Why would creating the option and changing the default have to be done on the same ticket? I think it's best practices to separate the two steps on two tickets.

kliem commented 4 years ago

comment:23

Feel free to do with the branch whatever you think is reasonable.

kliem commented 4 years ago

comment:24

What I think would be great if running we could run each file at some "random" seed, e.g.

sage -t src/sage/all.py 
Running doctests with ID 2020-06-22-18-56-54-668a26f3.
Git branch: public/29935
Using --optional=4ti2,bliss,build,dochtml,e_antic,flask,flask_autoindex,flask_babel,flask_oldsessions,flask_openid,flask_silk,latte_int,lidia,lrslib,memlimit,normaliz,pynormaliz,python_openid,sage,speaklater,twisted
Doctesting 1 file.
sage -t --warn-long 91.5 --seed 123456 src/sage/all.py
    [16 tests, 0.76 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 0.8 seconds
    cpu time: 0.8 seconds
    cumulative wall time: 0.8 seconds

You can reproduce this test with

sage -t --warn-long 91.5 --seed 123456 src/sage/all.py

Is a bit annoying, as seeds can be long though. Then whenever a bot or someone discovers a failure, you can reproduce it. At least if it really was caused by the random seed.

orlitzky commented 4 years ago

comment:25

Just to weigh in:

The default should be to use a randomly-chosen seed.
Failures should be reproducible: a. It should be possible to pass a seed to the test runner with e.g. --seed. b. Any test failures should log/echo the current seed so that the failing test can be reproduced easily using the option in (2.a.).

The sooner we can push this through the better. Tests that don't work due to bugs can even be deleted and replaced with trac tickets saying "this never worked" IMO.

kliem commented 4 years ago

comment:26

With #29904 (already pulled into this branch), #29936 (not yet pulled) and b2954ceea638b4510e99d90cea3f99fa9a49338 (commit from above) geometry should be ready for this.

Edit: The updated branch of #29904 (not yet pulled) doesn't have set_random_seed anymore. But there is plenty of set_random_seed for cones doctests? Should I remove those as well or do we do this somewhere else?

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bac708e make graphs ready for fuzz doctests

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from dc49dd0 to bac708e

mkoeppe commented 4 years ago

comment:28

Replying to @kliem:

there is plenty of set_random_seed for cones doctests? Should I remove those as well or do we do this somewhere else?

Not on the same ticket.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from bac708e to b5e9896

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

b5e9896 make algebras fuzz ready

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from b5e9896 to 8babc2a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

8babc2a make arith fuzz ready

kliem commented 4 years ago

comment:31

Please tell me, if I should open seperate tickets for those or something else. I just felt like starting with those. I'm done for today.

mwageringel commented 4 years ago

comment:32

Replying to @orlitzky:

The default should be to use a randomly-chosen seed.

Failures should be reproducible: a. It should be possible to pass a seed to the test runner with e.g. --seed. b. Any test failures should log/echo the current seed so that the failing test can be reproduced easily using the option in (2.a.).

+1.

Also note that Sage still uses a verbatim copy of the Python-2 version of the Random module during doctesting, which was necessary in order to obtain consistent results between Python 2 and 3. This is why the bug in #29550 went unnoticed, for example. Now that support for Python 2 has been removed, we should switch to the Python 3 version of Random. This will require updating lots of doctests, but I assume this ticket here will simplify this if it makes doctests less dependent on the seed/RNG.

kliem commented 4 years ago

comment:33

Apparently LinearCodeSyndromeDecoder.decode_to_code is broken.

See #29945.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from 8babc2a to c7d16e9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`815c288`	`make categories fuzz ready`
`c7d16e9`	`make coding fuzz ready`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from c7d16e9 to e5c54af

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

`39edfdc`	`make random seed reproducible`
`e5c54af`	`document random_seed`

kliem commented 4 years ago

Description changed:

--- 
+++ 
@@ -6,20 +6,28 @@
 See  this thread: https://groups.google.com/d/msg/sage-devel/c4UbKSdt3Aw/UQAo1iYoAAAJ
 for more details.

-The doctest framework modification is trivial:
+If not set, a random seed will be chosen for doctesting:

-```diff
---- a/src/sage/doctest/forker.py
-+++ b/src/sage/doctest/forker.py
-@@ -848,7 +848,7 @@ class SageDocTestRunner(doctest.DocTestRunner, object):
-             TestResults(failed=0, attempted=4)
-         """
-         self.setters = {}
--        randstate.set_random_seed(0)
-+        randstate.set_random_seed()
-         # scipy 1.18 introduced reprecation warnings on a number of things they are moving to
-         # numpy, e.g. DeprecationWarning: scipy.array is deprecated
-         #             and will be removed in SciPy 2.0.0, use numpy.array instead
+```
+sage -t src/sage/all.py 
+Running doctests with ID 2020-06-23-23-19-03-8003eea5.
+...
+sage -t --warn-long 89.5 --random_seed=273987373473433732996760115183658447263 src/sage/all.py
+    [16 tests, 0.73 s]
+----------------------------------------------------------------------
+All tests passed!
+----------------------------------------------------------------------
+Total time for all tests: 0.8 seconds
+    cpu time: 0.7 seconds
+    cumulative wall time: 0.7 seconds

-but many doctests need to be fixed. +This can then be reproduced with + + +sage -t --warn-long 89.5 --random_seed=273987373473433732996760115183658447263 src/sage/all.py + + +Errors discovered by this ticket: +- #29939: Fix precicion for hyperbolic geodesic +- #29945: Failing doctest in src/sage/coding/linear_code.py

kliem commented 4 years ago

comment:37

I specified random_seed to distinguish from --randorder=SEED. If after --randorder=SEED the next option is --seed=SEED that sounds a bit weird to me. Although random_seed is not much better.

kliem commented 4 years ago

comment:38

Standard bracked lyndon words are set up incorrectly. I don't know if that should be consider a bug:

sage: SBLW33 = StandardBracketedLyndonWords(3,3)
sage: TestSuite(SBLW33).run()
Failure in _test_an_element:
Traceback (most recent call last):
  File "/home/jonathan/Applications/sage/local/lib/python3.7/site-packages/sage/misc/sage_unittest.py", line 296, in run
    test_method(tester=tester)
  File "/home/jonathan/Applications/sage/local/lib/python3.7/site-packages/sage/categories/sets_cat.py", line 1093, in _test_an_element
    tester.assertTrue(an_element in self, "self.an_element() is not in self")
  File "/home/jonathan/Applications/sage/local/lib/python3.7/unittest/case.py", line 692, in assertTrue
    raise self.failureException(msg)
AssertionError: self.an_element() is not in self
------------------------------------------------------------
Failure in _test_enumerated_set_contains:
Traceback (most recent call last):
  File "/home/jonathan/Applications/sage/local/lib/python3.7/site-packages/sage/misc/sage_unittest.py", line 296, in run
    test_method(tester=tester)
  File "/home/jonathan/Applications/sage/local/lib/python3.7/site-packages/sage/categories/enumerated_sets.py", line 945, in _test_enumerated_set_contains
    tester.assertIn(w, self)
  File "/home/jonathan/Applications/sage/local/lib/python3.7/unittest/case.py", line 1106, in assertIn
    self.fail(self._formatMessage(msg, standardMsg))
  File "/home/jonathan/Applications/sage/local/lib/python3.7/unittest/case.py", line 680, in fail
    raise self.failureException(msg)
AssertionError: [1, [1, 2]] not found in Standard bracketed Lyndon words from an alphabet of size 3 of length 3
------------------------------------------------------------
Failure in _test_some_elements:
Traceback (most recent call last):
  File "/home/jonathan/Applications/sage/local/lib/python3.7/site-packages/sage/misc/sage_unittest.py", line 296, in run
    test_method(tester=tester)
  File "/home/jonathan/Applications/sage/local/lib/python3.7/site-packages/sage/categories/sets_cat.py", line 1385, in _test_some_elements
    "the object %s in self.some_elements() is not in self")%(x,))
  File "/home/jonathan/Applications/sage/local/lib/python3.7/unittest/case.py", line 692, in assertTrue
    raise self.failureException(msg)
AssertionError: the object [1, [1, 2]] in self.some_elements() is not in self
------------------------------------------------------------
The following tests failed: _test_an_element, _test_enumerated_set_contains, _test_some_elements

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

93415f3 make combinat fuzz ready

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from e5c54af to 93415f3

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Changed commit from 93415f3 to dbd7f6c

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 4 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

dbd7f6c fix doctest in sage/coding/linear_code

sagemath / sage

implicitly fuzz RNG-dependent doctests with a random random seed #29935

29945: coding

29963: geometry

29964: libs

29965: graphs

32107: groups

29967: interfaces

29968: algebras

29969: misc

29970: arith

29971: categories

29972: stats

29973: sets

29974: combinat

29975: numerical, probability

29976: matrix

29977: modular

29978: modules

29979: rings

29980: crypto

29981: documentation

29982: dynamics

29983: finance

29984: symbolic

29985: schemes

29986: plot

32188: various missed tests along the way

32216: Update the developers guide for implicitly fuzzing doctests

32544: Meta-ticket: Fix unstable doctests detected after #29935

29936: Hyperbolic geometry bugs revealed by fuzzing

29945: Failing doctest in src/sage/coding/linear_code.py

29954: Unstable plotting

29956: Bug in KlyachkoBundle_class.random_deformation

29957: Bug in ContinuedFraction rounding

29958: Too many strong articulation points

29961: Random symbolic expression is completely unstable

30045: Bug in Reed-Solomon encoder and error-erasure decoder

31890: simplify_hypergeometric is unstable

31891: ZeroDivisonError when creating polynomial system

31892: Conic parametrization broken

32075: Polynomial generic power trunk broken

32083: Various errors with polybori including segmentation fault

32084 _nth_root_naive fails for integer mod

32085: Errors when computing norms of padic elements

32086: apply_homography unstable for continue fraction

32095: DiFUB algorithm fails on some random graph

32108: Fix random tree on one or less vertices

32109: Fix 0/0 in ore function field

32111: Unstable minimal polynomial for element of 2-adic Eisenstein Extension Field in pi defined by x^4 - 2*a

32117: Random relative number field checks only irreducibility over QQ

32118: AlgebraicForm checks invariance with random matrix that can be the identity

32124: SL2Z.random_element unstable

32125: random Ore polynomials do not respect minimum degree bound

32126: padic QpLC.random_element is broken

32127: gosper_iterator of continued fractions is unstable

32129: sage_input is unreliable for elements of ComplexField

32131: Cut width of graph with one edge incorrect

32132: Wrong gyration orbit length

32138: is_groebner fails over fraction fields

32141: Unstable doctest involving permutation groups

32169: Bug in edge disjoint spanning trees

32185: Failing weak order assertion on random symbolic expression

32186: Random bounded tolerance graph

32187: permutation group generated by list perms in L of degree n incorrect when compared to GAP

32657: plot_vector_field unstable

29945: Failing doctest in `src/sage/coding/linear_code.py`

29956: Bug in `KlyachkoBundle_class.random_deformation`

29957: Bug in `ContinuedFraction` rounding

31891: `ZeroDivisonError` when creating polynomial system

32084 `_nth_root_naive` fails for integer mod

32657: `plot_vector_field` unstable