flintlib / python-flint

Python bindings for Flint and Arb
MIT License
131 stars 26 forks source link

doctests for `flint.types.arb` take a very long time #184

Closed GiacomoPope closed 1 month ago

GiacomoPope commented 2 months ago

They're about 99% of all the testing time for me, maybe we should modify them?

oscarbenjamin commented 2 months ago

I think that there are just one or two doctests that take almost all of the time. We could mark them to be skipped.

I often run the tests like python -m flint.test -t which skips all the doctests. Generally I want to get the coverage from normal tests: doctests are just for testing the docs.

This takes under a second:

% time spin run -- python -m flint.test -tq
$ meson compile -C build
$ meson install --only-changed -C build --destdir ../build-install
Running tests...
flint.test: all 52 tests passed!
----------------------------------------
OK: Your installation of python-flint seems to be working just fine!
----------------------------------------
spin run -- python -m flint.test -tq  0.56s user 0.11s system 78% cpu 0.855 total
GiacomoPope commented 2 months ago

Yeah I also generally run with --test to avoid the doctests but I just thought I'd flag this as I think there's just a few slow functions which we could maybe speed up by using different params

oscarbenjamin commented 2 months ago

I'm sure that the slow tests could be made faster by using different parameters but I also imagine that the parameters were chosen deliberately to demonstrate that arb can solve some difficult problem.

This is why I don't like using doctests as main tests. Inevitably you end up with this tension where the examples that are useful in the documentation are not good choices for the test suite or vice versa. In this case the issue is that it makes sense to demonstrate a slow thing in the docs but we probably don't get much value from running that slow thing in the standard test suite every time. In other cases we will want to test edge cases but it usually doesn't make sense to have docs that are full of examples showing trivial output for edge cases.

Generally it is better if we keep a clear distinction between main tests and doctests. The purpose of the main tests is to test that the code works. The purpose of the docs is to communicate things to users. The purpose of the doctests is to test that the docs are correct which helps ensure that the docs are useful to users.

GiacomoPope commented 2 months ago

Okay, I'm happy to close this then if you are :)

oscarbenjamin commented 2 months ago

Another problem here is that actually every doctest gets run twice. I'm not sure what causes that...

oscarbenjamin commented 2 months ago

every doctest gets run twice

Actually I'm not sure about this any more...

In any case this is the slow test. If we skip it then the time to rebuild (if code not changed), run the all tests and doctests with spin run python -m flint.test goes from 100 seconds to 7 seconds on this machine:

diff --git a/src/flint/types/arb.pyx b/src/flint/types/arb.pyx
index 3a28a4e..8a17058 100644
--- a/src/flint/types/arb.pyx
+++ b/src/flint/types/arb.pyx
@@ -2239,7 +2239,7 @@ cdef class arb(flint_scalar):
             >>> from flint import showgood
             >>> showgood(lambda: arb("9/10").hypgeom_2f1(arb(2).sqrt(), 0.5, arb(2).sqrt()+1.5, abc=True), dps=25)
             1.447530478120770807945697
-            >>> showgood(lambda: arb("9/10").hypgeom_2f1(arb(2).sqrt(), 0.5, arb(2).sqrt()+1.5), dps=25)
+            >>> showgood(lambda: arb("9/10").hypgeom_2f1(arb(2).sqrt(), 0.5, arb(2).sqrt()+1.5), dps=25) # doctest: +SKIP
             Traceback (most recent call last):
               ...
             ValueError: no convergence (maxprec=960, try higher maxprec)
GiacomoPope commented 2 months ago

every doctest gets run twice

I got really confused about this too, i think it's run once but for some reason the error is printed twice?

Jake-Moss commented 1 month ago

Whoops wasn't aware of this until now. This is now fixed by https://github.com/flintlib/python-flint/pull/216 (definition of scope creep but oh well).

Doctest was finding and running each doc test in module only once, each doc test just happened to be duplicated in cython __test__ dictionary as well as on the embedded docstring. It also contains a commit to skip the offending, slow running arb.hypgeom_2f1 convergence doc test. This brings by total test time down run 21s to 2.5s and interoperates the doctests into the main pytest running suite. Doctest modules no longer need to be manually defined

oscarbenjamin commented 1 month ago

This is now fixed. Thanks!