Closed: oscarbenjamin closed this issue 5 years ago
I'm able to reproduce this on linux. Watching it, the process appears to take ~3GB of RAM before segfaulting:
$ pytest sympy/integrals/rubi/rubi_tests/tests/test_trinomials.py
============================= test session starts ==============================
platform linux -- Python 3.6.6, pytest-4.0.0, py-1.7.0, pluggy-0.8.0
architecture: 64-bit
cache: yes
ground types: python
rootdir: /tmp/sympy, inifile:
collecting ... Segmentation fault (core dumped)
Analyzing the core dump (without a debug binary; spinning up docker right now to use a debug build), it appears to be a stack overflow:
... (170k identical frames removed)
#173931 0x0000000000506ac3 in ?? ()
#173932 0x0000000000506ac3 in ?? ()
#173933 0x0000000000506ac3 in ?? ()
#173934 0x0000000000506ac3 in ?? ()
#173935 0x0000000000506ac3 in ?? ()
#173936 0x0000000000506ac3 in ?? ()
#173937 0x0000000000506ac3 in ?? ()
#173938 0x0000000000506ac3 in ?? ()
#173939 0x0000000000506ac3 in ?? ()
#173940 0x0000000000506ac3 in ?? ()
#173941 0x0000000000506ac3 in ?? ()
#173942 0x0000000000506ac3 in ?? ()
#173943 0x0000000000506ac3 in ?? ()
#173944 0x0000000000506ac3 in ?? ()
#173945 0x0000000000506ac3 in ?? ()
#173946 0x0000000000506ac3 in ?? ()
#173947 0x0000000000506ac3 in ?? ()
#173948 0x0000000000506ac3 in ?? ()
#173949 0x0000000000508c4f in ?? ()
#173950 0x000000000050d16d in ?? ()
#173951 0x000000000050ffaf in ?? ()
#173952 0x00000000005fac2b in PyAST_CompileObject ()
#173953 0x0000000000639ce5 in ?? ()
#173954 0x000000000051190a in ?? ()
#173955 0x00000000004f6070 in _PyEval_EvalFrameDefault ()
#173956 0x0000000000510c78 in ?? ()
#173957 0x00000000005119bd in ?? ()
#173958 0x00000000004f5277 in _PyEval_EvalFrameDefault ()
... uninteresting frames removed
This doesn't reproduce with --assert=plain, so it is definitely triggered / compounded by the assertion rewriting:
$ python -m pytest --assert=plain sympy/integrals/rubi/rubi_tests/tests/test_trinomials.py
============================= test session starts ==============================
platform linux -- Python 3.6.6, pytest-4.0.0, py-1.7.0, pluggy-0.8.0
architecture: 64-bit
cache: yes
ground types: python
rootdir: /tmp/sympy, inifile:
collected 5 items
sympy/integrals/rubi/rubi_tests/tests/test_trinomials.py sssss [100%]
========================== 5 skipped in 0.70 seconds ===========================
With python3-dbg I get a backtrace that looks like:
(gdb) bt
#0 0x000000000053a4d1 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104ea40,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4787
#1 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e8b0,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#2 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e900,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#3 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e950,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#4 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e7c0,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#5 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e810,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#6 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e860,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#7 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e6d0,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
#8 0x000000000053a4f7 in dfs (c=c@entry=0x7fffffff7ec0, b=0x7fff1104e720,
a=a@entry=0x7fffffff7d70) at ../Python/compile.c:4791
which points at this: https://github.com/python/cpython/blob/4cf1f54eb764f856fda5a394d8598c0c90d69d77/Python/compile.c#L4791
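That dfs() walks the compiler's basic-block graph recursively, so its depth grows with the number of blocks in a single code object. A hedged sketch (not from the issue; the helper name and thresholds are illustrative) of how that path can be exercised without pytest:

```python
def many_ifs(n):
    # Each `if` statement adds basic blocks to the compiled code object; on
    # CPython <= 3.6 the recursive dfs() in Python/compile.c then recurses to
    # a depth roughly proportional to n when compiling the module.
    return "\n".join("if flag:\n    x = %d" % i for i in range(n))

# A modest n compiles fine on any version:
code = compile(many_ifs(100), "<generated>", "exec")

# On 3.6, pushing n toward ~100000 could overflow the C stack and segfault,
# which is what the million-line rewritten module effectively did; 3.7.1
# reworked the traversal (bpo-31113).
```

This would also explain why --assert=plain avoids the crash: the un-rewritten module has far fewer basic blocks per code object.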
Okay, so what does that mean? Is this a bug in pytest, cpython or both?
If you can see a simple way to reproduce this without using pytest I'll take that as a bug report to cpython.
(Of course the test_trinomials.py file itself is questionable, but that doesn't let pytest/cpython off the hook for segfaulting.)
it's possibly an interaction issue where the ast we generate mismatches a potentially hidden/changed expectation of cpython -- more investigation is needed
The ~2891-line file expands to ~1029102 lines (yes, over a million) when assert-rewritten.
For example, this (relative to the rest of the file) small assert:
assert rubi_test(rubi_integrate(x**m*(b*x**S(2) + c*x**S(4)), x), x, b*x**(m + S(3))/(m + S(3)) + c*x**(m + S(5))/(m + S(5)), expand=True, _diff=True, _numerical=True)
is rewritten to this (ignore the invalid variable names; since pytest writes directly to the ast, the names don't actually matter -- the @py_ prefix is used to avoid clashing with user-space variables):
@py_assert4 = (x ** m)
@py_assert8 = 2
@py_assert10 = S(@py_assert8)
@py_assert12 = (x ** @py_assert10)
@py_assert13 = (b * @py_assert12)
@py_assert17 = 4
@py_assert19 = S(@py_assert17)
@py_assert21 = (x ** @py_assert19)
@py_assert22 = (c * @py_assert21)
@py_assert23 = (@py_assert13 + @py_assert22)
@py_assert24 = (@py_assert4 * @py_assert23)
@py_assert26 = rubi_integrate(@py_assert24, x)
@py_assert33 = 3
@py_assert35 = S(@py_assert33)
@py_assert37 = (m + @py_assert35)
@py_assert38 = (x ** @py_assert37)
@py_assert39 = (b * @py_assert38)
@py_assert42 = 3
@py_assert44 = S(@py_assert42)
@py_assert46 = (m + @py_assert44)
@py_assert47 = (@py_assert39 / @py_assert46)
@py_assert52 = 5
@py_assert54 = S(@py_assert52)
@py_assert56 = (m + @py_assert54)
@py_assert57 = (x ** @py_assert56)
@py_assert58 = (c * @py_assert57)
@py_assert61 = 5
@py_assert63 = S(@py_assert61)
@py_assert65 = (m + @py_assert63)
@py_assert66 = (@py_assert58 / @py_assert65)
@py_assert67 = (@py_assert47 + @py_assert66)
@py_assert68 = True
@py_assert70 = True
@py_assert72 = True
@py_assert74 = rubi_test(@py_assert26, x, @py_assert67, expand=@py_assert68, _diff=@py_assert70, _numerical=@py_assert72)
if (not @py_assert74):
@py_format76 = (('' + 'assert %(py75)s\n{%(py75)s = %(py0)s(%(py27)s\n{%(py27)s = %(py1)s(((%(py2)s ** %(py3)s) * ((%(py5)s * (%(py6)s ** %(py11)s\n{%(py11)s = %(py7)s(%(py9)s)\n})) + (%(py14)s * (%(py15)s ** %(py20)s\n{%(py20)s = %(py16)s(%(py18)s)\n})))), %(py25)s)\n}, %(py28)s, (((%(py29)s * (%(py30)s ** (%(py31)s + %(py36)s\n{%(py36)s = %(py32)s(%(py34)s)\n}))) / (%(py40)s + %(py45)s\n{%(py45)s = %(py41)s(%(py43)s)\n})) + ((%(py48)s * (%(py49)s ** (%(py50)s + %(py55)s\n{%(py55)s = %(py51)s(%(py53)s)\n}))) / (%(py59)s + %(py64)s\n{%(py64)s = %(py60)s(%(py62)s)\n}))), expand=%(py69)s, _diff=%(py71)s, _numerical=%(py73)s)\n}') % {
'py0': (@pytest_ar._saferepr(rubi_test) if (('rubi_test' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(rubi_test)) else 'rubi_test'),
'py1': (@pytest_ar._saferepr(rubi_integrate) if (('rubi_integrate' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(rubi_integrate)) else 'rubi_integrate'),
'py2': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py3': (@pytest_ar._saferepr(m) if (('m' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(m)) else 'm'),
'py5': (@pytest_ar._saferepr(b) if (('b' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(b)) else 'b'),
'py6': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py7': (@pytest_ar._saferepr(S) if (('S' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(S)) else 'S'),
'py9': @pytest_ar._saferepr(@py_assert8),
'py11': @pytest_ar._saferepr(@py_assert10),
'py14': (@pytest_ar._saferepr(c) if (('c' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(c)) else 'c'),
'py15': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py16': (@pytest_ar._saferepr(S) if (('S' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(S)) else 'S'),
'py18': @pytest_ar._saferepr(@py_assert17),
'py20': @pytest_ar._saferepr(@py_assert19),
'py25': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py27': @pytest_ar._saferepr(@py_assert26),
'py28': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py29': (@pytest_ar._saferepr(b) if (('b' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(b)) else 'b'),
'py30': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py31': (@pytest_ar._saferepr(m) if (('m' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(m)) else 'm'),
'py32': (@pytest_ar._saferepr(S) if (('S' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(S)) else 'S'),
'py34': @pytest_ar._saferepr(@py_assert33),
'py36': @pytest_ar._saferepr(@py_assert35),
'py40': (@pytest_ar._saferepr(m) if (('m' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(m)) else 'm'),
'py41': (@pytest_ar._saferepr(S) if (('S' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(S)) else 'S'),
'py43': @pytest_ar._saferepr(@py_assert42),
'py45': @pytest_ar._saferepr(@py_assert44),
'py48': (@pytest_ar._saferepr(c) if (('c' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(c)) else 'c'),
'py49': (@pytest_ar._saferepr(x) if (('x' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(x)) else 'x'),
'py50': (@pytest_ar._saferepr(m) if (('m' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(m)) else 'm'),
'py51': (@pytest_ar._saferepr(S) if (('S' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(S)) else 'S'),
'py53': @pytest_ar._saferepr(@py_assert52),
'py55': @pytest_ar._saferepr(@py_assert54),
'py59': (@pytest_ar._saferepr(m) if (('m' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(m)) else 'm'),
'py60': (@pytest_ar._saferepr(S) if (('S' in @py_builtins.locals()) or @pytest_ar._should_repr_global_name(S)) else 'S'),
'py62': @pytest_ar._saferepr(@py_assert61),
'py64': @pytest_ar._saferepr(@py_assert63),
'py69': @pytest_ar._saferepr(@py_assert68),
'py71': @pytest_ar._saferepr(@py_assert70),
'py73': @pytest_ar._saferepr(@py_assert72),
'py75': @pytest_ar._saferepr(@py_assert74),
})
raise AssertionError(@pytest_ar._format_explanation(@py_format76))
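To see why the output balloons like this, a toy transformer helps. The sketch below is not pytest's rewriter, just a minimal illustration of the idea: each operand of the assert is hoisted into a temporary so its value can be reported on failure, which is how a 20-operation assert fans out into dozens of assignments.

```python
import ast

class ToyAssertRewriter(ast.NodeTransformer):
    """Minimal sketch of the idea behind pytest's rewriter. The real one
    handles arbitrary expressions, boolean ops, saferepr, and much more."""

    def visit_Assert(self, node):
        test = node.test
        # Only handle the simple `assert <left> == <right>` shape here.
        if not (isinstance(test, ast.Compare) and len(test.ops) == 1
                and isinstance(test.ops[0], ast.Eq)):
            return node
        template = (
            "_py_left = __L__\n"
            "_py_right = __R__\n"
            "if not (_py_left == _py_right):\n"
            "    raise AssertionError('assert %r == %r' % (_py_left, _py_right))"
        )
        new = ast.parse(template).body
        new[0].value = test.left            # splice in the original operands
        new[1].value = test.comparators[0]
        for stmt in new:
            ast.copy_location(stmt, node)
            ast.fix_missing_locations(stmt)
        return new

tree = ToyAssertRewriter().visit(ast.parse("assert 1 + 1 == 3"))
code = compile(ast.fix_missing_locations(tree), "<rewritten>", "exec")
try:
    exec(code, {})
except AssertionError as exc:
    print(exc)  # -> assert 2 == 3
```

Every intermediate value becomes a named temporary plus a repr lookup in the failure message, so the rewritten size grows with the operation count of each assert, not just the number of asserts.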
Some of the assertions cause pytest to produce 3500+ variables for the assertion message:
$ grep '^ *@py_assert35[0-9][0-9]' f.py | tail -1
@py_assert3562 = rubi_test(@py_assert42, x, @py_assert3555, expand=@py_assert3556, _diff=@py_assert3558, _numerical=@py_assert3560)
I suspect that's just too much code for python to handle (especially in so few functions)
Issue 31113 for cpython looks similar to this. Apparently a fix went into Python 3.7.1. I've just tested it with Python 3.7.1 and it works:
$ pytest sympy/integrals/rubi/rubi_tests/tests/test_trinomials.py
==================================================================== test session starts =====================================================================
platform darwin -- Python 3.7.1, pytest-4.0.0, py-1.7.0, pluggy-0.8.0
architecture: 64-bit
cache: yes
ground types: python
rootdir: /Users/enojb/current/sympy/sympy, inifile:
collected 5 items
sympy/integrals/rubi/rubi_tests/tests/test_trinomials.py sssss [100%]
================================================================= 5 skipped in 99.38 seconds =================================================================
So it looks as if CPython has improved its ability to handle large files. It's still possible that pytest shouldn't be generating them though.
code generation is how pytest makes pretty assertions, I don't think there's a way out of that. Since this is fixed in cpython, I'm going to close this.
Thanks for the issue!
Well... when I say that it works, it does take a long time and consume a lot of memory just to collect these tests (and then skip them!).
I would expect other situations where this causes problems if pytest has no limits on the code it generates.
your example here is pretty exceptional: a test containing 3k lines of 20+ operation assertions 😆 -- I wouldn't expect anyone to hit this normally
we ought to consider proposing an opt-out for files with high-cost patterns like the one @oscarbenjamin has in the testsuite
an assertions-per-file threshold could help
@oscarbenjamin could you take a look at which parts of your file actually cause the issues (aka is it one special test with an exorbitantly complex assert statement, or is it the sheer number?)
based on this assessment we can work out something sensible
@nicoddemus @asottile i do wonder if it would be a sensible approach to switch from plain assertions to a metatool where only the unaltered code of the assertion is executed on top of a special namespace
this would reduce the complexity of the ast significantly, moving the cost to different places instead
> we ought to consider proposing an opt-out for files with high-cost patterns like the one @oscarbenjamin has in the testsuite
> an assertions-per-file threshold could help
Is the rewriting done per function or per module? Not rewriting skipped tests would solve the problem for this particular module.
> @oscarbenjamin could you take a look at which parts of your file actually cause the issues (aka is it one special test with an exorbitantly complex assert statement, or is it the sheer number?)
> based on this assessment we can work out something sensible
Yeah, I can take a look. How can I see the output that pytest would generate from assert rewriting?
It would be nice if there were a command-line flag for that:
$ pytest sympy/solvers/tests/test_ode.py -k test_classify_ode --show-assert-rewrite
> @nicoddemus @asottile i do wonder if it would be a sensible approach to switch from plain assertions to a metatool where only the unaltered code of the assertion is executed on top of a special namespace
> this would reduce the complexity of the ast significantly, moving the cost to different places instead
It's probably worth considering the cost of assertion rewriting in less pathological cases as well as the example in this issue. A more typical test run for me would be:
$ git clone https://github.com/sympy/sympy.git
$ cd sympy/
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py
With that I get:
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py --assert=plain # 329 seconds
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py # 334 seconds
The additional time is the same when running tests in parallel with pytest-xdist
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py -n 3 --assert=plain # 193 seconds
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py -n 3 # 206 seconds
So it looks as if assert-rewriting test_ode.py takes 15 seconds and can't be parallelised. Hopefully I will make the tested code more efficient, in which case that extra cost would become relatively more significant. I need to improve the test coverage first though :).
It's also worth noting that if assert-rewriting slows every tested module then when running a large test suite to see a (hopefully!) small number of fails it would be quicker to run without assert rewriting and then to rerun only the failed tests with rewriting at the end.
xdist likely compounds this as it discovers multiple times in each worker -- given your linear discovery overhead the rewrite seems to (only) take ~5 seconds. looking at that file it's not surprising why -- it has similar complex patterns to the problem-file from before (lots of assertion expressions with high numbers of locals and operations in the assertion)
assertion rewriting is done per module -- this is the most basic unit of the python import system (and the cacheable unit via pyc files)
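Concretely, the rewriting rides on the import machinery: pytest installs an import hook (AssertionRewritingHook) whose loader compiles each test module from a transformed AST, and the result is cached as an ordinary .pyc. A bare-bones sketch of that hook point (illustrative only; RewritingLoader is a made-up name):

```python
import ast
import importlib.machinery

class RewritingLoader(importlib.machinery.SourceFileLoader):
    """Sketch of the hook point pytest uses: compile the module from a
    (potentially transformed) AST; the resulting bytecode is cached as a
    normal .pyc by the surrounding SourceLoader machinery."""

    def source_to_code(self, data, path, *, _optimize=-1):
        tree = ast.parse(data)
        # A real rewriter (pytest's AssertionRewriter) transforms `tree` here.
        return compile(tree, path, "exec", dont_inherit=True, optimize=_optimize)
```

Because the unit of compilation (and of .pyc caching) is the module, rewriting cannot easily skip individual functions such as skipped tests.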
I displayed the code by applying a patch similar to this one
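For anyone wanting the same view without patching, a hedged sketch using pytest's private rewriter API (the function and its signature are internal and have changed across releases, so treat this as illustrative only):

```python
import ast
from _pytest.assertion.rewrite import rewrite_asserts  # private pytest API!

source = b"def test_demo():\n    assert 1 + 1 == 3\n"
tree = ast.parse(source)
try:
    rewrite_asserts(tree, source)  # newer pytest takes (mod, source, ...)
except TypeError:
    rewrite_asserts(tree)          # older releases took just the module
print(ast.unparse(tree))           # Python 3.9+; use the astor package before that
```

The printed output contains the @py_assert temporaries shown above (as pseudo-source, since @-prefixed names are not valid Python).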
I'm not convinced normal test suites see this extreme level of overhead.
On pytest's testsuite the total overhead is ~3 seconds (I took best-of-5 for both measurements below):
$ git clean -fxfd testing >& /dev/null && time pytest testing/ --collect-only >& /dev/null
real 0m4.965s
user 0m4.654s
sys 0m0.084s
$ git clean -fxfd testing >& /dev/null && time pytest testing/ --collect-only --assert=plain >& /dev/null
real 0m2.122s
user 0m2.020s
sys 0m0.087s
pytest has ~4000 assertions
On pre-commit's testsuite (the largest I "own": ~550 asserts) the overhead is .5s (1.5s vs 1.0s).
collection coordination for xdist is a desired feature simply because right now the import system costs are problematic for some types of project (i believe cryptography is affected as well CC @alex )
I reran my timings for
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py --assert=plain
$ pytest -m 'not slow' sympy/solvers/tests/test_ode.py
and after 5 runs of each I don't see a significant difference. Inter-run variability is bigger than the difference between using and not using --assert=plain. So I think the timings I referred to above are a red herring...
I will still take a look at the issue with test_trinomials.py in more detail later. I think that what is happening (apart from the file being large) is that particularly complex assert expressions produce large rewritten output, but the extra output is probably not useful past a certain point. I'll work through it a bit and open a separate issue since this one is resolved.
@oscarbenjamin note that I clean pyc files, which is why you're probably not seeing a difference now. I imagine the difference will come back if you clean those out.
If I switch between using / not using --assert=plain, do I need to delete the .pyc files for that setting to take effect?
The command-line parameter will take effect immediately, but you'll only see the rewrite overhead once, as it gets cached into pyc files.
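So when benchmarking with and without rewriting, clear the caches between runs; a small helper for that (not part of pytest; the name is made up):

```python
import pathlib
import shutil

def clear_pycache(root="."):
    """Delete __pycache__ directories under `root` so the next pytest run
    pays the assertion-rewrite cost again instead of loading cached .pyc."""
    removed = 0
    for d in pathlib.Path(root).rglob("__pycache__"):
        if d.is_dir():
            shutil.rmtree(d)
            removed += 1
    return removed
```

Equivalently, `git clean -fxfd testing` as used in the timings above removes the caches along with any other untracked files.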
@RonnyPfannschmidt as of the last time we checked (though it's been a while; cc: @reaperhulk), that's right -- despite having 100k tests, the cryptography test suite doesn't get a benefit from xdist because of this.
Our issues with pytest-xdist are more related to memory than anything else. Test collection (on an i7-8850H) for cryptography takes ~20s and without assert rewrite it's ~19s, but our test suite takes 5-10 minutes so parallelization would be a win even with the unfortunate overhead of duplicate collection. Unfortunately, Python uses nearly 700MB of RAM per process just collecting the tests (and at test completion each process is ~1GB when we do n=4). Our CI system can't afford that kind of memory consumption.
Do you get the same memory overhead with --assert=plain?
Yes, assert rewriting has negligible memory impact (at least in our specific case).
thanks for the inputs, we will have to open 2-3 targeted follow-up issues
I'm trying to use pytest with sympy but getting a segfault. Since both pytest and sympy are pure Python, I guess this is a bug in CPython. I'm finding it difficult to debug through pytest to isolate the underlying bug though.
I'm getting this segfault with Python 3.6 on OSX. I had something similar on Linux, but I've now isolated it more precisely on OSX and haven't tested these exact steps on Linux since.
The environment is
Now clone sympy and run pytest on test_trinomials.py. This file is 1.4MB and very repetitive. You can see it here. It takes some time before you see the segfault.