h-vetinari closed this issue 2 years ago.
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
I think the most urgent issue at the moment is fixing pypy on windows. That run contains by far the most failures (which, unlike on PPC, aren't QEMU-related). CC @mattip
Regarding PPC, I'm thinking if switching to cos7 (hopefully getting a less buggy QEMU) would make sense...
> Regarding PPC, I'm thinking if switching to cos7 (hopefully getting a less buggy QEMU) would make sense...
I wasn't thinking straight, PPC is on cos7 already. Also, QEMU isn't coming through the OS, but through the docker image...
@isuruf Before I upset devs by raising an issue on the wrong repo - if I wanted to raise an issue about the PPC failures here, would https://github.com/multiarch/qemu-user-static be the right place to raise it?
Here are the more detailed errors for the two failures on unix+pypy:
=================================== FAILURES ===================================
________________________ TestCdist.test_cdist_refcount _________________________
[gw2] darwin -- Python 3.7.12 $PREFIX/bin/python
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/site-packages/scipy/spatial/tests/test_distance.py:673: in test_cdist_refcount
assert all(weak_ref() is None for weak_ref in weak_refs)
E assert False
E + where False = all(<generator object TestCdist.test_cdist_refcount.<locals>.<genexpr> at 0x00007fd5d9781260>)
kwargs = {}
metric = 'braycurtis'
self = <scipy.spatial.tests.test_distance.TestCdist object at 0x00007fd5f8541bb0>
sup = <numpy.testing._private.utils.suppress_warnings object at 0x00007fd5f8541c90>
weak_refs = [<weakref at 0x00007fd5d3e89360; dead>, <weakref at 0x00007fd5d3e89380; dead>, <weakref at 0x00007fd5d3e893a0; dead>]
_____________________ TestBeta.test_boost_eval_issue_14606 _____________________
[gw1] darwin -- Python 3.7.12 $PREFIX/bin/python
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/site-packages/scipy/stats/tests/test_distributions.py:2904: in test_boost_eval_issue_14606
stats.beta.ppf(q, a, b)
E Failed: DID NOT WARN. No warnings of type (<class 'RuntimeWarning'>,) was emitted. The list of emitted warnings is: [UserWarning('Error in function boost::math::tgamma<d>(d,d): Series evaluation exceeded %1% iterations, giving up now.')].
a = 100000000000.0
b = 10000000000000.0
q = 0.995
self = <scipy.stats.tests.test_distributions.TestBeta object at 0x00007f88f4c9f0c0>
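The `test_cdist_refcount` failure above is a typical PyPy symptom: PyPy has no reference counting, so weakrefs to unreachable objects only go dead after a garbage-collection pass actually runs. A minimal, self-contained sketch (the `Data` class is a made-up stand-in for the arrays the scipy test creates):

```python
import gc
import weakref

class Data:
    """Hypothetical stand-in for the objects the scipy test weakly references."""

obj = Data()
ref = weakref.ref(obj)
del obj

# On CPython the object is freed as soon as its refcount hits zero, so
# ref() would already be None here. On PyPy the object stays alive until
# the GC runs, which is why a bare
#   all(weak_ref() is None for weak_ref in weak_refs)
# can fail there. Forcing a collection makes the check portable:
gc.collect()
print(ref() is None)  # True
```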
Here's the output of the one test failure on aarch+cpython:
=================================== FAILURES ===================================
_________________________ TestF77Mismatch.test_lapack __________________________
[gw1] linux -- Python 3.9.7 $PREFIX/bin/python
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.9/site-packages/scipy/linalg/tests/test_build.py:50: in test_lapack
deps = f.grep_dependencies(flapack.__file__,
f = <scipy.linalg.tests.test_build.FindDependenciesLdd object at 0x56782eb100>
self = <scipy.linalg.tests.test_build.TestF77Mismatch object at 0x56782eb760>
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.9/site-packages/scipy/linalg/tests/test_build.py:32: in grep_dependencies
stdout = self.get_dependencies(file)
deps = ['libg2c', 'libgfortran']
file = '/home/conda/feedstock_root/build_artifacts/scipy_1636463706185/_test_env_placehold_placehold_placehold_placehold_plac...ehold_placehold_placehold_placehold_/lib/python3.9/site-packages/scipy/linalg/_flapack.cpython-39-aarch64-linux-gnu.so'
self = <scipy.linalg.tests.test_build.FindDependenciesLdd object at 0x56782eb100>
../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.9/site-packages/scipy/linalg/tests/test_build.py:27: in get_dependencies
raise RuntimeError("Failed to check dependencies for %s" % file)
E RuntimeError: Failed to check dependencies for $PREFIX/lib/python3.9/site-packages/scipy/linalg/_flapack.cpython-39-aarch64-linux-gnu.so
file = '/home/conda/feedstock_root/build_artifacts/scipy_1636463706185/_test_env_placehold_placehold_placehold_placehold_plac...ehold_placehold_placehold_placehold_/lib/python3.9/site-packages/scipy/linalg/_flapack.cpython-39-aarch64-linux-gnu.so'
p = <Popen: returncode: 1 args: ['ldd', '/home/conda/feedstock_root/build_artifa...>
self = <scipy.linalg.tests.test_build.FindDependenciesLdd object at 0x56782eb100>
stderr = b''
stdout = b'\tnot a dynamic executable\n'
PyPy on aarch got killed for some reason...
> Before I upset devs by raising an issue on the wrong repo - if I wanted to raise an issue about the PPC failures here, would https://github.com/multiarch/qemu-user-static be the right place to raise it?
No, that's where the static binaries are compiled. Btw, I'm not sure if it's a QEMU fault, an OpenBLAS fault under QEMU, or something else.
For win+pypy, the errors are pretty bad - i.e. the function calls themselves succeed but the results are numerical garbage; as an example:
E AssertionError:
E Arrays are not almost equal to 6 decimals
E
E Mismatched elements: 4 / 4 (100%)
E Max absolute difference: 1.
E Max relative difference: 1.
E x: array([2., 0., 2., 0.])
E y: array([1., 1., 1., 1.])
Notably, the large majority of the 369 errors happen in the stats module.
> Btw, I'm not sure if it's a QEMU fault, an OpenBLAS fault under QEMU, or something else.
Just to be sure I understand correctly - you said the errors don't happen on native hardware (which matches that the tests were passing on Travis). What else other than the emulation could be the cause then?
Or do you mean it's an interaction issue between QEMU & openblas (or similar)?
> Here's the output of the one test failure on aarch+cpython:
Thanks, looks like that needs skipping. This page may have a relevant explanation: "ldd reports incorrectly 'not a dynamic executable' when the executable's loader is not present". We're cross-compiling here and running the tests under QEMU, so invoking ldd in a subprocess isn't going to give the right result, I guess.
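To illustrate the failure mode, here is a hedged sketch (not scipy's actual code; `parse_ldd_output` is a made-up name) of what inspecting `ldd` output looks like: when the target architecture's dynamic loader is missing, as with a cross-compiled aarch64 `.so` examined under QEMU user emulation, `ldd` prints a fixed message instead of a dependency list.

```python
def parse_ldd_output(stdout: bytes):
    """Parse `ldd` stdout into a list of library names.

    Returns None when ldd could not resolve the binary -- exactly what
    happens for a foreign-arch shared object when the matching loader
    is not installed (stdout is b'\tnot a dynamic executable\n').
    """
    text = stdout.decode()
    if "not a dynamic executable" in text:
        return None
    # Each dependency line looks like "\tlibname.so => /path (0x...)".
    return [line.split()[0] for line in text.splitlines() if line.strip()]

# The exact stdout seen in the failing test:
print(parse_ldd_output(b"\tnot a dynamic executable\n"))  # None
print(parse_ldd_output(b"\tlibgfortran.so.5 => /lib/libgfortran.so.5 (0x1)\n"))
```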
> Or do you mean it's an interaction issue between QEMU & openblas (or similar)?
Probably. Can you try netlib instead of openblas?
So, I had already cut down the test suite on PPC to a pitiful three smallish modules (as most remaining errors had been coming from fft):
================== 3 failed, 155 passed, 4 skipped in 36.08s ===================
But even with netlib, the same failures (as with openblas) remain. And it's not something minor like bad accuracy; at least in the error below, the 1st & 2nd columns are getting switched:
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=1e-10
E
E Mismatched elements: 6 / 20 (30%)
E Max absolute difference: 7.
E Max relative difference: 1.5
E x: array([[ 2., 5., 138., 2.],
E [ 3., 4., 219., 2.],
E [ 0., 7., 255., 3.],...
E y: array([[ 5., 2., 138., 2.],
E [ 4., 3., 219., 2.],
E [ 7., 0., 255., 3.],...
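The mismatch pattern is easy to reproduce: swapping the first two columns changes exactly two entries per affected row, consistent with the 6-element mismatch count reported above. A small numpy sketch (using the three rows shown in the failure; the data is illustrative):

```python
import numpy as np

# Rows taken from the failure output above; swapping columns 0 and 1
# of x yields y exactly.
x = np.array([[2., 5., 138., 2.],
              [3., 4., 219., 2.],
              [0., 7., 255., 3.]])
y = x.copy()
y[:, [0, 1]] = y[:, [1, 0]]  # swap the first two columns

# Count elements outside the tolerances used by the failing test.
mismatched = int(np.sum(~np.isclose(x, y, rtol=1e-7, atol=1e-10)))
print(mismatched)  # 2 per row -> 6 for these three rows
```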
> Here's the output of the one test failure on aarch+cpython:
> Thanks, looks like that needs skipping.
Removed it completely instead, it is no longer needed: https://github.com/scipy/scipy/pull/15010
> Can you try netlib instead of openblas?
This didn't bring any change either, unfortunately - even on netlib, the errors remain. And switched columns (see above) is pretty substantial... Assuming there's no other component left (unless it's a shared code path between netlib & openblas), we should probably raise an issue with QEMU?
@mattip, would you have some cycles to dig into the pypy errors here? Is there a way I could help with that?
Unix:
FAILED spatial/tests/test_distance.py::TestCdist::test_cdist_refcount - asser...
FAILED stats/tests/test_distributions.py::TestBeta::test_boost_eval_issue_14606
= 2 failed, 32821 passed, 2093 skipped, 105 xfailed, 10 xpassed, 41 warnings in 1638.50s (0:27:18) =
Windows:
= 117 failed, 32228 passed, 2571 skipped, 104 xfailed, 11 xpassed, 142 warnings in 3485.47s (0:58:05) =
Sometimes the windows pipeline actually times out completely, being killed in a test where there might be some self-referential deadlock(?) in combination with a list comprehension:
File "C:\bld\scipy_1637807031363\_test_env\lib\site-packages\scipy\integrate\quadpack.py", line 463, in _quad
return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
File "C:\bld\scipy_1637807031363\_test_env\lib_pypy\_functools.py", line 80, in __call__
return self._func(*(self._args + fargs), **fkeywords)
File "C:\bld\scipy_1637807031363\_test_env\lib\site-packages\scipy\integrate\quadpack.py", line 874, in integrate
opt['points'] = [x for x in opt['points'] if low <= x <= high]
File "C:\bld\scipy_1637807031363\_test_env\lib\site-packages\scipy\integrate\quadpack.py", line 874, in <listcomp>
opt['points'] = [x for x in opt['points'] if low <= x <= high]
+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
integrate/tests/test_quadpack.py::TestNQuad::test_fixed_limits
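The line the timeout lands on is a plain filter of user-supplied break points against the current integration limits. A standalone version of that step (names and data here are illustrative, not scipy's) runs instantly on its own:

```python
# Standalone version of the filtering step at quadpack.py line 874,
# where the windows+pypy run gets stuck.
def filter_points(points, low, high):
    """Keep only break points inside the current integration interval."""
    return [x for x in points if low <= x <= high]

print(filter_points([0.0, 0.5, 1.0, 2.0], low=0.0, high=1.0))  # [0.0, 0.5, 1.0]
```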
> FAILED stats/tests/test_distributions.py::TestBeta::test_boost_eval_issue_14606
That test checks that a Boost error is converted into a warning. Apparently PyPy does not hit the code path that creates the Boost warning in the first place. I would suggest skipping this test too on PyPy, at least until SciPy expressly supports PyPy.
with pytest.warns(RuntimeWarning):
stats.beta.ppf(q, a, b)
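A sketch of the suggested skip, written with stdlib `unittest` here so it is self-contained (in scipy's pytest suite the equivalent would be `pytest.mark.skipif`); the test name is the real one, but the body is a placeholder:

```python
import platform
import unittest

# platform.python_implementation() returns "PyPy" on PyPy, "CPython" on CPython.
IS_PYPY = platform.python_implementation() == "PyPy"

class TestBeta(unittest.TestCase):
    @unittest.skipIf(IS_PYPY, "Boost warning code path is not hit on PyPy")
    def test_boost_eval_issue_14606(self):
        # Placeholder body; the real test asserts that
        # stats.beta.ppf(q, a, b) emits a RuntimeWarning.
        pass

# Run the case and show that it is skipped only on PyPy.
result = unittest.TestResult()
unittest.TestLoader().loadTestsFromTestCase(TestBeta).run(result)
print(result.testsRun, len(result.skipped))
```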
I can try to figure out what is going on with windows + pypy + scipy.
Weird. The TestNQuad::test_fixed_limits hangs on windows + pypy but takes about 1s on windows + cpython.
> Weird. The TestNQuad::test_fixed_limits hangs on windows + pypy but takes about 1s on windows + cpython.
Have a look at the CI on master, there you get a detailed traceback after it gets killed in the timeout.
I don't know if this occurrence is flaky; it's also possible that the addition of -n auto here prevents it from showing up in the log.
I wonder if the two failures are connected. One theory might be that, for the same reason boost does not raise an error, the optimization test never converges/fails.
Can we try disabling pythran on windows to see if that is better? Somehow this was all working for #180 with scipy 1.7.1
> Can we try disabling pythran on windows to see if that is better?
Thanks for the PR!
> Somehow this was all working for #180 with scipy 1.7.1
Well, without the sys.exit wrapper, we were potentially ignoring errors - checking the CI for #180 indeed shows that test_cdist_refcount was already failing then, but nothing else. Which seems to indicate that the other ~110 errors come from building with pythran support...
Here is a summary of the latest failures. All the pypy builds crashed when starting the tests. Maybe worth restarting them? I don't remember seeing that elsewhere.
| run | failures |
|---|---|
| linux_64 pypy | crashed when starting tests |
| linux_aarch64 cpython | 3 failures [0] |
| linux_aarch64 pypy | crashed when starting tests |
| linux_ppc64le cpython | 622 failures |
| linux_ppc64le pypy | crashed when starting tests |
| win_64 pypy | crashed when starting tests |
| osx_64 pypy | crashed when starting tests |
> Which seems to indicate that the other ~110 errors come from building with pythran support...
Welp, seems things have gotten worse when building without pythran now?!
= 247 failed, 32096 passed, 2571 skipped, 105 xfailed, 10 xpassed, 280 warnings in 3210.55s (0:53:30) =
I appealed for help on the scipy issue tracker; let's see if someone can come up with a theory as to what is going on.
The pypy stuff will need some more investigation once the migrator comes around, but for now I've rolled the changes from here into #201.
Follow up to #195 & #194.