python / cpython

The Python programming language
https://www.python.org

Multiple test failures in GCC and Clang optional builds on Travis CI #80595

Closed tirkarthi closed 2 years ago

tirkarthi commented 5 years ago
BPO 36414
Nosy @vstinner, @pablogsal, @tirkarthi

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['3.8', 'type-bug', 'tests']
title = 'Multiple test failures in GCC and Clang optional builds on Travis CI'
updated_at =
user = 'https://github.com/tirkarthi'
```

bugs.python.org fields:
```python
activity =
actor = 'xdegaye'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Tests']
creation =
creator = 'xtreak'
dependencies = []
files = []
hgrepos = []
issue_num = 36414
keywords = []
message_count = 10.0
messages = ['338725', '338737', '338876', '338877', '338879', '338881', '339641', '339789', '342352', '343712']
nosy_count = 3.0
nosy_names = ['vstinner', 'pablogsal', 'xtreak']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue36414'
versions = ['Python 3.8']
```

tirkarthi commented 5 years ago

I am not able to reproduce the errors with a GCC-built CPython binary when running the tests in a virtualenv (no coverage). It seems the dangling thread errors take up the whole 50-minute time limit. Since the GCC build is not maintained or tracked, is it worth disabling it on Travis, given that it wastes a lot of build minutes? The optional Clang build on Mac never gets as far as running the tests either.

Reference build failures :

https://travis-ci.org/python/cpython/jobs/510447289 https://travis-ci.org/python/cpython/jobs/510447290

tirkarthi commented 5 years ago

Possibly first occurrence of this error : https://travis-ci.org/python/cpython/jobs/506783665 after which it's more or less consistent. Almost all the builds I checked before this build did not have this failure. The commit for the build seems to be unrelated but just in case : https://github.com/python/cpython/commit/86082c22d23285995a32aabb491527c9f5629556

vstinner commented 5 years ago

https://travis-ci.org/python/cpython/jobs/510447289

This failure is on the master branch.

./python.exe  ./Tools/scripts/run_tests.py -j 1 -u all -W --slowest --fail-env-changed --timeout=1200 -j4 -uall,-cpu
ERROR:root:code for hash md5 was not found.
Traceback (most recent call last):
  File "/Users/travis/build/python/cpython/Lib/hashlib.py", line 244, in <module>
    globals()[__func_name] = __get_hash(__func_name)
  File "/Users/travis/build/python/cpython/Lib/hashlib.py", line 113, in __get_builtin_constructor
    raise ValueError('unsupported hash type ' + name)
ValueError: unsupported hash type md5
(...)
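
A quick way to check whether a given interpreter build has the same problem is to probe hashlib directly. This is only a minimal sketch and not part of the report: the helper name `missing_hashes` and the default list of algorithm names are made up for illustration. It relies only on `hashlib.new()` raising `ValueError` for an unsupported name, as in the traceback above.

```python
import hashlib


def missing_hashes(names=("md5", "sha1", "sha256")):
    """Return the names for which hashlib.new() raises ValueError.

    Hypothetical helper for illustration; the default names are an assumption.
    """
    missing = []
    for name in names:
        try:
            hashlib.new(name)
        except ValueError:
            # Same error type as in the traceback: 'unsupported hash type ...'
            missing.append(name)
    return missing


# An empty list means every probed constructor is usable in this build.
print(missing_hashes())
```

On a build affected by the failure above, `md5` would be expected to show up in the returned list.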

Travis CI config has been changed to use a more recent Ubuntu version, it can explain the failure.

commit 74ae50e53e59bbe39d6287b902757f0cd01327dc
Author: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Date: Mon Mar 18 05:44:58 2019 -0500

bpo-36307: Travis: upgrade to Xenial environment (GH-12356)

vstinner commented 5 years ago

https://travis-ci.org/python/cpython/jobs/510447290

That's a run on the master branch ("CRON").

xvfb-run ./venv/bin/python -m coverage run --pylib -m test --fail-env-changed -uall,-cpu -x test_multiprocessing_fork -x test_multiprocessing_forkserver -x test_multiprocessing_spawn -x test_concurrent_futures
== CPython 3.8.0a2+ (heads/master:a7987e7, Mar 23 2019, 23:53:10) [GCC 5.4.0 20160609]
== Linux-4.15.0-1028-gcp-x86_64-with-glibc2.17 little-endian
== cwd: /home/travis/build/python/cpython/build/test_python_26699
== CPU count: 2
== encodings: locale=UTF-8, FS=utf-8
Run tests sequentially
0:00:00 load avg: 1.49 [  1/416] test_grammar
0:00:03 load avg: 1.45 [  2/416] test_opcodes
0:00:03 load avg: 1.45 [  3/416] test_dict
0:00:12 load avg: 1.41 [  4/416] test_builtin
0:00:18 load avg: 1.35 [  5/416] test_exceptions
Exception ignored in: <function ExceptionTests.test_unraisable.<locals>.BrokenDel.__del__ at 0x7f9e77198dc0>
Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_exceptions.py", line 1182, in __del__
    raise exc
ValueError: del is broken
Exception ignored in: <function ExceptionTests.test_unraisable.<locals>.BrokenExceptionDel.__del__ at 0x7f9e771988c0>
Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_exceptions.py", line 1188, in __del__
    raise exc
test.test_exceptions.BrokenStrException: <exception str() failed>
test test_exceptions failed -- multiple errors occurred; run in verbose mode for details
0:00:22 load avg: 1.35 [  6/416/1] test_types -- test_exceptions failed
test test_types failed -- Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_types.py", line 1433, in test_duck_gen
    self.assertIsInstance(gen, collections.abc.Generator)
AssertionError: <MagicMock spec='GenLike' id='140318583278608'> is not an instance of <class 'collections.abc.Generator'>

0:00:25 load avg: 1.32 [  7/416/2] test_unittest -- test_types failed
test test_unittest failed -- multiple errors occurred; run in verbose mode for details
0:01:33 load avg: 1.11 [  8/416/3] test_doctest -- test_unittest failed in 1 min 7 sec
0:01:50 load avg: 1.08 [  9/416/3] test_doctest2
0:01:50 load avg: 1.08 [ 10/416/3] test_support
0:02:11 load avg: 1.05 [ 11/416/3] test___all__
0:02:31 load avg: 1.04 [ 12/416/3] test___future__
0:02:32 load avg: 1.04 [ 13/416/3] test__locale
0:02:32 load avg: 1.04 [ 14/416/3] test__opcode
0:02:34 load avg: 1.03 [ 15/416/3] test__osx_support
0:02:34 load avg: 1.03 [ 16/416/3] test__xxsubinterpreters
Warning -- threading._dangling was modified by test__xxsubinterpreters
  Before: <_weakrefset.WeakSet object at 0x7f9e7751a160>
  After:  <_weakrefset.WeakSet object at 0x7f9e752abb20>
0:02:50 load avg: 1.03 [ 17/416/4] test_abc -- test__xxsubinterpreters failed (env changed)
0:02:51 load avg: 1.03 [ 18/416/4] test_abstract_numbers
0:02:52 load avg: 1.03 [ 19/416/4] test_aifc
0:02:55 load avg: 1.02 [ 20/416/4] test_argparse
0:05:14 load avg: 1.02 [ 21/416/4] test_array -- test_argparse passed in 2 min 19 sec
0:05:38 load avg: 1.01 [ 22/416/4] test_asdl_parser
0:05:39 load avg: 1.01 [ 23/416/4] test_ast
0:05:51 load avg: 1.01 [ 24/416/4] test_asyncgen
0:05:54 load avg: 0.93 [ 25/416/4] test_asynchat
Warning -- threading_cleanup() failed to cleanup 0 threads (count: 0, dangling: 2)
Dangling thread: <echo_server(Thread-11, stopped 140318458308352)>
(...)
Dangling thread: <echo_server(Thread-43, stopped 140318458308352)>
Dangling thread: <_MainThread(MainThread, started 140318701401856)>
Dangling thread: <echo_server(Thread-19, stopped 140318458308352)>
Warning -- threading._dangling was modified by test_asynchat
  Before: <_weakrefset.WeakSet object at 0x7f9e70f061c0>
  After:  <_weakrefset.WeakSet object at 0x7f9e6fff7af0>
0:15:10 load avg: 0.96 [ 26/416/5] test_asyncio -- test_asynchat failed (env changed) in 9 min 15 sec
Warning -- threading_cleanup() failed to cleanup 0 threads (count: 0, dangling: 2)
Dangling thread: <_MainThread(MainThread, started 140318701401856)>
Dangling thread: <Thread(ThreadPoolExecutor-0_0, stopped daemon 140318458308352)>
Warning -- threading_cleanup() failed to cleanup 0 threads (count: 0, dangling: 6)
Dangling thread: <Thread(Thread-46, stopped 140318458308352)>
Dangling thread: <Thread(Thread-47, stopped 140318458308352)>
Dangling thread: <Thread(Thread-48, stopped 140318458308352)>
(...)
Dangling thread: <Thread(ThreadPoolExecutor-8_6, stopped daemon 140318402094848)>
Dangling thread: <Thread(QueueManagerThread, stopped daemon 140318427272960)>
Dangling thread: <Thread(ThreadPoolExecutor-8_7, stopped daemon 140317923735296)>
Dangling thread: <Thread(ThreadPoolExecutor-16_0, stopped daemon 140318427272960)>
Dangling thread: <_MainThread(MainThread, started 140318701401856)>

The job exceeded the maximum time limit for jobs, and has been terminated.
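
For readers unfamiliar with the "Dangling thread" warnings in the log above, the general idea of such a check can be sketched with public APIs only. This is a simplified illustration, not the code regrtest uses (the log shows it looking at the private `threading._dangling` set); the names `threads_leaked_by`, `well_behaved` and `leaky` are made up for the example.

```python
import threading


def threads_leaked_by(func):
    """Run func() and return the threads it started that are still alive."""
    before = set(threading.enumerate())
    func()
    return [
        t for t in threading.enumerate()
        if t not in before and t.is_alive()
    ]


def well_behaved():
    # Starts a thread and waits for it, so nothing is left behind.
    t = threading.Thread(target=lambda: None)
    t.start()
    t.join()


stop = threading.Event()


def leaky():
    # Starts a thread and returns without joining it.
    threading.Thread(target=stop.wait, daemon=True).start()


print(threads_leaked_by(well_behaved))  # no leaked threads expected
leaked = threads_leaked_by(leaky)
print(leaked)                           # the thread still blocked in stop.wait()

# Clean up so the example itself does not leave a dangling thread.
stop.set()
for t in leaked:
    t.join()
```

A test that behaves like `leaky()` is what gets reported as "env changed" when `--fail-env-changed` is passed, as in the command line above.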

vstinner commented 5 years ago

> Possibly first occurrence of this error : https://travis-ci.org/python/cpython/jobs/506783665 after which it's more or less consistent.

That's the first build including my change:

commit 86082c22d23285995a32aabb491527c9f5629556
Author: Victor Stinner <vstinner@redhat.com>
Date: Fri Mar 15 14:57:52 2019 +0100

bpo-36235: Fix CFLAGS in distutils customize_compiler() (GH-12236)

Fix CFLAGS in customize_compiler() of distutils.sysconfig: when the
CFLAGS environment variable is defined, don't override CFLAGS variable with
the OPT variable anymore.

Initial patch written by David Malcolm.

Co-Authored-By: David Malcolm <dmalcolm@redhat.com>

The build starts with:

Setting environment variables from .travis.yml
$ export OPENSSL=1.1.0i
$ export OPENSSL_DIR="$HOME/multissl/openssl/${OPENSSL}"
$ export PATH="${OPENSSL_DIR}/bin:$PATH"
$ export CFLAGS="-I${OPENSSL_DIR}/include -O3"
$ export LDFLAGS="-L${OPENSSL_DIR}/lib"
$ export LD_RUN_PATH="${OPENSSL_DIR}/lib"
$ export OPTIONAL=true

Extract of .travis.yml:

env: global:

Maybe it's a bad idea to set CFLAGS globally, and they should only be set when building Python itself, not when building C extensions?

To be honest, I don't really understand the relationship between CFLAGS and the new "Dangling thread: ..." errors. Maybe it's just unrelated.

Another question is why Travis CI is just fine on PRs, but fails on "CRON" jobs?
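
The idea of setting CFLAGS only for one step rather than globally can be sketched as follows. This is a hedged illustration, not a proposed change to `.travis.yml`: the helper name `run_with_cflags` is made up, and the probe command stands in for whatever build command would actually be run.

```python
import os
import subprocess
import sys


def run_with_cflags(cmd, cflags):
    """Run cmd with CFLAGS set for that child process only.

    Hypothetical helper: the parent environment is copied, not modified,
    so later steps (such as building C extensions) do not inherit CFLAGS.
    """
    env = dict(os.environ)
    env["CFLAGS"] = cflags
    return subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=True
    )


# Stand-in for a build command: a child that reports the CFLAGS it sees.
probe = [sys.executable, "-c", "import os; print(os.environ.get('CFLAGS'))"]
result = run_with_cflags(probe, "-O3")
print(result.stdout.strip())
```

The child sees the value passed in, while `os.environ` in the calling process is left as it was.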

tirkarthi commented 5 years ago

https://bugs.python.org/issue36414#msg338876

> Travis CI config has been changed to use a more recent Ubuntu version, it can explain the failure.

I am confused: that commit switches the Linux build to Xenial, but the failure is on Mac OS X, and it occurred even before the Xenial change was committed (March 18, 2019).

commit 74ae50e53e59bbe39d6287b902757f0cd01327dc
Author: CAM Gerlach <CAM.Gerlach@Gerlach.CAM>
Date: Mon Mar 18 05:44:58 2019 -0500

Sample failure before the change : https://travis-ci.org/python/cpython/jobs/506168147 (March 14, 2019)

tirkarthi commented 5 years ago

https://github.com/python/cpython/pull/12708 seems to fix a similar issue (bpo-36544) for Ubuntu, and it also helps make the Mac OS build green again.

Successful build : https://travis-ci.org/python/cpython/jobs/516821454

490c593f-f636-409f-bb35-6abeb38a4595 commented 5 years ago

FWIW PR 12708 has been merged.

tirkarthi commented 5 years ago

The builds are running again now that bpo-36684 changed the build process by splitting out the coverage run. There are now three test failures, in test_gc, test_descr and test_typing (bpo-36905), that are unrelated to the original report:

https://travis-ci.org/python/cpython/jobs/531845094#L1873

test test_gc failed -- Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_gc.py", line 817, in test_get_objects_arguments
    self.assertEqual(len(gc.get_objects()),
AssertionError: 103063 != 103064

https://travis-ci.org/python/cpython/jobs/531845094#L1816

test test_descr failed -- Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_descr.py", line 1272, in test_slots
    self.assertEqual(orig_objects, new_objects)
AssertionError: 94174 != 94180
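
Both assertions above compare exact counts of GC-tracked objects, which makes them sensitive to anything that allocates while the test runs. One plausible source in a coverage run is the tracer itself, but that is an assumption and not something confirmed in this thread. The sketch below only shows how easily the number returned by `gc.get_objects()` moves; the helper name `tracked_growth` is made up for illustration.

```python
import gc


def tracked_growth(make):
    """Return (growth, obj): the change in the number of GC-tracked
    objects caused by calling make(). Hypothetical helper."""
    gc.collect()
    was_enabled = gc.isenabled()
    gc.disable()  # avoid an automatic collection between the two snapshots
    try:
        before = len(gc.get_objects())
        obj = make()  # keep the result alive until after the second snapshot
        after = len(gc.get_objects())
    finally:
        if was_enabled:
            gc.enable()
    return after - before, obj


# A list of 100 empty lists adds roughly one tracked object per list.
growth, _keep = tracked_growth(lambda: [[] for _ in range(100)])
print(growth)
```

A difference of 1 or 6 objects, as in the tracebacks, is small compared with how much a single extra container can shift the total.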

This happens in the C coverage test suite:

https://travis-ci.org/python/cpython/jobs/531845095#L2486

======================================================================
ERROR: test_build_ext (distutils.tests.test_build_ext.BuildExtTestCase)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/distutils/tests/test_build_ext.py", line 91, in test_build_ext
    import xx
ImportError: /tmp/tmpamh6bkg7/xx.cpython-38-x86_64-linux-gnu.so: undefined symbol: __gcov_merge_add
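
`__gcov_merge_add` is provided by GCC's gcov runtime, so an undefined symbol like this usually means an object was compiled with coverage instrumentation but linked without that runtime. Whether this is what happens in the job above is an assumption; the thread does not confirm it. A minimal sketch of such a consistency check on flag strings, with a made-up helper name `missing_gcov_runtime` and flag sets chosen for the example:

```python
import shlex

# Flags that make GCC emit gcov instrumentation when compiling.
INSTRUMENTING_FLAGS = {"--coverage", "-fprofile-arcs"}
# Flags that pull in the gcov runtime when linking.
RUNTIME_LINK_FLAGS = {"--coverage", "-fprofile-arcs", "-lgcov"}


def missing_gcov_runtime(cflags, ldflags):
    """Return True if cflags enable instrumentation but ldflags
    contain nothing that links the gcov runtime."""
    compile_flags = set(shlex.split(cflags))
    link_flags = set(shlex.split(ldflags))
    instrumented = bool(compile_flags & INSTRUMENTING_FLAGS)
    linked = bool(link_flags & RUNTIME_LINK_FLAGS)
    return instrumented and not linked


# Instrumented compile, plain link: the situation that would leave
# symbols such as __gcov_merge_add undefined at import time.
print(missing_gcov_runtime("-O0 --coverage", "-L/usr/lib"))
# Instrumented compile and link: consistent.
print(missing_gcov_runtime("-O0 --coverage", "--coverage"))
```

The flag strings passed in here are examples, not the values used by the Travis job.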

I am not sure whether to keep this open for the three test failures above or to open separate issues. I opened one for test_typing.

vstinner commented 5 years ago

I looked at a recent PR. It's getting better.

"Test code coverage (C)" fails with:

======================================================================
ERROR: test_build_ext (distutils.tests.test_build_ext.BuildExtTestCase)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/distutils/tests/test_build_ext.py", line 91, in test_build_ext
    import xx
ImportError: /tmp/tmpyufwrt3r/xx.cpython-38-x86_64-linux-gnu.so: undefined symbol: __gcov_merge_add

Maybe this test has been failing for a long time. I don't know.

"Test code coverage (Python)":

4 tests failed: test_asyncio test_descr test_gc test_typing

test test_descr failed -- Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_descr.py", line 1272, in test_slots
    self.assertEqual(orig_objects, new_objects)
AssertionError: 95538 != 95544

Warning -- sys.gettrace was modified by test_audit
  Before: <coverage.CTracer object at 0x7faa2f7bfe70>
  After:  None
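
The warning above means a trace function was installed before the test and gone after it. One general way for code that touches `sys.settrace()` to avoid this is to save and restore the previous tracer. This is a minimal sketch of that pattern, not the fix applied to test_audit; the name `preserve_trace` is made up for the example.

```python
import sys
from contextlib import contextmanager


@contextmanager
def preserve_trace():
    """Restore whatever sys.gettrace() returned on entry, even if the
    body replaces or clears the trace function. Hypothetical helper."""
    saved = sys.gettrace()
    try:
        yield
    finally:
        sys.settrace(saved)


def noop_tracer(frame, event, arg):
    # Stand-in for a real tracer such as the one coverage installs.
    return None


sys.settrace(noop_tracer)
with preserve_trace():
    sys.settrace(None)  # code under test clears the tracer
restored = sys.gettrace() is noop_tracer
sys.settrace(None)      # leave the interpreter untraced after the demo
print(restored)
```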

kumaraditya303 commented 2 years ago

Closing as Travis is not used anymore.