python / cpython

The Python programming language
https://www.python.org
Other
63.2k stars 30.26k forks source link

[2.7] regrtest hangs on Python 2.7 (test_threading?) #74536

Closed vstinner closed 7 years ago

vstinner commented 7 years ago
BPO 30351
Nosy @vstinner
PRs
  • python/cpython#2317
  • python/cpython#3019
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields: ```python assignee = None closed_at = created_at = labels = ['tests'] title = '[2.7] regrtest hangs on Python 2.7 (test_threading?)' updated_at = user = 'https://github.com/vstinner' ``` bugs.python.org fields: ```python activity = actor = 'vstinner' assignee = 'none' closed = True closed_date = closer = 'vstinner' components = ['Tests'] creation = creator = 'vstinner' dependencies = [] files = [] hgrepos = [] issue_num = 30351 keywords = ['buildbot'] message_count = 21.0 messages = ['293543', '293690', '293930', '294054', '294055', '294951', '294953', '295884', '295916', '296083', '296334', '296335', '296490', '296586', '296587', '296592', '296859', '296860', '296969', '297037', '297581'] nosy_count = 1.0 nosy_names = ['vstinner'] pr_nums = ['2317', '3019'] priority = 'normal' resolution = 'fixed' stage = 'resolved' status = 'closed' superseder = None type = None url = 'https://bugs.python.org/issue30351' versions = ['Python 2.7'] ```

    vstinner commented 7 years ago

    The tests step was killed after a total of 55 min 54 sec, whereas it only took 38 min 40 sec on the previous build 150.

    regrtest was supposed to log "running: test_threading (xxx sec)" every 30 seconds, but it didn't!? I don't know if the main regrtest process was stuck, or the child process running test_threading was stuck? Or both?

    http://buildbot.python.org/all/builders/x86%20Windows%20XP%202.7/builds/151/steps/test/logs/stdio

    test_pty skipped -- No module named fcntl 0:35:45 [400/402] test_pep352 passed -- running: test_threading (448 sec) 0:35:47 [401/402] test_timeit passed -- running: test_threading (450 sec)

    command timed out: 1200 seconds without output running ['Tools\\buildbot\\test.bat', '-j2'], attempting to kill program finished with exit code 1 elapsedTime=3354.584000

    vstinner commented 7 years ago

    Similar issue on x86 Tiger 2.7?

    http://buildbot.python.org/all/builders/x86%20Tiger%202.7/builds/151/steps/test/logs/stdio

    0:24:55 [371/402] test_pep277 passed -- running: test_xpickle (613 sec) 0:24:56 [372/402] test_shlex passed -- running: test_xpickle (615 sec) 0:25:03 [373/402] test_random passed -- running: test_xpickle (621 sec)

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=2707.762029

    vstinner commented 7 years ago

    Another one on AMD64 Windows7 SP1 2.7:

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%202.7/builds/92/steps/test/logs/stdio

    ... 0:03:30 [377/403] test_code passed 0:03:30 [378/403] test_wait4 skipped -- running: test_mmap (30 sec) test_wait4 skipped -- module 'os' has no attribute 'fork' 0:03:30 [379/403] test_buffer passed -- running: test_mmap (30 sec) 0:03:43 [380/403] test_mmap passed (43 sec) -- running: test_httpservers (39 sec) 0:03:44 [381/403] test_decimal passed -- running: test_httpservers (39 sec) 0:03:44 [382/403] test_new passed -- running: test_httpservers (40 sec) 0:03:44 [383/403] test_md5 passed -- running: test_httpservers (40 sec) 0:03:44 [384/403] test_sys_setprofile passed -- running: test_httpservers (40 sec) 0:03:46 [385/403] test_heapq passed -- running: test_httpservers (42 sec) 0:03:46 [386/403] test_funcattrs passed -- running: test_httpservers (42 sec) 0:03:46 [387/403] test__osx_support passed -- running: test_httpservers (42 sec) 0:03:47 [388/403] test_threading passed -- running: test_httpservers (43 sec) [26108 refs] [27135 refs] 0:03:47 [389/403] test_pprint passed -- running: test_httpservers (43 sec) 0:03:49 [390/403] test_re passed -- running: test_httpservers (45 sec) 0:03:50 [391/403] test_richcmp passed -- running: test_httpservers (46 sec) 0:03:51 [392/403] test_linecache passed -- running: test_httpservers (46 sec) 0:03:51 [393/403] test_threadsignals skipped -- running: test_httpservers (47 sec) test_threadsignals skipped -- Can't test signal on win32 0:03:51 [394/403] test_binascii passed -- running: test_httpservers (47 sec) 0:03:52 [395/403] test_email_renamed passed -- running: test_httpservers (47 sec) 0:03:52 [396/403] test_copy_reg passed -- running: test_httpservers (48 sec) 0:03:52 [397/403] test_ttk_textonly passed -- running: test_httpservers (48 sec) 0:03:53 [398/403] test_bigmem passed -- running: test_httpservers (48 sec) 0:03:53 [399/403] test_threading_local passed -- running: test_httpservers (49 sec) 0:03:53 [400/403] test_wait3 skipped -- running: test_httpservers (49 sec) test_wait3 skipped -- os.fork not defined -- skipping test_wait3 0:03:57 [401/403] test_lib2to3 passed -- running: test_httpservers (52 sec) [34066 refs] 0:04:04 [402/403] test_httpservers passed (60 sec)

    command timed out: 1200 seconds without output, attempting to kill program finished with exit code 1 elapsedTime=1446.138000

    vstinner commented 7 years ago

    New fail on x86 Windows XP 2.7:

    http://buildbot.python.org/all/builders/x86%20Windows%20XP%202.7/builds/166/steps/test/logs/stdio

    ... 0:35:19 [401/403] test_email_codecs passed -- running: test_threading (70 sec) 0:35:20 [402/403] test_gettext passed -- running: test_threading (71 sec)

    command timed out: 1200 seconds without output running ['Tools\\buildbot\\test.bat', '-j2'], attempting to kill program finished with exit code 1 elapsedTime=3330.414000

    vstinner commented 7 years ago

    Hum, it's the second time that regrtest hangs on x86 Windows XP 2.7 while test_threading is running.

    vstinner commented 7 years ago

    Similar issue on AMD64 OpenIndiana 2.7:

    http://buildbot.python.org/all/builders/AMD64%20OpenIndiana%202.7/builds/156/steps/test/logs/stdio

    ... 0:10:05 [400/403] test_old_mailbox passed -- running: test_bsddb3 (590 sec) 0:10:07 [401/403] test_imaplib passed -- running: test_bsddb3 (591 sec) running: test_decimal (34 sec), test_bsddb3 (621 sec) 0:10:40 [402/403] test_decimal passed (37 sec) -- running: test_bsddb3 (625 sec)

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2 -j4', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1848.193078

    vstinner commented 7 years ago

    New failure on AMD64 Windows7 SP1 2.7:

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%202.7/builds/107/steps/test/logs/stdio

    ... 0:04:31 [402/403] test_ssl passed (32 sec) -- running: test_weakref (33 sec) Resource 'sha256.tbs-internet.com' is not available Resource 'ipv6.google.com' is not available

    command timed out: 1200 seconds without output, attempting to kill program finished with exit code 1 elapsedTime=1473.735000

    vstinner commented 7 years ago

    Similar issue on AMD64 Windows7 SP1 2.7:

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%202.7/builds/123/steps/test/logs/stdio

    (...) 0:04:18 [401/403] test_posixpath passed -- running: test_weakref (48 sec) 0:04:19 [402/403] test_bsddb passed -- running: test_weakref (48 sec)

    command timed out: 1200 seconds without output, attempting to kill program finished with exit code 1 elapsedTime=1461.020000

    vstinner commented 7 years ago

    Another failure.

    http://buildbot.python.org/all/builders/AMD64%20Windows10%202.7/builds/201/steps/test/logs/stdio

    ...

    0:19:34 [401/404] test_ftplib passed 0:19:44 [402/404] test_bufio passed 0:19:45 [403/404] test_SimpleHTTPServer passed

    command timed out: 1200 seconds without output running ['Tools\\buildbot\\test.bat', '-x64', '-j2'], attempting to kill program finished with exit code 1 elapsedTime=2388.552000

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Debug%202.7/builds/136/steps/test/logs/stdio

    ... 0:12:56 [402/404] test_linecache passed -- running: test_threading (413 sec) 0:12:56 [403/404] test_int_literal passed -- running: test_threading (413 sec)

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1978.093713

    vstinner commented 7 years ago

    Yet another bug. I see a pattern, a lot of tests ends with test_threading running in background, test_threading is the last test to run, and then regrtest stops logging anything and buildbot kills the test.

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.x%20Shared%202.7/builds/126/steps/test/logs/stdio

    ... 0:05:18 [399/404] test_scope passed -- running: test_threading (128 sec) 0:05:19 [400/404] test_longexp passed -- running: test_threading (129 sec) 0:05:19 [401/404] test_strptime passed -- running: test_threading (129 sec) 0:05:20 [402/404] test_logging passed -- running: test_threading (130 sec) 0:05:27 [403/404] test_select passed -- running: test_threading (137 sec)

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2 -j4', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1528.143168

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%202.7/builds/140/steps/test/logs/stdio

    ... 0:10:17 [400/404] test_binhex passed -- running: test_threading (474 sec) 0:10:17 [401/404] test_pdb passed -- running: test_threading (474 sec) 0:10:17 [402/404] test_parser passed -- running: test_threading (475 sec) 0:10:17 [403/404] test_gdb skipped -- running: test_threading (475 sec) test_gdb skipped -- gdb versions before 7.0 didn't support python embedding. Saw 6.1: GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "amd64-marcel-freebsd".

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1818.589810

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%202.7/builds/131/steps/test/logs/stdio

    ... 0:05:06 [401/404] test_strptime passed 0:05:07 [402/404] test_test_support passed 0:05:29 [403/404] test_largefile passed (39 sec) -- running: test_weakref (44 sec)

    command timed out: 1200 seconds without output, attempting to kill program finished with exit code 1 elapsedTime=1530.908000

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/x86%20Windows%20XP%20VS9.0%202.7/builds/203/steps/test/logs/stdio

    ...

    0:18:49 [401/404] test_zipimport passed -- running: test_threading (158 sec) 0:18:52 [402/404] test_zipimport_support passed -- running: test_threading (161 sec) d:\cygwin\home\db3l\buildarea\2.7.bolen-windowsvs9\build\lib\test\support\init.py:803: RuntimeWarning: tests may fail, unable to create temp dir: d:\cygwin\home\db3l\buildarea\2.7.bolen-windowsvs9\build\build\test_python_3408 with temp_dir(path=name, quiet=quiet) as temp_path: 0:19:03 [403/404] test_zlib passed -- running: test_threading (172 sec) d:\cygwin\home\db3l\buildarea\2.7.bolen-windowsvs9\build\lib\test\support\init.py:803: RuntimeWarning: tests may fail, unable to create temp dir: d:\cygwin\home\db3l\buildarea\2.7.bolen-windowsvs9\build\build\test_python_4072 with temp_dir(path=name, quiet=quiet) as temp_path:

    command timed out: 1200 seconds without output running ['PC\\VS9.0\\rt.bat', '-q', '-d', '-j2'], attempting to kill program finished with exit code 1 elapsedTime=2347.382000

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.x%20Shared%202.7/builds/129/steps/test/logs/stdio

    ... 0:06:35 [402/404] test_signal passed (46 sec) -- running: test_lib2to3 (63 sec), test_threading (395 sec) 0:07:01 [403/404] test_lib2to3 passed (89 sec) -- running: test_threading (422 sec) [33834 refs]

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2 -j4', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1622.640079

    vstinner commented 7 years ago

    I created bpo-30727: "[2.7] test_threading.ConditionTests.test_notify() hangs randomly on FreeBSD on Python 2.7".

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%20VS9.0%202.7/builds/137/steps/test/logs/stdio

    0:03:12 [401/404] test_zipfile passed 0:03:12 [402/404] test_zlib passed 0:03:15 [403/404] test_urllib2_localnet passed

    command timed out: 1200 seconds without output, attempting to kill program finished with exit code 1 elapsedTime=1396.358000

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/ARMv7%20Ubuntu%203.5/builds/324/steps/test/logs/stdio

    0:08:44 load avg: 7.83 [394/398] test_asyncio 0:09:12 load avg: 6.52 [395/398] test_pickle 0:09:40 load avg: 5.13 [396/398] test_tools 0:10:27 load avg: 3.48 [397/398] test_tokenize

    command timed out: 1200 seconds without output running ['make', 'buildbottest', 'TESTOPTS=-j2 -j8', 'TESTPYTHONOPTS=', 'TESTTIMEOUT=900'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1830.667627

    vstinner commented 7 years ago

    The bpo-30727 (test_threading hangs in ConditionTests.test_notify()) has been fixed by my commit da6d305b6fcd49ba1224b1fd2131d7648a5be6b9, followed by a better made by Serhiy: commit 7709b4d57b433ef027a2e7e63b4cab3fc9ad910d (Serhiy applied his fix to all branches).

    I hope that this bug (regrtest hangs on Python 2.7) will slowly become less likely!

    vstinner commented 7 years ago

    http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%202.7/builds/140/steps/test/logs/stdio

    0:06:09 [401/404] test_runpy passed 0:06:09 [402/404] test_htmlparser passed 0:06:10 [403/404] test_smtpnet passed

    command timed out: 1200 seconds without output, attempting to kill program finished with exit code 1 elapsedTime=1571.765000

    vstinner commented 7 years ago

    The main bug was fixed by the commit 7709b4d57b433ef027a2e7e63b4cab3fc9ad910d, so I now close this issue.

    I will open a new issue if I see new failures after the fix, like msg297037 (which only occurred once).