Running *any* test of the test suite currently produces a bus error on Debian sparc [http://people.debian.org/~aurel32/qemu/sparc/].
After the bus error, the tests seem to proceed normally though.
This is definitely new. I tested memoryview for bus errors a couple of months ago without problems.
Georg, I'm provisionally setting this to release blocker. The qemu-sparc image is quite old though (Debian Etch). It's a pity we don't have a sparc buildbot any more.
Example:
user@debian-sparc:~/cpython$ ./python -m test -uall -v test_flufl
== CPython 3.3.0b1 (default:67d36e8ddcfc+, Aug 7 2012, 23:49:57) [GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)]
Fatal Python error: Bus error

Current thread 0x00004000:
  File "/home/user/cpython/Lib/subprocess.py", line 1363 in _execute_child
  File "/home/user/cpython/Lib/subprocess.py", line 818 in __init__
  File "/home/user/cpython/Lib/os.py", line 995 in popen
  File "/home/user/cpython/Lib/platform.py", line 903 in _syscmd_uname
  File "/home/user/cpython/Lib/platform.py", line 1147 in uname
  File "/home/user/cpython/Lib/platform.py", line 1452 in platform
  File "/home/user/cpython/Lib/test/regrtest.py", line 537 in main
  File "/home/user/cpython/Lib/test/__main__.py", line 13 in <module>
  File "/home/user/cpython/Lib/runpy.py", line 73 in _run_code
  File "/home/user/cpython/Lib/runpy.py", line 160 in _run_module_as_main
== Linux-2.6.18-6-sparc32-sparc-with-debian-4.0 big-endian
== /home/user/cpython/build/test_python_3262
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1)
[1/1] test_flufl
test_barry_as_bdfl (test.test_flufl.FLUFLTests) ... ok
test_guido_as_bdfl (test.test_flufl.FLUFLTests) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.053s

OK
1 test OK.
Setting to critical: debian-sparc 32-bit is apparently deprecated since Lenny and still uses linuxthreads.
Tracking down the failure could end up in finding a platform bug like in bpo-12936.
From the position of the bus error, it would seem that calling a subprocess during platform.platform() is the culprit.
But if test_subprocess passes without any bus errors, that would be strange.
Is it by any chance a --shared build being run from the build directory without having been installed (and without a LD_LIBRARY_PATH and with an older version already installed)?
Running on Solaris 10 (T1000, OpenCSW toolchain, gcc 4.6.3) I also get a bus error, with added coredump:
$ ./python Lib/test/regrtest.py
== CPython 3.3.0b1 (default:67a994d5657d, Aug 8 2012, 21:43:48) [GCC 4.6.3]
== Solaris-2.10-sun4v-sparc-32bit big-endian
== /export/home/flub/python/cpython/build/test_python_7320
Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1)
[ 1/369] test_grammar
[ 2/369] test_opcodes
[ 3/369] test_dict
[ 4/369] test_builtin
[ 5/369] test_exceptions
test test_exceptions failed -- Traceback (most recent call last):
File "/export/home/flub/python/cpython/Lib/test/test_exceptions.py", line 432, in testChainingDescriptors
self.assertTrue(e.__suppress_context__)
AssertionError: False is not true
[ 6/369/1] test_types
[ 7/369/1] test_unittest
[ 8/369/1] test_doctest
[ 9/369/1] test_doctest2
[ 10/369/1] test_support
[ 11/369/1] test___all__
[ 12/369/1] test___future__
[ 13/369/1] test__locale
[ 14/369/1] test__osx_support
[ 15/369/1] test_abc
[ 16/369/1] test_abstract_numbers
[ 17/369/1] test_aifc
[ 18/369/1] test_argparse
[ 19/369/1] test_array
[ 20/369/1] test_ast
[ 21/369/1] test_asynchat
[ 22/369/1] test_asyncore
[ 23/369/1] test_atexit
[ 24/369/1] test_audioop
[ 25/369/1] test_augassign
[ 26/369/1] test_base64
[ 27/369/1] test_bigaddrspace
[ 28/369/1] test_bigmem
[ 29/369/1] test_binascii
[ 30/369/1] test_binhex
[ 31/369/1] test_binop
[ 32/369/1] test_bisect
[ 33/369/1] test_bool
[ 34/369/1] test_buffer
[ 35/369/1] test_bufio
[ 36/369/1] test_bytes
[ 37/369/1] test_bz2
[ 38/369/1] test_calendar
[ 39/369/1] test_call
[ 40/369/1] test_capi
Fatal Python error: Bus error

Current thread 0x00000001:
  File "/export/home/flub/python/cpython/Lib/test/test_capi.py", line 264 in test_skipitem
  File "/export/home/flub/python/cpython/Lib/unittest/case.py", line 385 in _executeTestPart
  File "/export/home/flub/python/cpython/Lib/unittest/case.py", line 440 in run
  File "/export/home/flub/python/cpython/Lib/unittest/case.py", line 492 in __call__
  File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 105 in run
  File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 67 in __call__
  File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 105 in run
  File "/export/home/flub/python/cpython/Lib/unittest/suite.py", line 67 in __call__
  File "/export/home/flub/python/cpython/Lib/test/support.py", line 1312 in run
  File "/export/home/flub/python/cpython/Lib/test/support.py", line 1413 in _run_suite
  File "/export/home/flub/python/cpython/Lib/test/support.py", line 1447 in run_unittest
  File "/export/home/flub/python/cpython/Lib/test/test_capi.py", line 290 in test_main
  File "Lib/test/regrtest.py", line 1219 in runtest_inner
  File "Lib/test/regrtest.py", line 941 in runtest
  File "Lib/test/regrtest.py", line 714 in main
  File "Lib/test/regrtest.py", line 1810 in <module>
Bus Error (core dumped)
Not sure if this should be tracked in the same issue or not?
I think I've identified one legit Python bug. This is from a *different* traceback, i.e. the traceback in my first message is still unresolved.
A bus error occurs in test_capi, test_skipitem with format 'D':
    Py_complex *p = va_arg(*p_va, Py_complex *);
    Py_complex cval;
    cval = PyComplex_AsCComplex(arg);
    if (PyErr_Occurred())
        RETURN_ERR_OCCURRED;
    else
        *p = cval;    <- bus error
    break;
The pointer p has value 0xefbfb1fc, with 0xefbfb1fc % 8 == 4. It originates from a somewhat creatively allocated memory region in _testcapi:parse_tuple_and_keywords. :)
This platform is 8-byte aligned?
Never mind, I get it: doubles are 8 bytes and should be 8-byte aligned. Let me stare at it some more.
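The alignment arithmetic discussed above can be checked with a short sketch. The helper names `is_aligned` and `align_up` are illustrative only and are not taken from the CPython patch; the address is the pointer value reported in the analysis, and 8 is the alignment assumed for a C double on this platform.

```python
def is_aligned(address: int, alignment: int) -> bool:
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of alignment.

    Assumes alignment is a power of two.
    """
    return (address + alignment - 1) & ~(alignment - 1)


# Pointer value reported for the faulting store; 8 is the assumed
# alignment requirement for a double (and hence for Py_complex).
p = 0xEFBFB1FC
print(hex(p), "offset from an 8-byte boundary:", p % 8)
print("aligned for double:", is_aligned(p, 8))
print("next 8-byte aligned address:", hex(align_up(p, 8)))
```

A buffer carved out at arbitrary byte offsets can produce such a pointer; rounding each slot up to the required alignment is one generic way to avoid it, though the actual patch may take a different approach.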
Floris, the traceback in my first message only occurs in the optimized regular build with -O3. Did you try that, too?
Attached is a patch attempting to force double alignment. Stefan: please apply and try it. Does this help?
I compiled with a simple "./configure" which I think is what you mean (it defaults to -O3). But when executing your test it doesn't give a bus error.
Larry Hastings <report@bugs.python.org> wrote:
> Attached is a patch attempting to force double alignment. Stefan: please apply and try it. Does this help?
Yes, this works nicely.
New changeset efb30bdcfa1e by Larry Hastings in branch 'default':
Issue bpo-15589: Ensure double-alignment for brute-force capi argument parser test
http://hg.python.org/cpython/rev/efb30bdcfa1e
I think I can confirm this fixes the BusError. The test suite got past test_capi on my machine as well. Unfortunately I killed the ssh session by accident before the testsuite completed so I had to restart it.
As for the original error: in test_subprocess basically every test fails. With the standard regrtest.py (faulthandler enabled), most tests generate a bus error in subprocess_fork_exec():
621         cwd_obj2 = NULL;
(gdb)
624         pid = fork();    <- bus error
(gdb)
Fatal Python error: Bus error

Current thread 0x00004000:
  File "/home/user/cpython/Lib/subprocess.py", line 1363 in _execute_child
  File "/home/user/cpython/Lib/subprocess.py", line 818 in __init__
  File "/home/user/cpython/Lib/test/test_subprocess.py", line 728 in test_bufsize_is_none
With all faulthandler references removed from regrtest.py no bus errors happen, but most tests fail anyway. As I said, I'm NOT blaming faulthandler, but suspect some strange platform bug that perhaps involves linuxthreads.
Since Floris can't reproduce this error, I'm setting the priority to normal.
I can now confirm the whole testsuite runs, so the BusError part seems fixed on my host:
329 tests OK.
7 tests failed:
    test_cmd_line test_exceptions test_ipaddress test_os test_raise test_socket test_traceback
1 test altered the execution environment:
    test_site
32 tests skipped:
    test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses test_dbm_gnu test_epoll test_gdb test_kqueue test_lzma test_msilib test_ossaudiodev test_pep277 test_readline test_smtpnet test_socketserver test_sqlite test_ssl test_startfile test_tcl test_timeout test_tk test_ttk_guionly test_ttk_textonly test_unicode_file test_urllib2net test_urllibnet test_winreg test_winsound test_xmlrpc_net test_zipfile64
8 skips unexpected on sunos5:
    test_lzma test_readline test_smtpnet test_ssl test_tcl test_tk test_ttk_guionly test_ttk_textonly
> 329 tests OK. 7 tests failed: test_cmd_line test_exceptions test_ipaddress test_os test_raise test_socket test_traceback
Thanks. A lot of these appear to be big-endian related, see bpo-15597.
> With all faulthandler references removed from regrtest.py no bus errors happen, but most tests fail anyway. As I said, I'm NOT blaming faulthandler, but suspect some strange platform bug that perhaps involves linuxthreads.
Threads + signal is a very complex problem. It is not solved yet in OpenBSD for example. There were a lot of such issues on old versions of FreeBSD. Extract of the Wikipedia article of LinuxThreads:
"LinuxThreads had a number of problems, mainly owing to the implementation, which used the clone system call to create a new process sharing the parent's address space. For example, threads had distinct process identifiers, causing problems for signal handling; (...)"
If disabling faulthandler avoids new issues, you can add 'if sys.thread_info.version.startswith("linuxthreads"):' on the line:
faulthandler.enable(all_threads=True)
in regrtest.py.
I added sys.thread_info to be able to skip some tests only failing on LinuxThreads...
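A minimal sketch of such a guard follows. The helper names `uses_linuxthreads` and `enable_faulthandler_if_safe` are hypothetical and not part of regrtest.py; the sketch assumes the intent is to skip faulthandler on LinuxThreads (so the check is negated), and it allows for `sys.thread_info.version` being `None`, which the documentation permits.

```python
import faulthandler
import sys
from types import SimpleNamespace


def uses_linuxthreads(thread_info=None) -> bool:
    """Return True if the thread library reports itself as LinuxThreads."""
    if thread_info is None:
        thread_info = getattr(sys, "thread_info", None)
    version = getattr(thread_info, "version", None)
    # version may be None when the library version is unknown.
    return bool(version) and version.startswith("linuxthreads")


def enable_faulthandler_if_safe(thread_info=None) -> bool:
    """Enable faulthandler unless running on LinuxThreads (hypothetical helper)."""
    if uses_linuxthreads(thread_info):
        return False
    faulthandler.enable(all_threads=True)
    return True


# Stand-in objects for illustration; the version strings are examples only.
print(uses_linuxthreads(SimpleNamespace(version="linuxthreads-0.10")))
print(uses_linuxthreads(SimpleNamespace(version="NPTL 2.17")))
print(uses_linuxthreads(SimpleNamespace(version=None)))
```

Keeping the check in a small predicate makes it reusable for skipping individual tests as well as for the faulthandler call.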
--
> but most tests fail anyway
Ah? With which message? Can you get more information in gdb?
> If disabling faulthandler avoids new issues, you can add 'if [not] sys.thread_info.version.startswith("linuxthreads")'
That suppresses some bus errors. However, they still occur without being raised (some print statements and a WIFSIGNALED test inserted in posix_waitpid):
>>> import subprocess, os
>>> p = subprocess.Popen(["/bin/true"])
>>> os.waitpid(p.pid, os.WNOHANG)
pid: 4461 options: 1
signo: 10
(4461, 10)
>>>
So a bus error occurs in waitpid(pid, &status, options). WAIT_TYPE is int; perhaps that's incorrect for the platform, but I can't get hold of the POSIX man pages for debian-etch-sparc.
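For reference, a wait status like the one returned above can be decoded from Python without patching posix_waitpid. This is only a sketch: `describe_wait_status` is a hypothetical helper, the `os.W*` functions are available on Unix only, and which signal a given number maps to is platform-dependent (the thread reads signal 10 on this system as a bus error).

```python
import os


def describe_wait_status(status: int) -> str:
    """Summarize a raw status value as returned by os.waitpid()."""
    if os.WIFSIGNALED(status):
        # The child was terminated by a signal.
        return f"killed by signal {os.WTERMSIG(status)}"
    if os.WIFEXITED(status):
        # The child exited normally via exit() or return from main.
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognized status {status}"


# 10 is the raw status from the session above; 0 is a clean exit.
print(describe_wait_status(10))
print(describe_wait_status(0))
```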
I'd like to urge everybody to focus on one issue at a time. This issue is about Python crashing on a SparcLinux qemu image, so I think it should have priority "low" - there is absolutely no requirement that this needs to work.
As for the test failures on Solaris - please report them as separate issues (one per failure, "normal" priority seems right).
Closing since the remaining issue is almost certainly a platform bug.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['type-crash']
title = 'Bus error on Debian sparc'
updated_at =
user = 'https://github.com/skrah'
```
bugs.python.org fields:
```python
activity =
actor = 'skrah'
assignee = 'none'
closed = True
closed_date =
closer = 'skrah'
components = []
creation =
creator = 'skrah'
dependencies = []
files = ['26727']
hgrepos = []
issue_num = 15589
keywords = []
message_count = 21.0
messages = ['167678', '167679', '167701', '167706', '167713', '167714', '167715', '167716', '167717', '167718', '167723', '167724', '167725', '167728', '167733', '167735', '167736', '167737', '167777', '167805', '168030']
nosy_count = 8.0
nosy_names = ['loewis', 'georg.brandl', 'vstinner', 'larry', 'flub', 'ned.deily', 'skrah', 'python-dev']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue15589'
versions = ['Python 3.3']
```