python / cpython

The Python programming language
https://www.python.org/

test_run_fileexflags() crash on PPC64LE RHEL7 LTO + PGO 3.x #118422

Closed vstinner closed 3 weeks ago

vstinner commented 3 weeks ago

Bug report

build: https://buildbot.python.org/all/#/builders/43/builds/5305/steps/5/logs/stdio

Logs:

./python -E  -m test --slow-ci --timeout=1200 -j2 --junit-xml test-results.xml -j10 
+ ./python -u -W default -bb -E -m test --slow-ci --timeout=1200 -j2 --junit-xml test-results.xml -j10 --dont-add-python-opts
== CPython 3.13.0a6+ (heads/main:8b56d82, Apr 29 2024, 18:49:30) [GCC 8.3.1 20190311 (Red Hat 8.3.1-3)]
== Linux-3.10.0-1160.114.2.el7.ppc64le-ppc64le-with-glibc2.17 little-endian
== Python build: release LTO+PGO
== cwd: /home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.lto-pgo/build/build/test_python_worker_24730æ
== CPU count: 8
== encodings: locale=UTF-8 FS=utf-8
== resources: all

(...)

test_atomic_load_store_uint64 (test.test_capi.test_pyatomic.PyAtomicTests.test_atomic_load_store_uint64) ... ok
test_atomic_load_store_uint8 (test.test_capi.test_pyatomic.PyAtomicTests.test_atomic_load_store_uint8) ... ok
test_atomic_load_store_uintptr (test.test_capi.test_pyatomic.PyAtomicTests.test_atomic_load_store_uintptr) ... ok
test_atomic_release_acquire (test.test_capi.test_pyatomic.PyAtomicTests.test_atomic_release_acquire) ... ok
test_run_fileexflags (test.test_capi.test_run.CAPITest.test_run_fileexflags) ...

Fatal Python error: Segmentation fault
Current thread 0x00003fff876747f0 (most recent call first):
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.lto-pgo/build/Lib/test/test_capi/test_run.py", line 72 in run
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-ppc64le.lto-pgo/build/Lib/test/test_capi/test_run.py", line 78 in test_run_fileexflags
  (...)

Extension modules: _testinternalcapi, _testcapi, _testlimitedcapi, _testmultiphase, _testsinglephase (total: 5)

vstinner commented 3 weeks ago

The bug can be reproduced with just these two tests:

$ cat bisect8
test.test_capi.test_misc.SubinterpreterTest.test_configured_settings
test.test_capi.test_run.CAPITest.test_run_fileexflags
$ ./python -m test test_capi -v --matchfile=bisect8

Sometimes, I get the glibc error message:

Fatal error: glibc detected an invalid stdio handle

vstinner commented 3 weeks ago

The crash is more likely if PYC files are removed:

rm -rf ./Lib/__pycache__ ./Lib/test/test_capi/__pycache__

vstinner commented 3 weeks ago

I cannot reproduce the issue with gcc -O0 and gcc -Og, only with a release build: gcc -O3.

This buildbot uses an old GCC version: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44).

serhiy-storchaka commented 3 weeks ago

If it is only reproducible with a release build, would the following patch help?

diff --git a/Modules/_testcapi/run.c b/Modules/_testcapi/run.c
index 4fd98b82d76..b35a12e9424 100644
--- a/Modules/_testcapi/run.c
+++ b/Modules/_testcapi/run.c
@@ -74,6 +74,7 @@ run_fileexflags(PyObject *mod, PyObject *pos_args)

     result = PyRun_FileExFlags(fp, filename, start, globals, locals, closeit, pflags);

+#ifndef NDEBUG
 #if defined(__linux__) || defined(MS_WINDOWS) || defined(__APPLE__)
     /* The behavior of fileno() after fclose() is undefined, but it is
      * the only practical way to check whether the file was closed.
@@ -85,6 +86,7 @@ run_fileexflags(PyObject *mod, PyObject *pos_args)
         return NULL;
     }
 #endif
+#endif // NDEBUG
     if (!closeit && fileno(fp) < 0) {
         PyErr_SetString(PyExc_AssertionError, "Bad file descriptor after excution");
         Py_XDECREF(result);

Or maybe just remove this dangerous, implementation-dependent code?

vstinner commented 3 weeks ago

Or maybe just remove this dangerous, implementation-dependent code?

I wrote PR gh-118429 to use fstat() instead of relying on the undefined behavior of calling fileno() on a closed file.
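
For illustration only, here is a minimal sketch of the fstat() idea. It is written in Python rather than the C of Modules/_testcapi/run.c, and it is not the code from the PR. The descriptor number is saved while the file is still open, and after the close fstat() is called on that saved number: an EBADF error suggests the descriptor is gone. The helper name fd_is_closed is hypothetical, and the check assumes nothing else opens a file in between, since the kernel may reuse the descriptor number.

```python
import errno
import os
import tempfile


def fd_is_closed(fd: int) -> bool:
    """Return True if fstat() reports that fd is not an open descriptor.

    Hypothetical helper for illustration; it is not part of CPython.
    """
    try:
        os.fstat(fd)
    except OSError as exc:
        if exc.errno == errno.EBADF:
            return True
        raise  # any other error is unexpected, so let it propagate
    return False


# Save the descriptor number while the file object is still valid.
f = tempfile.TemporaryFile()
fd = f.fileno()
open_before = fd_is_closed(fd)

# After closing, query the saved number instead of touching the file object.
f.close()
closed_after = fd_is_closed(fd)
```

The point of the design is that the closed stream itself is never passed to any function again; only the plain integer saved beforehand is inspected.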