Open hroncok opened 1 year ago
Doctest: gil_in_var_initialization_tests.test_method_with_error_return ... Fatal Python error: Segmentation fault
Thread 0xf7e2f700 (most recent call first):
File "<doctest gil_in_var_initialization_tests.test_method_with_error_return[0]>", line 1 in <module>
File "/usr/lib/python3.12/doctest.py", line 1357 in __run
File "/usr/lib/python3.12/doctest.py", line 1504 in run
File "/usr/lib/python3.12/doctest.py", line 2222 in runTest
File "/usr/lib/python3.12/unittest/case.py", line 589 in _callTestMethod
File "/usr/lib/python3.12/unittest/case.py", line 634 in run
File "/usr/lib/python3.12/unittest/case.py", line 690 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1562 in run_test
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1568 in run_forked_test
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1563 in run_doctests
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1549 in run_tests
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1531 in run
File "/usr/lib/python3.12/unittest/case.py", line 690 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.12/unittest/runner.py", line 240 in run
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 2952 in runtests
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 2674 in runtests_callback
File "/usr/lib/python3.12/multiprocessing/pool.py", line 125 in worker
File "/usr/lib/python3.12/multiprocessing/process.py", line 108 in run
File "/usr/lib/python3.12/multiprocessing/process.py", line 314 in _bootstrap
File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 71 in _launch
File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 19 in __init__
File "/usr/lib/python3.12/multiprocessing/context.py", line 282 in _Popen
File "/usr/lib/python3.12/multiprocessing/process.py", line 121 in start
File "/usr/lib/python3.12/multiprocessing/pool.py", line 329 in _repopulate_pool_static
File "/usr/lib/python3.12/multiprocessing/pool.py", line 306 in _repopulate_pool
File "/usr/lib/python3.12/multiprocessing/pool.py", line 215 in __init__
File "/usr/lib/python3.12/multiprocessing/context.py", line 119 in Pool
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 2515 in main
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 3004 in <module>
Extension modules: cython.cimports.libc.math, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, refnanny, gil_in_var_initialization_tests (total: 16)
That's the relevant part of the build log. (The tests use multiprocessing.Pool and that doesn't provide a way of telling when one of the workers has crashed, so it appears to hang. There's no point in leaving it going...)
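For comparison, here is a minimal sketch of a pool that does notice a dead worker. It is not what runtests.py does; it uses the standard library's concurrent.futures.ProcessPoolExecutor, which raises BrokenProcessPool when a worker process dies abruptly. The helper name run_with_crash_detection is made up for illustration, and the sketch assumes a platform where the "fork" start method is available (e.g. Linux).

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def run_with_crash_detection(fn, *args):
    """Run fn(*args) in a worker process and report an abrupt worker death.

    Hypothetical helper for illustration only. fn must be picklable.
    """
    # Assumption: the "fork" start method exists on this platform.
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
        future = executor.submit(fn, *args)
        try:
            return ("ok", future.result(timeout=60))
        except BrokenProcessPool:
            # The worker exited without returning a result (segfault, os._exit, ...).
            return ("worker crashed", None)


if __name__ == "__main__":
    # os._exit(1) kills the worker without any cleanup, standing in for a segfault.
    print(run_with_crash_detection(os._exit, 1))
    print(run_with_crash_detection(abs, -3))
```

With multiprocessing.Pool the first call would wait indefinitely for a result that never arrives, which matches the "hang" seen on the builders.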
I'm trying to work out how I actually get a 32 bit build of Python to test.
There's no need for you to bisect it, I think. It's a new test in this release, so I know where it was introduced. The question is whether it's a bug in Cython, a bug in the test, or a bug in something else. I suspect it's most likely a bug in the test.
I was just about to release 3.0.4. I'll wait to see if this turns out to be something to fix in that release.
You could try a 32bit Python docker image to test it locally.
I managed to test this (with 32-bit Opensuse in virtualbox, just because Opensuse is what I use most of the time so I know how to install stuff).
I can't reproduce the issue. I tried with Python 3.11.5 and Python 3.12.0. Current master and 3.0.3 exactly. It runs fine. I'm using gcc 13.2.1.
I've only tried running the specific test via python3 runtests.py -vv gil_in_var. It's possible it crashes in the complete test suite but I don't have time to check that out this evening.
For the moment the easiest thing to do in Fedora is just to exclude the specific test. You can do that by adding the name to tests/bugs.txt.
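A packaging script could add the exclusion idempotently. The sketch below assumes the exclusion file holds one test name per line and treats lines starting with # as comments; the helper name exclude_test is hypothetical and not part of Cython.

```python
from pathlib import Path


def exclude_test(exclusion_file, test_name):
    """Append test_name to the exclusion file unless it is already listed.

    Assumed file format: one test name per line, '#' starts a comment.
    Returns True if the file was changed.
    """
    path = Path(exclusion_file)
    text = path.read_text() if path.exists() else ""
    listed = {
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    if test_name in listed:
        return False
    # Make sure the new entry starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + test_name + "\n")
    return True
```

Calling exclude_test with the path of the exclusion file named above and "gil_in_var_initialization_tests" would add the entry once; a second call leaves the file unchanged.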
Running just that one test makes it crash:
+ /usr/bin/python3 runtests.py -vv gil_in_var
Python 3.12.0 (main, Oct 5 2023, 00:00:00) [GCC 13.2.1 20230918 (Red Hat 13.2.1-3)]
Running tests against Cython 3.0.3
Using Cython language level 2.
Backends: c,cpp
runTest (__main__.CythonRunTestCase.runTest)
[-1] compiling (cpp/cy2) and running gil_in_var_initialization_tests ...
#### 2023-10-17 09:25:02.817490
#### 2023-10-17 09:25:12.822023
test_method_with_error_return (gil_in_var_initialization_tests)
Doctest: gil_in_var_initialization_tests.test_method_with_error_return ... Fatal Python error: Segmentation fault
Thread 0xf68aab40 (most recent call first):
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 2623 in time_stamper
File "/usr/lib/python3.12/threading.py", line 989 in run
File "/usr/lib/python3.12/threading.py", line 1052 in _bootstrap_inner
File "/usr/lib/python3.12/threading.py", line 1009 in _bootstrap
Thread 0xf7f66700 (most recent call first):
File "<doctest gil_in_var_initialization_tests.test_method_with_error_return[0]>", line 1 in <module>
File "/usr/lib/python3.12/doctest.py", line 1357 in __run
File "/usr/lib/python3.12/doctest.py", line 1504 in run
File "/usr/lib/python3.12/doctest.py", line 2222 in runTest
File "/usr/lib/python3.12/unittest/case.py", line 589 in _callTestMethod
File "/usr/lib/python3.12/unittest/case.py", line 634 in run
File "/usr/lib/python3.12/unittest/case.py", line 690 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1562 in run_test
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1568 in run_forked_test
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1563 in run_doctests
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1549 in run_tests
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 1531 in run
File "/usr/lib/python3.12/unittest/case.py", line 690 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
File "/usr/lib/python3.12/unittest/runner.py", line 240 in run
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 2952 in runtests
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 2550 in main
File "/builddir/build/BUILD/cython-3.0.3/runtests.py", line 3004 in <module>
Extension modules: cython.cimports.libc.math, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, gil_in_var_initialization_tests, refnanny (total: 16)
/var/tmp/rpm-tmp.E9IWo1: line 47: 4153 Segmentation fault (core dumped) /usr/bin/python3 runtests.py -vv gil_in_var
Full log: build.log.txt
If you are confident this is a problem in the test itself, I'll exclude it. Perhaps this is specific to our build flags?
My current view is that it's more likely an issue with either the test or the compilation environment than a bug in Cython, so it probably shouldn't hold anything up. But I'm definitely not certain of that.
I could try having a look on "Fedora Linux 40 i686" (I suspect it'd be very easy to tell what's wrong in a C++ debugger) but I can't find any evidence of any recent 32 bit version of Fedora to download.
Fedora does not build i686 kernels anymore. We build for i686 only for the infamous "multilib" case. To get it, you can use e.g. mock (omit the initial podman ... to su - mockbuilder if already on a Fedora system, e.g. in VirtualBox).
$ podman run --rm --privileged -ti fedora:rawhide bash
[root@201a48537c14 /]# dnf install -y mock
...
[root@201a48537c14 /]# useradd mockbuilder
[root@201a48537c14 /]# usermod -a -G mock mockbuilder
[root@201a48537c14 /]# su - mockbuilder
[mockbuilder@201a48537c14 ~]$ mock -r fedora-rawhide-i386 --no-bootstrap-image --no-bootstrap-chroot --init
...
[mockbuilder@201a48537c14 ~]$ mock -r fedora-rawhide-i386 --no-bootstrap-image --no-bootstrap-chroot --install git-core python3-devel python3-setuptools gcc-c++ gdb
...
[mockbuilder@201a48537c14 ~]$ mock -r fedora-rawhide-i386 --no-bootstrap-image --no-bootstrap-chroot --shell --enable-network --unpriv
...
<mock-chroot> sh-5.2$ cd
<mock-chroot> sh-5.2$ git clone https://github.com/cython/cython.git
<mock-chroot> sh-5.2$ cd cython/
<mock-chroot> sh-5.2$ rpm --eval '%set_build_flags'
CFLAGS="${CFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection }" ; export CFLAGS ;
CXXFLAGS="${CXXFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection }" ; export CXXFLAGS ;
FFLAGS="${FFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -I/usr/lib/gfortran/modules }" ; export FFLAGS ;
FCFLAGS="${FCFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -I/usr/lib/gfortran/modules }" ; export FCFLAGS ;
VALAFLAGS="${VALAFLAGS:--g}" ; export VALAFLAGS ;
RUSTFLAGS="${RUSTFLAGS:--Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Clink-arg=-Wl,-z,relro -Clink-arg=-Wl,-z,now --cap-lints=warn}" ; export RUSTFLAGS ;
LDFLAGS="${LDFLAGS:--Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 }" ; export LDFLAGS ;
LT_SYS_LIBRARY_PATH="${LT_SYS_LIBRARY_PATH:-/usr/lib:}" ; export LT_SYS_LIBRARY_PATH ;
CC="${CC:-gcc}" ; export CC ;
CXX="${CXX:-g++}" ; export CXX
<mock-chroot> sh-5.2$ eval $(rpm --eval '%set_build_flags')
<mock-chroot> sh-5.2$ python3 setup.py build
...
building 'Cython.Compiler.Parsing' extension
gcc -fno-strict-overflow -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -fcf-protection -fexceptions -fcf-protection -fexceptions -fcf-protection -fexceptions -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIC -I/usr/include/python3.12 -c /builddir/cython/Cython/Compiler/Parsing.c -o build/temp.linux-i686-cpython-312/builddir/cython/Cython/Compiler/Parsing.o
...
Notes:
- The --no-bootstrap-image --no-bootstrap-chroot options are not necessary, but will be faster.
- The i386 in fedora-rawhide-i386 is confusing, but the packages installed are i686.
So far:
- [...] eval $(rpm --eval '%set_build_flags') [...] it's fine.
- [...] time.sleep(). I think this is probably unrelated to the test.
- [...] wait_for_waiting.
- [...] std::async to spawn a new thread, but it's pretty hard to tell because it seems to be right at the start of creating a thread before it hits anything recognisable.

The offending flags that break it are -O2 -msse2; without those it runs fine (and with either of those on their own it runs fine).
I'm pretty convinced there isn't a blocking Cython bug here. It's possible there's a subtle issue with the test but I'm not sure that I'm going to get to the bottom of it right now. But I'll leave this open in case anyone else can.
(The other thing to add - those 2 flags are fine on my 64 bit Linux. They also seem fine on a 32bit Opensuse virtualbox)
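A search like the one that narrowed the crash down to a pair of flags can be automated. The sketch below is not the procedure used in this thread; it simply tries flag combinations from smallest to largest and returns the smallest ones that fail. The function name minimal_failing_subsets and the fails callback are hypothetical; in practice the callback would rebuild the test module with the given flags and run it.

```python
from itertools import combinations


def minimal_failing_subsets(flags, fails):
    """Return the smallest combinations of flags for which fails(combo) is true.

    fails is a caller-supplied predicate taking a tuple of flags. The search
    is exhaustive, so the cost grows exponentially with len(flags).
    """
    for size in range(len(flags) + 1):
        hits = [combo for combo in combinations(flags, size) if fails(combo)]
        if hits:
            return hits
    return []


if __name__ == "__main__":
    # Stand-in predicate mimicking the behaviour reported above:
    # the crash needs both -O2 and -msse2 together.
    def crashes(combo):
        return {"-O2", "-msse2"} <= set(combo)

    candidates = ["-O2", "-g", "-msse2", "-mfpmath=sse"]
    print(minimal_failing_subsets(candidates, crashes))
```

Because every combination means a rebuild, it pays to first drop flags that cannot plausibly affect code generation (warnings, debug info) before running the search.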
Thanks for investigating this. I'll downgrade the priority to "any time in 3.0.x" and will see if I can just disable the test for 3.0.4, so that it doesn't crash any more.
Describe the bug
When trying to upgrade the Fedora package from 3.0.2 to 3.0.3, I see a consistent hang in the tests on i686 (which is our only 32-bit architecture).
I plan to bisect the problem later this week, but I decided to open this issue first in case somebody else figures it out first.
Code to reproduce the behaviour:
I've attached a complete build log. I haven't yet tried to reproduce this outside of the RPM environment.
Expected behaviour
The tests should pass in reasonable time. A complete build of 3.0.3 on x86_64 or 3.0.2 on i686 takes 1.5 hours on our builders. A build of 3.0.3 on i686 hangs for days.
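A build script can guard against a run that never finishes by putting a time limit on the test command and checking how it ended. The following is a minimal sketch using only subprocess from the standard library; the helper name classify_run is hypothetical, and the negative return code convention for signals applies to POSIX systems only.

```python
import signal
import subprocess
import sys


def classify_run(cmd, timeout_s):
    """Run cmd with a time limit and describe how it ended (POSIX only)."""
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return "timed out"
    if proc.returncode < 0:
        # On POSIX a negative return code means the child died from that signal.
        return f"killed by {signal.Signals(-proc.returncode).name}"
    if proc.returncode == 0:
        return "ok"
    return f"failed with exit code {proc.returncode}"


if __name__ == "__main__":
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    print(classify_run([sys.executable, "-c", crash], timeout_s=30))
```

Wrapping a command such as python3 runtests.py -vv gil_in_var this way would report a segfault or a timeout within the chosen limit instead of occupying a builder for days.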
OS
Fedora Linux 40 i686
Python version
3.12.0
Cython version
3.0.3
Additional context
No response