Closed AlexanderWells-diamond closed 1 year ago
I suspect this crash is due to an ABI issue. It appears to me that the pythonSoftIOC CI build is being done in quay.io/pypa/manylinux2014_x86_64, and is building p4p 4.1.7 from source. The corresponding wheels for epicscorelibs and pvxslibs were built with manylinux1.
epicscorelibs-7.0.7.99.0.1-cp37-cp37m-manylinux1_x86_64.whl
pvxslibs-1.2.1-cp37-cp37m-manylinux1_x86_64.whl
p4p-4.1.7.tar.gz
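A quick way to spot this kind of mix in a pip log is to pull the platform tag out of each filename. The following is only a minimal sketch: the helper name platform_tag is made up for illustration, and it assumes the standard wheel naming scheme in which the platform tag is the last dash-separated field.

```python
def platform_tag(filename: str) -> str:
    """Return the platform tag of a wheel filename, or "sdist" for anything else."""
    if not filename.endswith(".whl"):
        return "sdist"
    # Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl
    return filename[: -len(".whl")].split("-")[-1]


# The three files from this report.
files = [
    "epicscorelibs-7.0.7.99.0.1-cp37-cp37m-manylinux1_x86_64.whl",
    "pvxslibs-1.2.1-cp37-cp37m-manylinux1_x86_64.whl",
    "p4p-4.1.7.tar.gz",
]
for name in files:
    print(f"{platform_tag(name):24} {name}")
```

Anything reported as sdist will be compiled in whatever image the CI job happens to run in, which is where the mismatch with the prebuilt manylinux1 libraries can creep in.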
Is it possible to persuade cibuildwheel to call pip with --only-binary epicscorelibs,pvxslibs,p4p?
This would make future occurrences more obvious.
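Outside of cibuildwheel, the same restriction can be tried by hand. This is a rough sketch rather than a recommended setup: the helper pip_install_cmd is hypothetical, and it only assembles the command line so it can be inspected before running it.

```python
import sys


def pip_install_cmd(requirements, only_binary):
    """Build a pip command line that refuses source builds for `only_binary`."""
    return [
        sys.executable, "-m", "pip", "install",
        # pip takes a comma-separated list of package names here
        "--only-binary", ",".join(only_binary),
        *requirements,
    ]


cmd = pip_install_cmd(["p4p==4.1.7"], ["epicscorelibs", "pvxslibs", "p4p"])
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

With this option, a missing wheel should make pip fail with an error instead of quietly falling back to the source tarball.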
As for why the 3.7 linux/x86_64 wheel wasn't being used? Well, it wasn't there.
GHA was on the blink the day I did the release, with jobs being "cancelled" randomly. I thought I had manually re-run these jobs to the point that all had passed, but it looks like I stopped with two left uncompleted: 3.5 and 3.7 linux/x86_64. Oops...
I've run them now and verified that p4p-4.1.7-cp37-cp37m-manylinux1_x86_64.whl has been uploaded.
Some details which turned out to be irrelevant...
Current thread 0x00007f43788d9740 (most recent call first): File "/tmp/tmp.8MU0smYiyK/venv/lib/python3.7/site-packages/p4p/nt/common.py", line 18 in
A test run using https://github.com/mdavidsaver/ci-core-dumper/pull/1 gives a bit more information. Which is a fair argument for that PR being good enough.
Thread 1 (Thread 0x7fe59b9f8740 (LWP 445)):
#0 0x00007fe59b6114fb in raise () from /lib64/libpthread.so.0
#1 <signal handler called>
#2 0x00007fe58d7f2c04 in pvxs::TypeDef::_append(pvxs::Member&, pvxs::Member const&) () from /tmp/tmp.4ZFQcoKe98/venv/lib/python3.7/site-packages/p4p/../pvxslibs/lib/libpvxs.so.1.2
#3 0x00007fe58dae848c in p4p::appendPrototype(pvxs::TypeDef&, _object*) () from /tmp/tmp.4ZFQcoKe98/venv/lib/python3.7/site-packages/p4p/_p4p.cpython-37m-x86_64-linux-gnu.so
#4 0x00007fe58dace635 in __pyx_pw_3p4p_4_p4p_5_Type_1__init__(_object*, _object*, _object*) () from /tmp/tmp.4ZFQcoKe98/venv/lib/python3.7/site-packages/p4p/_p4p.cpython-37m-x86_64-linux-gnu.so
...
Thank you Michael, our CI now passes with the prebuilt wheel.
There unfortunately seems to be no simple way to pass commands through cibuildwheel to pip wheel itself - the obvious one is to set CIBW_ENVIRONMENT: PIP_ONLY_BINARY=epicscorelibs,pvxslibs,p4p as per the pip docs, but for reasons unknown it isn't picked up. So we'll unfortunately just have to watch out for this sort of issue in the future.
fyi. I have put up a set of release candidate builds:
One of the changes made in setuptools-dso and epicscorelibs should partially address the manylinux vs. host ABI issue seen here.
Now, for GCC builds, the value of -D_GLIBCXX_USE_CXX11_ABI=... is latched when building epicscorelibs, and propagated via epicscorelibs.config:get_config_var('CPPFLAGS'). With this change, if a build pulls in a pypi.org wheel for epicscorelibs, then any local build of p4p and maybe pvxslibs will keep the same value of _GLIBCXX_USE_CXX11_ABI used by the manylinux builds (currently 0 for all images), instead of the local default (1 for gcc >= 5.1).
Of course the "Dual ABI" difference may not be the only source of issues, though I think it is the most common.
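To check which ABI value a given set of flags would select, something like the following may help. It is a sketch under assumptions: the function name cxx11_abi is invented here, and it takes the flags as one plain string. Whether get_config_var('CPPFLAGS') returns a string or a list is not confirmed in this thread, so adapt the input accordingly.

```python
import re
import shlex


def cxx11_abi(cppflags: str):
    """Return the last -D_GLIBCXX_USE_CXX11_ABI value in a flags string, or None."""
    value = None
    for flag in shlex.split(cppflags):
        m = re.fullmatch(r"-D_GLIBCXX_USE_CXX11_ABI=(\d+)", flag)
        if m:
            value = int(m.group(1))  # a later flag overrides an earlier one
    return value


# Example input; the include path is a placeholder, not a real location.
print(cxx11_abi("-I/some/include -D_GLIBCXX_USE_CXX11_ABI=0"))
```

A result of None means the flags do not pin the ABI, so the compiler default applies. The sketch does not handle the "-D NAME=value" form with a space after -D.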
It seems the newest release of p4p, 4.1.7, is causing a segmentation fault in tests for pythonSoftIOC, as seen here.
Unfortunately I am unable to recreate this issue locally - I cannot install p4p==4.1.7 due to an apparent conflict with Numpy that I cannot explain as I'm using the same version (1.21.6) as our CI uses.
The issue occurs at this line of our test:
And here's the call stack that the segfault printed:
Please let me know if there's any more information we can provide.