Closed stuartarchibald closed 4 years ago
This should be fixed by
Is this problem still present in 0.15.0?
Looks like we may need to disable SSE4.2 (which also disables SSE4.1 I think) in the conda-forge builds
@kszucs @pitrou @xhochy
Thanks for looking at this. I've got 0.15.0 locally; there no longer seems to be an issue in site-packages/pyarrow/*.so, but there are still problems in lib:
$ conda list|grep arrow
# packages in environment at <redacted>/_tmp_pyarrow_bad:
arrow-cpp 0.15.0 py37h090bef1_1 conda-forge
pyarrow 0.15.0 py37h8b68381_1 conda-forge
(_tmp_pyarrow_bad)
$ for x in $(find $(dirname `which python`)/../lib/*arrow*.so); do echo $x; objdump -D $x|grep pinsrq|head -1; done
<redacted>/_tmp_pyarrow_bad/bin/../lib/libarrow_dataset.so
15c49: 66 48 0f 3a 22 c0 01 pinsrq $0x1,%rax,%xmm0
<redacted>/_tmp_pyarrow_bad/bin/../lib/libarrow_flight.so
66703: 66 48 0f 3a 22 c2 01 pinsrq $0x1,%rdx,%xmm0
<redacted>/_tmp_pyarrow_bad/bin/../lib/libarrow_python.so
42114: 66 48 0f 3a 22 05 e1 pinsrq $0x1,0xe6ce1(%rip),%xmm0 # 128e00 <_ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIN5arrow6StatusEEES3_EZNS1_11_Task_stateISt5_BindIFZNS8_2py21DataFrameBlockCreator18WriteTableToBlocksEvEUliE_iEESaIiEFS9_vEE6_M_runEvEUlvE_S9_EEE9_M_invokeERKSt9_Any_data@@Base+0xdd6a0>
<redacted>/_tmp_pyarrow_bad/bin/../lib/libarrow.so
1ef465: 66 48 0f 3a 22 c0 01 pinsrq $0x1,%rax,%xmm0
(_tmp_pyarrow_bad)
$ gdb -ex='r' -ex 'display/i $pc' --args python -c 'import pyarrow'
<snip>
Reading symbols from <redacted>/_tmp_pyarrow_bad/bin/python3.7...done.
Starting program: <redacted>/_tmp_pyarrow_bad/bin/python -c import\ pyarrow
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffee879700 (LWP 5155)]
[New Thread 0x7fffee078700 (LWP 5156)]
[New Thread 0x7fffeb877700 (LWP 5157)]
Program received signal SIGILL, Illegal instruction.
0x00007fffe56c7cf6 in arrow::internal::CreateGlobalRegistry() ()
from <redacted>/_tmp_pyarrow_bad/lib/python3.7/site-packages/pyarrow/../../../libarrow.so.15
1: x/i $pc
=> 0x7fffe56c7cf6 <_ZN5arrow8internalL20CreateGlobalRegistryEv+166>: pinsrq $0x1,%rbx,%xmm0
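The shell loop above can also be sketched in Python, which makes it easier to look for several mnemonics at once. This is a minimal sketch and not part of the original report: the names `SSE41_MNEMONICS`, `find_instructions` and `scan_library` are hypothetical, the mnemonic set is only an illustrative subset of SSE4.1 instructions, and the parser assumes GNU objdump's "address: bytes mnemonic operands" line layout as seen in the output above.

```python
import re
import subprocess

# Illustrative subset of mnemonics introduced with SSE4.1 (assumption: not a
# complete list, and VEX-encoded "v..." forms are deliberately not covered).
SSE41_MNEMONICS = {
    "pinsrb", "pinsrd", "pinsrq", "pextrb", "pextrd", "pextrq",
    "pmulld", "ptest", "roundps", "roundsd", "blendvps", "pblendvb",
}

_HEX_BYTE = re.compile(r"[0-9a-f]{2}")


def find_instructions(disassembly, mnemonics=SSE41_MNEMONICS):
    """Return (address, mnemonic) pairs for matching lines of objdump -D text."""
    hits = []
    for line in disassembly.splitlines():
        tokens = line.split()
        # Instruction lines start with "<hex address>:"; skip everything else.
        if len(tokens) < 3 or not tokens[0].endswith(":"):
            continue
        rest = tokens[1:]
        # Skip the raw opcode bytes (two hex digits each) to reach the mnemonic.
        i = 0
        while i < len(rest) and _HEX_BYTE.fullmatch(rest[i]):
            i += 1
        if i == 0 or i == len(rest):
            continue  # header line, or a byte-only continuation line
        if rest[i] in mnemonics:
            hits.append((tokens[0].rstrip(":"), rest[i]))
    return hits


def scan_library(path, mnemonics=SSE41_MNEMONICS):
    """Disassemble one shared library with objdump and scan the result."""
    result = subprocess.run(
        ["objdump", "-D", path], capture_output=True, text=True, check=True
    )
    return find_instructions(result.stdout, mnemonics)
```

Calling `scan_library` on each `lib/*arrow*.so` file should report the same kind of hits as the `grep pinsrq` loop, provided objdump is on the PATH.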
Yes, we shouldn't build here with SSE4.2. Made a PR: https://github.com/conda-forge/arrow-cpp-feedstock/pull/106
Thanks @xhochy
Have we run any benchmarks before doing this? I hope this doesn't disable the HW popcount optimization.
> Have we run any benchmarks before doing this? I hope this doesn't disable the HW popcount optimization.

Assuming you mean popcnt? Seems like Nehalem or later is needed, i.e. SSE4+. I would guess the above would disable it?
That depends how we do it exactly in our source code. I don't recall right now, need to check (I can do so Monday).
Actually, I think I applied the wrong logic there: popcnt is not part of SSE, it just appeared around the time of SSE4. Given nocona should be the target instruction set, it won't have SSE4+ or popcnt.
> Given nocona should be the target instruction set

Is that a hard constraint? Having a fast popcnt is rather important for Arrow...
(or we'll have to implement a runtime switch :-/)
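For illustration only, the "runtime switch" idea amounts to picking an implementation once, based on what the running CPU reports, instead of fixing the choice at build time. The sketch below shows the pattern in Python; it is not how Arrow implements or would implement it. All names are hypothetical, `popcount_fast` is only a stand-in for a popcnt-backed native kernel, and how the feature set is obtained is left out.

```python
def popcount_portable(x):
    """Baseline bit count needing no special CPU support (Kernighan's method).

    Assumes x is a non-negative integer.
    """
    count = 0
    while x:
        x &= x - 1  # clear the lowest set bit
        count += 1
    return count


def popcount_fast(x):
    """Stand-in for a hardware popcnt path (hypothetical placeholder).

    In a native library this would call code compiled with popcnt enabled;
    here it only needs to return the same answer as the portable version.
    """
    return bin(x).count("1")


def select_popcount(cpu_features):
    """Choose an implementation once, from a set of CPU feature names."""
    if "popcnt" in cpu_features:
        return popcount_fast
    return popcount_portable


# The selection happens a single time; callers then use the chosen function.
popcount = select_popcount({"sse2", "popcnt"})
```

The point of the pattern is that the baseline build stays loadable on every target CPU, while the faster path is only ever reached on machines that advertise the feature.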
It's from the Anaconda toolchain:
$ conda create -n _tmp_gcc2 gcc_linux-64 -q -y
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: <path>/envs/_tmp_gcc2
added / updated specs:
- gcc_linux-64
The following NEW packages will be INSTALLED:
_libgcc_mutex pkgs/main/linux-64::_libgcc_mutex-0.1-main
binutils_impl_lin~ pkgs/main/linux-64::binutils_impl_linux-64-2.31.1-h6176602_1
binutils_linux-64 pkgs/main/linux-64::binutils_linux-64-2.31.1-h6176602_8
gcc_impl_linux-64 pkgs/main/linux-64::gcc_impl_linux-64-7.3.0-habb00fd_1
gcc_linux-64 pkgs/main/linux-64::gcc_linux-64-7.3.0-h553295d_8
libgcc-ng pkgs/main/linux-64::libgcc-ng-9.1.0-hdf63c60_0
libstdcxx-ng pkgs/main/linux-64::libstdcxx-ng-9.1.0-hdf63c60_0
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
$ conda activate _tmp_gcc2
$ echo $CFLAGS
-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pip
Ok, these are the technical settings, but what is the policy? Is it possible for a package to require a later ISA extension? @msarahan may know the answer.
Perhaps "should" implies too much; nocona is the default target instruction set.
@pitrou Currently conda-forge builds all packages for nocona. There is the option to require newer features, e.g. a newer glibc. The current approach for that in conda is the newly introduced "virtual packages" (so far only cuda and glibc are handled that way), which would also be a way to build conda packages by SSE flavour.
Issue:
For x86_64 Linux, some of the pyarrow extension libraries contain instructions from an instruction set newer than nocona (the default for the Anaconda toolchain compilers).
Reproducer:
should yield output like:
The instruction pinsrq is SSE 4.1+; nocona supports MMX, SSE, SSE2 and SSE3. The effect is a SIGILL on attempted load from a CPU without SSE 4.1.
xref: https://github.com/AnacondaRecipes/pyarrow-feedstock/issues/1
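To judge whether a particular Linux machine would hit the SIGILL described here, one can look at the CPU flags the kernel reports. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, it relies on the Linux-specific /proc/cpuinfo file, and it uses the kernel's flag spelling, where SSE4.1 appears as "sse4_1" and SSE3 as "pni".

```python
from pathlib import Path

# Flag needed to execute pinsrq, spelled as in /proc/cpuinfo on Linux.
REQUIRED_FLAGS = {"sse4_1"}


def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """Return the required flags that the CPU does not advertise."""
    return set(required) - cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_flags(cpuinfo.read_text())
        print("missing flags:", sorted(missing) if missing else "none")
```

A non-empty result means the affected libraries would be expected to fail on that machine; an empty result says nothing about other instruction sets the binaries might use.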
Environment (conda list):
Details about conda and system (conda info):