Open joesavage opened 5 years ago
Thank you for the detailed report. So far, I couldn't reproduce the problem. I don't have the exact same version of gcc you are using. Before I try to install 7.3.0, could you tell me which architecture you are running your build on? The actual value of -march=native will depend on it.
Did you try adding -fno-reorder-blocks-and-partition to the gcc flags? In certain cases it might help.
Lastly, I've noticed that the created executable is stripped. Could you modify the makefile to not strip the binary?
Sure, I'm running on a Xeon W-2195 and have pasted the output of a few exploratory gcc commands below.
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.3.0-27ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)
$ gcc -march=native -E -v - </dev/null 2>&1 | grep cc1
/usr/lib/gcc/x86_64-linux-gnu/7/cc1 -E -quiet -v -imultiarch x86_64-linux-gnu - -march=knl -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -mno-sgx -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mrtm -mhle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mavx512f -mno-avx512er -mavx512cd -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec -mxsaves -mavx512dq -mavx512bw -mavx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mclwb -mno-mwaitx -mno-clzero -mno-pku -mno-rdpid --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=25344 -mtune=generic -fstack-protector-strong -Wformat -Wformat-security
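For anyone skimming the cc1 line above: on this machine gcc appears to resolve -march=native to -march=knl with a long list of explicit -m/-mno- overrides. A minimal Python sketch for pulling those apart is below. It is not part of gcc or BOLT; the function name summarize_cc1_flags and the shortened sample line are assumptions made for illustration only.

```python
import shlex


def summarize_cc1_flags(cc1_line):
    """Split a cc1 command line into the resolved -march/-mtune values and
    the enabled (-mfoo) and disabled (-mno-foo) machine options."""
    summary = {"march": None, "mtune": None, "enabled": [], "disabled": []}
    for token in shlex.split(cc1_line):
        if token.startswith("-march="):
            summary["march"] = token.split("=", 1)[1]
        elif token.startswith("-mtune="):
            summary["mtune"] = token.split("=", 1)[1]
        elif token.startswith("-mno-"):
            summary["disabled"].append(token[len("-mno-"):])
        elif token.startswith("-m"):
            summary["enabled"].append(token[len("-m"):])
    return summary


# Shortened excerpt of the cc1 line from the report (most flags omitted).
SAMPLE_CC1 = (
    "/usr/lib/gcc/x86_64-linux-gnu/7/cc1 -E -quiet -v -imultiarch "
    "x86_64-linux-gnu - -march=knl -mmmx -mno-3dnow -mavx512f "
    "-mno-avx512er -mtune=generic"
)

if __name__ == "__main__":
    print(summarize_cc1_flags(SAMPLE_CC1))
```

Feeding it the full line from `gcc -march=native -E -v - </dev/null 2>&1 | grep cc1` gives a quick way to compare what -march=native expands to on two different machines.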
After passing --disable-strip to the configure script and adding -fno-reorder-blocks-and-partition to CFLAGS and CXXFLAGS, I get the following output:
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x600000, offset 0x600000
BOLT-INFO: disabling -align-macro-fusion in non-relocation mode
BOLT-INFO: forcing -jump-tables=move as PIC jump table was detected in function _ZN5boost2io6detail22parse_printf_directiveIcSt11char_traitsIcESaIcEN9__gnu_cxx17__normal_iteratorIPKcNSt7__cxx1112basic_stringIcS4_S5_EEEESt5ctypeIcEEEbRT2_RKSG_PNS1_11format_itemIT_T0_T1_EERKT3_mh
llvm-bolt: ../tools/llvm-bolt/src/BinaryFunction.cpp:1477: void llvm::bolt::BinaryFunction::postProcessJumpTables(): Assertion `I > 1 && "jump table with a size smaller than 1 detected"' failed.
#0 0x000056492a2283a5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) path_to_llvm/llvm/build/../lib/Support/Unix/Signals.inc:398:0
#1 0x000056492a228438 PrintStackTraceSignalHandler(void*) path_to_llvm/llvm/build/../lib/Support/Unix/Signals.inc:462:0
#2 0x000056492a22660d llvm::sys::RunSignalHandlers() path_to_llvm/llvm/build/../lib/Support/Signals.cpp:49:0
#3 0x000056492a227c11 SignalHandler(int) path_to_llvm/llvm/build/../lib/Support/Unix/Signals.inc:252:0
#4 0x00007f8d5309a890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
#5 0x00007f8d51f68e97 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x3ee97)
#6 0x00007f8d51f6a801 abort (/lib/x86_64-linux-gnu/libc.so.6+0x40801)
#7 0x00007f8d51f5a39a (/lib/x86_64-linux-gnu/libc.so.6+0x3039a)
#8 0x00007f8d51f5a412 (/lib/x86_64-linux-gnu/libc.so.6+0x30412)
#9 0x0000564928e63002 llvm::bolt::BinaryFunction::postProcessJumpTables() path_to_llvm/llvm/build/../tools/llvm-bolt/src/BinaryFunction.cpp:1478:0
#10 0x0000564928e62d04 llvm::bolt::BinaryFunction::disassemble(llvm::ArrayRef<unsigned char>) path_to_llvm/llvm/build/../tools/llvm-bolt/src/BinaryFunction.cpp:974:0
#11 0x0000564928fab975 llvm::bolt::RewriteInstance::disassembleFunctions() path_to_llvm/llvm/build/../tools/llvm-bolt/src/RewriteInstance.cpp:2525:0
#12 0x0000564928fa02b4 operator() path_to_llvm/llvm/build/../tools/llvm-bolt/src/RewriteInstance.cpp:1007:0
#13 0x0000564928fa02b4 llvm::bolt::RewriteInstance::run()::'lambda'(std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > const&)::operator()(std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > const&) const (path_to_llvm/llvm/build/bin/llvm-bolt+0x35a2b4)
#14 0x0000564928fa0558 llvm::bolt::RewriteInstance::run() path_to_llvm/llvm/build/../tools/llvm-bolt/src/RewriteInstance.cpp:1035:0
#15 0x0000564928e0bdf1 main path_to_llvm/llvm/build/../tools/llvm-bolt/src/llvm-bolt.cpp:312:0
#16 0x00007f8d51f4bb97 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b97)
#17 0x0000564928e0a5aa _start (path_to_llvm/llvm/build/bin/llvm-bolt+0x1c45aa)
Stack dump:
0. Program arguments: path_to_llvm/llvm/build/bin/llvm-bolt build/bin/povray -o build/bin/povray.bolt
Aborted (core dumped)
And with -debug:
<snip>
Checking for PIC jump table
checking potential PIC jump table
BOLT-DEBUG: addressed memory is 0x2a1b00
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b00, which contains value 75668
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b04, which contains value 75668
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b08, which contains value 75580
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b0c, which contains value 755a8
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b10, which contains value 755a8
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b14, which contains value 754a0
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b18, which contains value 754a0
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b1c, which contains value 75548
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b20, which contains value 75548
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b24, which contains value 755a8
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b28, which contains value 755a8
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b2c, which contains value 754d7
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b30, which contains value 75960
BOLT-DEBUG: creating jump table JUMP_TABLE/_ZN3vfe15VirtualFrontEnd4StopEv.0 in function _ZN3vfe15VirtualFrontEnd4StopEv with 12 entries.
BOLT-DEBUG: truncating jump table JUMP_TABLE/_ZN3vfe15VirtualFrontEnd4StopEv.0 at index 0 containing offset 0x208
llvm-bolt: ../tools/llvm-bolt/src/BinaryFunction.cpp:1477: void llvm::bolt::BinaryFunction::postProcessJumpTables(): Assertion `I > 1 && "jump table with a size smaller than 1 detected"' failed.
#0 0x0000560d307233a5 llvm::sys::PrintStackTrace(llvm::raw_ostream&) path_to_llvm/llvm/build/../lib/Support/Unix/Signals.inc:398:0
#1 0x0000560d30723438 PrintStackTraceSignalHandler(void*) path_to_llvm/llvm/build/../lib/Support/Unix/Signals.inc:462:0
#2 0x0000560d3072160d llvm::sys::RunSignalHandlers() path_to_llvm/llvm/build/../lib/Support/Signals.cpp:49:0
#3 0x0000560d30722c11 SignalHandler(int) path_to_llvm/llvm/build/../lib/Support/Unix/Signals.inc:252:0
#4 0x00007fdb870a5890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
#5 0x00007fdb85f73e97 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x3ee97)
#6 0x00007fdb85f75801 abort (/lib/x86_64-linux-gnu/libc.so.6+0x40801)
#7 0x00007fdb85f6539a (/lib/x86_64-linux-gnu/libc.so.6+0x3039a)
#8 0x00007fdb85f65412 (/lib/x86_64-linux-gnu/libc.so.6+0x30412)
#9 0x0000560d2f35e002 llvm::bolt::BinaryFunction::postProcessJumpTables() path_to_llvm/llvm/build/../tools/llvm-bolt/src/BinaryFunction.cpp:1478:0
#10 0x0000560d2f35dd04 llvm::bolt::BinaryFunction::disassemble(llvm::ArrayRef<unsigned char>) path_to_llvm/llvm/build/../tools/llvm-bolt/src/BinaryFunction.cpp:974:0
#11 0x0000560d2f4a6975 llvm::bolt::RewriteInstance::disassembleFunctions() path_to_llvm/llvm/build/../tools/llvm-bolt/src/RewriteInstance.cpp:2525:0
#12 0x0000560d2f49b2b4 operator() path_to_llvm/llvm/build/../tools/llvm-bolt/src/RewriteInstance.cpp:1007:0
#13 0x0000560d2f49b2b4 llvm::bolt::RewriteInstance::run()::'lambda'(std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > const&)::operator()(std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> > const&) const (path_to_llvm/llvm/build/bin/llvm-bolt+0x35a2b4)
#14 0x0000560d2f49b558 llvm::bolt::RewriteInstance::run() path_to_llvm/llvm/build/../tools/llvm-bolt/src/RewriteInstance.cpp:1035:0
#15 0x0000560d2f306df1 main path_to_llvm/llvm/build/../tools/llvm-bolt/src/llvm-bolt.cpp:312:0
#16 0x00007fdb85f56b97 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b97)
#17 0x0000560d2f3055aa _start (path_to_llvm/llvm/build/bin/llvm-bolt+0x1c45aa)
Stack dump:
0. Program arguments: path_to_llvm/llvm/build/bin/llvm-bolt build/bin/povray -debug -o build/bin/povray.bolt
Aborted (core dumped)
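To make the -debug output above easier to inspect, here is a rough Python sketch that extracts the jump table entries from the BOLT-DEBUG lines and reports the spacing between consecutive entry addresses. It assumes the log wording shown above and treats the unprefixed "value" field as hexadecimal (several values contain hex digits such as a and d). The helper names are hypothetical and nothing here is BOLT code.

```python
import re

# Matches lines such as:
# BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b00,
# which contains value 75668
ENTRY_RE = re.compile(
    r"indirect jmp at 0x([0-9a-f]+) is referencing address 0x([0-9a-f]+), "
    r"which contains value ([0-9a-f]+)"
)


def parse_jump_table_entries(log_text):
    """Return a list of (jump_address, entry_address, entry_value) tuples."""
    entries = []
    for match in ENTRY_RE.finditer(log_text):
        jump, address, value = (int(group, 16) for group in match.groups())
        entries.append((jump, address, value))
    return entries


def entry_strides(entries):
    """Return the set of distances between consecutive entry addresses."""
    return {b[1] - a[1] for a, b in zip(entries, entries[1:])}


# First three entry lines copied from the debug output in this thread.
SAMPLE_LOG = """\
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b00, which contains value 75668
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b04, which contains value 75668
BOLT-DEBUG: indirect jmp at 0x7549e is referencing address 0x2a1b08, which contains value 75580
"""

if __name__ == "__main__":
    parsed = parse_jump_table_entries(SAMPLE_LOG)
    print(len(parsed), "entries, strides:", entry_strides(parsed))
```

Running it over the full log makes it easy to compare the number of entries BOLT read against the "with 12 entries" figure in the "creating jump table" line.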
I've been running some tests with BOLT and the SPEC CPU2017 benchmarks recently and have encountered a couple of issues. While I can't currently reproduce the main issue I'm experiencing with 511.povray_r (where the rewritten binary segfaults if -Wl,-q is added to OPTIMIZE when compiling) using the open source version of povray, I thought I'd report an issue I ran into along the way that seems to apply to quite a few of the workloads I've tested.
Essentially, when compiling with -march=native using gcc 7.3.0 on Ubuntu 18.04, I frequently hit the assertion in BinaryFunction.cpp about jump tables with a size smaller than one. This is reproducible, for instance, with the following steps:
Modifying povray's configure script to remove the -march=native flag seems to resolve this issue. In the case where this flag is present, though, it would be preferable if BOLT just bailed on the function rather than crashing completely.
For reference, here's the output I get along with the crash:
And with -debug:
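To illustrate the "bail on the function rather than crash" behaviour requested above, here is a toy Python model. It is purely a sketch of the control flow and says nothing about how BOLT is or should be implemented; JumpTableError, validate_jump_table, and process_functions are hypothetical names, and the "fewer than one entry" check is only a stand-in for the real assertion.

```python
class JumpTableError(Exception):
    """Raised when a (hypothetical) jump table fails validation."""


def validate_jump_table(name, entry_count):
    # Toy check standing in for the real assertion: reject empty tables.
    if entry_count < 1:
        raise JumpTableError(f"{name}: jump table has {entry_count} entries")


def process_functions(functions):
    """Return (processed, skipped) function names.

    A function whose jump table fails validation is skipped, and the
    remaining functions are still processed instead of aborting the run.
    """
    processed, skipped = [], []
    for name, entry_count in functions:
        try:
            validate_jump_table(name, entry_count)
        except JumpTableError:
            skipped.append(name)
            continue
        processed.append(name)
    return processed, skipped


if __name__ == "__main__":
    # Made-up function names and entry counts, for illustration only.
    print(process_functions([("a", 3), ("b", 0), ("c", 12)]))
```

The point of the sketch is only that a per-function failure is recorded and the loop continues, so one problematic function leaves the rest of the binary eligible for optimization.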