llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

--march=native misidentification #17095

Closed llvmbot closed 2 years ago

llvmbot commented 11 years ago
Bugzilla Link 16721
Resolution FIXED
Resolved on Jul 29, 2013 06:03
Version 3.3
OS Linux
Reporter LLVM Bugzilla Contributor
CC @d0k,@DimitryAndric

Extended Description

Compiling with --march=native on one of my machines (a Core 2-era Pentium Dual-Core) misidentifies the chip as penryn instead of core2 (the proper -march flag for this CPU).

Here's the output of gcc and clang for comparison, as well as /proc/cpuinfo for reference.

    : | gcc -v -E -march=native -
    Using built-in specs.
    COLLECT_GCC=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.4/gcc
    COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.4/lto-wrapper
    Target: x86_64-pc-linux-gnu
    Configured with: /var/tmp/portage/sys-devel/gcc-4.5.4/work/gcc-4.5.4/configure --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.4 --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/include --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.4 --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.4/man --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.4/info --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/include/g++-v4 --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec --disable-fixed-point --without-cloog --without-ppl --disable-lto --enable-nls --without-included-gettext --with-system-zlib --enable-obsolete --disable-werror --enable-secureplt --enable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.5.4/python --enable-checking=release --enable-libstdcxx-time --enable-objc-gc --enable-languages=c,c++,java,objc,obj-c++,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --enable-targets=all --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.5.4 p1.2, pie-0.4.7'
    Thread model: posix
    gcc version 4.5.4 (Gentoo 4.5.4 p1.2, pie-0.4.7)
    COLLECT_GCC_OPTIONS='-v' '-E'
     /usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.4/cc1 -E -quiet -v - -D_FORTIFY_SOURCE=2 -march=core2 -mcx16 -msahf --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=core2
    ignoring nonexistent directory "/usr/local/include"
    ignoring nonexistent directory "/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/../../../../x86_64-pc-linux-gnu/include"
    #include "..." search starts here:
    #include <...> search starts here:
     /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/include
     /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/include-fixed
     /usr/include
    End of search list.
    # 1 ""
    # 1 ""
    # 1 ""
    # 1 ""
    COMPILER_PATH=/usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.4/:/usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.4/:/usr/libexec/gcc/x86_64-pc-linux-gnu/:/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/:/usr/lib/gcc/x86_64-pc-linux-gnu/:/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/../../../../x86_64-pc-linux-gnu/bin/
    LIBRARY_PATH=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/:/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/../../../../x86_64-pc-linux-gnu/lib/:/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.4/../../../:/lib/:/usr/lib/
    COLLECT_GCC_OPTIONS='-v' '-E'

    : | clang -v -E -march=native -
    clang version 3.3 (tags/RELEASE_33/final)
    Target: x86_64-pc-linux-gnu
    Thread model: posix
     "/usr/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -E -disable-free -disable-llvm-verifier -main-file-name - -mrelocation-model static -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu penryn -target-linker-version 2.23.1 -v -resource-dir /usr/bin/../lib64/clang/3.3 -internal-isystem /usr/local/include -internal-isystem /usr/bin/../lib64/clang/3.3/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir /home/elena -ferror-limit 19 -fmessage-length 126 -mstackrealign -fobjc-runtime=gcc -fobjc-default-synthesize-properties -fdiagnostics-show-option -fcolor-diagnostics -backend-option -vectorize-loops -o - -x c -
    clang -cc1 version 3.3 based upon LLVM 3.3 default target x86_64-pc-linux-gnu
    ignoring nonexistent directory "/usr/local/include"
    ignoring nonexistent directory "/include"
    #include "..." search starts here:
    #include <...> search starts here:
     /usr/bin/../lib64/clang/3.3/include
     /usr/include
    End of search list.
    # 1 ""
    # 1 "" 1
    # 1 "" 3
    # 162 "" 3
    # 1 "" 1
    # 1 "" 2
    # 1 "" 2

    cat /proc/cpuinfo
    processor : 0
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
    stepping : 10
    microcode : 0xa07
    cpu MHz : 2603.000
    cache size : 2048 KB
    physical id : 0
    siblings : 2
    core id : 0
    cpu cores : 2
    apicid : 0
    initial apicid : 0
    fpu : yes
    fpu_exception : yes
    cpuid level : 13
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
    bogomips : 5188.99
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:

    processor : 1
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Pentium(R) Dual-Core CPU E5300 @ 2.60GHz
    stepping : 10
    microcode : 0xa07
    cpu MHz : 2603.000
    cache size : 2048 KB
    physical id : 0
    siblings : 2
    core id : 1
    cpu cores : 2
    apicid : 1
    initial apicid : 1
    fpu : yes
    fpu_exception : yes
    cpuid level : 13
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
    bogomips : 5188.99
    clflush size : 64
    cache_alignment : 64
    address sizes : 36 bits physical, 48 bits virtual
    power management:

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#16722

d0k commented 11 years ago

Some low-end penryns have SSE4 disabled. In r187350 I fixed LLVM to check for SSE4.1 and to detect those chips as core2 if it's missing.
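For readers who want to see the shape of that check, here is a minimal sketch of the idea, not the actual code from r187350. The function name `guessIntelCpuName` and the "generic" fallback are made up for illustration; the family/model numbers (6 and 23) are the ones shown in the /proc/cpuinfo output of this report, and model 15 is included only as the usual 65 nm Core 2 value.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not LLVM's real
// host-detection code): maps a CPUID family/model pair plus the SSE4.1
// feature bit to a -march style CPU name.
std::string guessIntelCpuName(unsigned family, unsigned model, bool hasSSE41) {
    if (family != 6)
        return "generic";  // placeholder fallback, assumed for this sketch
    switch (model) {
    case 15:  // 65 nm Core 2 parts
        return "core2";
    case 23:  // 45 nm Penryn-class parts, including the E5300 in this report
        // Some low-end parts ship with SSE4.1 disabled, so the model number
        // alone is not enough to pick "penryn".
        return hasSSE41 ? "penryn" : "core2";
    default:
        return "generic";
    }
}
```

With this sketch, `guessIntelCpuName(6, 23, false)` yields "core2" for the reporter's chip, while a model 23 part that does report SSE4.1 still yields "penryn".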

llvmbot commented 11 years ago

Let me correct myself then, since I've been gotcha'd: your compiler is wrong about what my chip can do.

I did some more poking around, and code fails horribly with illegal instruction calls on -march=native and -march=penryn, but setting -mno-sse4.2 fixes the issue. So it seems like -march=penryn is wrongly set to believe that sse4.2 exists.

Correcting my correcting myself: -mno-sse4 is fine, -mno-sse4.2 isn't. I don't know what's up with that, since technically (based off of documentation) my chip SHOULD support sse4.1. It just apparently doesn't.
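One way to double-check what the chip actually advertises is to look at the flags line from /proc/cpuinfo: Linux lists SSE4.1 and SSE4.2 there as `sse4_1` and `sse4_2`, and the flags in this report contain `ssse3` but neither of those. A rough sketch of that lookup follows; the helper name `hasCpuFlag` is made up for illustration, and reading the flags line out of /proc/cpuinfo is left to the caller.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `flag` appears as a whole
// whitespace-separated token in a /proc/cpuinfo style "flags" value.
// Matching whole tokens avoids treating "sse" as present just because
// "ssse3" is in the list.
bool hasCpuFlag(const std::string& flags, const std::string& flag) {
    std::istringstream in(flags);
    std::string token;
    while (in >> token) {
        if (token == flag)
            return true;
    }
    return false;
}
```

Called on the flags from this report, `hasCpuFlag(flags, "ssse3")` is true and `hasCpuFlag(flags, "sse4_1")` is false, which matches the observation that -mno-sse4 is needed on this machine.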

DimitryAndric commented 11 years ago

Actually, your CPU is of the Penryn architecture. Please refer to:

https://en.wikipedia.org/wiki/Penryn_%28microarchitecture%29

and:

https://en.wikipedia.org/wiki/List_of_Intel_Pentium_Dual-Core_microprocessors#.22Wolfdale-3M.22_.2845_nm.29

Apparently gcc just shows this as "core2", see line 661 in this file:

http://gcc.gnu.org/viewcvs/gcc/trunk/gcc/config/i386/driver-i386.c?revision=200744&view=markup

llvmbot commented 11 years ago

Bug llvm/llvm-bugzilla-archive#16722 has been marked as a duplicate of this bug.