AOSC-Archive / autobuild3

AOSC OS package maintenance toolkit (version 3)
https://aosc.io
GNU General Public License v2.0

RFC: turn on fun & safe math optimizations for NEON #108

Open Artoria2e5 opened 4 years ago

Artoria2e5 commented 4 years ago

https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html

> If the selected floating-point hardware includes the NEON extension (e.g. -mfpu=neon), note that floating-point operations are not generated by GCC’s auto-vectorization pass unless -funsafe-math-optimizations is also specified. This is because NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic (in particular denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision.

Icenowy says v8 neon is ok, but we are still doing arm7hf so... (also, do people really care about denormals?)
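For illustration, here is a minimal sketch of the kind of loop the quoted paragraph is about. The function name `saxpy` and the compile commands in the comments are assumptions for the example, not taken from autobuild3; per the GCC documentation quoted above, a 32-bit ARM build should only vectorize this float loop with NEON when `-funsafe-math-optimizations` is also given.

```c
#include <stddef.h>

/*
 * y[i] = a * x[i] + y[i]
 *
 * A plain single-precision loop that is a typical auto-vectorization
 * candidate. Hypothetical build commands (assuming an armv7 hard-float
 * cross compiler such as arm-linux-gnueabihf-gcc):
 *
 *   gcc -O3 -mfpu=neon -mfloat-abi=hard -S saxpy.c
 *   gcc -O3 -mfpu=neon -mfloat-abi=hard -funsafe-math-optimizations -S saxpy.c
 *
 * Comparing the two assembly outputs should show whether NEON vector
 * instructions were generated for the loop body.
 */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

The `restrict` qualifiers only tell the compiler that `x` and `y` do not overlap, which removes one unrelated obstacle to vectorization; they do not change the floating-point question discussed here.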

KexyBiscuit commented 4 years ago

Adding Armv8-A AArch64 as a new arch is an option, isn't it?

Icenowy commented 4 years ago

@KexyBiscuit our arm64 port is already ARMv8-A AArch64.

@Artoria2e5 I think accuracy is still a problem. Maybe it can be an option that can be enabled per package.

KexyBiscuit commented 4 years ago

@Icenowy What a typo... I meant AArch32.

Icenowy commented 4 years ago

@KexyBiscuit for ARMv8 AArch32, this port is nearly useless -- the only suitable ARM Cortex(R) core is Cortex-A32.

Artoria2e5 commented 4 years ago

Test case: https://news.ycombinator.com/item?id=13244168

No, even with -march=armv8-a set, GCC does not give a shit about NEON intrinsics without fun & safe. Intrinsics!

KexyBiscuit commented 4 years ago

@Icenowy It's useful for devices that have an AArch64 processor but run an AArch32 OS, like the Raspberry Pi.

Icenowy commented 4 years ago

@KexyBiscuit currently I don't know of any other device that officially supports an AArch32 kernel on an AArch64 CPU -- except for some Qualcomm ones which we may never be able to support.

Artoria2e5 commented 4 years ago

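In practice, very few people actually care about denormals; even Intel's compiler drops denormal handling at anything other than O0... Honestly, nobody is going to care about this level of precision.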
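To make concrete what is at stake, the sketch below shows which values count as denormal and what treating them as zero does. The helpers `is_denormal` and `flush_to_zero` are hypothetical names for this example; `flush_to_zero` only emulates in software the behavior the GCC documentation attributes to NEON, it does not touch any hardware mode.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* True when x is a subnormal (denormal) float, i.e. nonzero but
 * smaller in magnitude than FLT_MIN (about 1.18e-38). */
bool is_denormal(float x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}

/* Software emulation of flush-to-zero: denormal inputs become a
 * zero of the same sign, every other value passes through unchanged. */
float flush_to_zero(float x)
{
    if (is_denormal(x))
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}
```

For example, `flush_to_zero(1e-40f)` returns zero because `1e-40f` lies below `FLT_MIN`, while `flush_to_zero(FLT_MIN)` and any ordinary value such as `1.5f` come back unchanged. The loss of precision is therefore confined to magnitudes below roughly 1.18e-38 in single precision.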