open-power-sdk / pveclib

Power Vector Library
Apache License 2.0
29 stars 8 forks source link

Compiler bug in vec_cpsgn reverses operands. #158

Open munroesj52 opened 2 years ago

munroesj52 commented 2 years ago

Seems the GCC (and Clang followed) initially reversed the operands for vec_cpsgn() (so the sign is copied from operand b into a).

This differs from the Intrinsic reference and the ISA say the sign is copied from operand a into b.

Unfortunately PVECLIB followed the original vec_cpsgn() for the implementation of: vec_copysignf32(), vec_copysignf64(), and vec_copysignf128()

This has been reported as a bug and now GCC and Clang are in the process of "fixing" this bug to match the Intrinsic reference guide. The fix will be applied to currently supported versions but older compiler version will remain unchanged.

PVECLIBs vec_copysignf32() and vec_copysignf64() used vec_cpsgn() as the implementation for __ARCH_PWR7 and above. So applications may see incompatible changes depending on the compiler used. vec_copysignf128() will retain the original implementation and operand order.

So obviously PVECLIB should insure that applications get consistent results across compiler versions. But should PVECLIB retain the existing (self consistent and documented) operand order. Or change to match intended order defined in the Intrinsic reference and the ISA?

ThinkOpenly commented 2 years ago

I vote for being consistent with the intended / fixed behavior of the ISA and Intrinsics reference. The barrier to entry/exit from using pveclib should be low, so matching future / long-term behavior is preferred.

munroesj52 commented 2 years ago

This is a configuration nightmare. The compile changes to fix vec_cpsgn will be applied to GCC 9 and later (clang12 and later), but not earlier versions. The fixes will take awhile to get accepted by GCC/Clang community, then the distro's, who will schedule and distribute updates, and finally applied on the users system. Clocked in random months.

So we need a fail-safe solution that will provide the expected/correct result for:

I am thinking something like this:

if _ARCH_PWR7

ifdef PVECLIB_CPSGN_FIXED

return (vec_cpsgn (vf32x, vf32y));

else

vf32_t result; asm( "xvcpsgnsp %x0,%x1,%x2;\n" : "=wa" (result) : "wa" (vf32x), "wa" (vf32y) :); return (result);

endif

else

...

endif

This assumes that any compiler that defines _ARCH_PWR7 will assemble xvcpsgnsp/xvcpsgndp. So if PVECLIB_CPSGN_FIXED is not defined we will generate reasonable code that is correct but perhaps not as optimal as the vec_cpsgn built-in would generate.

I can add a configure test the will compile and test vec_cpsgn() for correct results and define PVECLIB_CPSGN_FIXED for the PVECLIB build. But unless applications use this (or equivalent) configure test, that application will get the fail-safe code.

The alternative is to add some really complicated cpp logic to vec_common_ppc.h (or perhaps add a vec_config_ppc.h) that defines PVECLIB_CPSGN_FIXED if safe and appropriate. This requires checking multiple compiler releases for specific major and patch levels. Most of which we will not know until I see specific compilers released into the wild and can test them.