open-power-sdk / pveclib

Power Vector Library
Apache License 2.0

P9 Float128 add/sub with round to odd, plus P8 equivalent. #159

Closed munroesj52 closed 2 years ago

munroesj52 commented 2 years ago

Power9 provides round-to-odd versions of the QP arithmetic operations. This patch provides P9 and P8 implementations of QP add/sub with round-to-odd. It includes compile, unit, and performance tests.
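
For readers unfamiliar with the rounding mode, here is a minimal sketch of why round-to-odd is useful for an intermediate result that will be rounded again to a narrower format. It models significands as plain integers and is not the pveclib implementation. The helper names (`round_to_odd`, `round_nearest_even`, `narrow_via_odd`) are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Illustrative integer model only; not taken from pveclib.
   `shift` must be in the range 1..63 for both helpers. */

/* Drop the low `shift` bits of x with round-to-odd: truncate, then
   force the least significant kept bit to 1 if any dropped bit was set. */
static uint64_t round_to_odd(uint64_t x, unsigned shift)
{
    uint64_t kept = x >> shift;
    uint64_t lost = x & ((UINT64_C(1) << shift) - 1);
    return kept | (uint64_t)(lost != 0);
}

/* Drop the low `shift` bits of x with round-to-nearest, ties-to-even. */
static uint64_t round_nearest_even(uint64_t x, unsigned shift)
{
    uint64_t kept = x >> shift;
    uint64_t lost = x & ((UINT64_C(1) << shift) - 1);
    uint64_t half = UINT64_C(1) << (shift - 1);
    if (lost > half || (lost == half && (kept & 1)))
        kept += 1;
    return kept;
}

/* Two-step narrowing: round-to-odd to an intermediate width, then
   round-to-nearest-even to the final width. */
static uint64_t narrow_via_odd(uint64_t x, unsigned first, unsigned second)
{
    return round_nearest_even(round_to_odd(x, first), second);
}
```

With `x = 0x1081`, rounding directly to nearest-even with `shift = 8` gives `0x11`. Rounding to nearest-even twice (4 bits, then 4 more) gives `0x10`, because the first step discards the low bit that would have broken the tie. `narrow_via_odd(0x1081, 4, 4)` gives `0x11`, matching the direct result, because the sticky information survives in the odd low bit.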

* src/pveclib/vec_f128_ppc.h [doxygen brief]:
Added micro-benchmark data. Other clean up.
[doxygen f128_softfloat_0_0]: Clarifications.
[doxygen f128_softfloat_0_0_0]: Clarifications.
[doxygen f128_softfloat_IRRN_0_0]: Clarifying Note.
[doxygen f128_softfloat_IRRN_0_1]:
Improve internal representation (IR) overview.
[doxygen f128_softfloat_IRRN_0_2]:
Expand on how the IR impacts rounding.
[doxygen f128_softfloat_0_0_3_2]:
Add overview of Add Quad-Precision with Round-to-Odd.
[doxygen f128_softfloat_0_0_3_3]:
Add overview of Subtract Quad-Precision with Round-to-Odd.
[__clang__]: Clang keeps changing, tried to fix it.
(vec_negf128) [(__FLOAT128__) && (__GNUC__ > 7)]:
Use C arithmetic syntax for negate.
(vec_xsaddqpo, vec_xssubqpo): New inline operations.
(vec_xsmulqpo): Update latency numbers.

* src/testsuite/arith128_test_f128.c
(db_vec_xsaddqpo, db_vec_xssubqpo): New debug implementations.
(test_add_qpo, test_sub_qpo): New unit test functions.
(test_sub_qpo): Add test_add_qpo and test_sub_qpo to test
driver.

* src/testsuite/pveclib_perf.c (test_time_f128):
Add timed tests timed_gcc_addqpn_f128, timed_lib_addqpo_f128,
timed_gcc_subqpn_f128, and timed_lib_subqpo_f128.

* src/testsuite/vec_f128_dummy.c (force_eMin, force_eMin_V0,
test_vec_xsaddqpo, test_vec_xssubqpo, test_genqpo_v0,
test_vec_addqpo, test_vec_addqpo_V1, test_vec_addqpo_V0,
test_negqp_nan_v0, test_vec_subqpo, test_vec_subqpo_V0):
New compile tests. Some are used in unit and perf tests.
(test_gcc_addqpn_f128, test_gcc_subqpn_f128):
New combined compile and perf operation tests.
(test_gcc_mulqpn_f128, test_gcc_mulqpo_f128): Fix
combined compile and perf operation tests.

* src/testsuite/vec_perf_f128.c (test_gcc_addqpn_f128,
test_gcc_subqpn_f128, test_vec_addqpo, test_vec_subqpo):
New externs.
(test_lib_addqpo_f128, test_lib_subqpo_f128):
New inner perf test.
(timed_lib_addqpo_f128, timed_lib_subqpo_f128,
timed_gcc_addqpn_f128, timed_gcc_subqpn_f128):
New timed perf tests.

* src/testsuite/vec_perf_f128.h (timed_gcc_addqpn_f128,
timed_lib_addqpo_f128, timed_gcc_subqpn_f128,
timed_lib_subqpo_f128): New externs.

* src/testsuite/vec_pwr10_dummy.c
[(__GNUC__ == 11) && (__GNUC_MINOR__ > 2)]:
Waiting for fixes in GCC 11.3?

* src/testsuite/vec_pwr9_dummy.c (test_negqp_PWR9):
New compile test.
Various -Wall clean-ups.

Signed-off-by: Steven Munroe <munroesj52@gmail.com>

munroesj52 commented 2 years ago

No comments in over 5 weeks so need to move on.