Closed atrosinenko closed 4 years ago
Now, according to some fuzzing that compares __divdf3 results with ones obtained via native x86_64 CPU instructions, it correctly computes division result with the default rounding for float64. While the original question was about float128, current implementation is generic for all three sizes tuned with two defines: iterations count performed with half and full integer width.
I have restored that missing #include
. There was another serious issue with this PR: the __divXf3
function was not static
, so it was defined in up to three different object files. Most probably, this is an UB but it would be surprising for me if it can lead to undefined symbol
...
Just squashed into two commits. The next goal is to improve both commits by rewriting "in place".
Just as a note, after ~130M of iterations, test_divtf3 failed with:
a = 0x7FFF3463FAB38A79C8F703404AE51275
b = 0xFFFFEBB15CC76E12857BCAFA1D8BC48C
x = 0x7FFFB463FAB38A79C8F703404AE51275
expected = 0xFFFFEBB15CC76E12857BCAFA1D8BC48C
As I understand, it is just some NaN that should be normally ignored by compareResultLD
.
Now, the "NFC refactoring commit" passes basic unit tests but requires further proof revalidation and fuzzing. Still, it is useful for the purpose of performance debugging.
Another float128 issue now not NaN-related:
a = 0xFFFF630AA1E6ED1F319D498544DB38EA
b = 0x7FFFDD22D65546B177698C9A7176766A
x = 0xFFFFE30AA1E6ED1F319D498544DB38EA
expected = 0x7FFFDD22D65546B177698C9A7176766A
UPD: Probably another NaN mis-handling by fuzzer code. My implementation (actually, the pre-existing part) returned a
turned into quiet NaN. This seems to be according to the standard.
Reviewed on Phabricator:
Uploaded to mainline:
First, try aligning all three softfp division implementations.
My plan is to gradually make these three implementations textually as similar as possible. After that, extract them to
fp_div_impl.inc
file like it is already done for__mul[sdt]f3
helpers.When comparing them I have found some quite strange differences and I'm not sure whether it is intentional optimization or just a code that was copy-pasted and incorrectly updated. Unit tests for softfp division contain just two test cases each. Were they thoroughly debugged at all?..