ridiculousfish / libdivide

Official git repository for libdivide: optimized integer division
http://libdivide.com
Other
1.09k stars 77 forks source link

Fix GCC vector alignment and aliasing issues #93

Closed adbancroft closed 2 years ago

adbancroft commented 2 years ago

Fix #90

Only occurs under -O3 on GCC x86_64 (v9.3 tested). GCC has built in SSE & AVX2 vector types as well as built in vector extensions, including subscription operator. I think the original code was clashing with this.

adbancroft commented 2 years ago

The real solution is to fully implement 16-bit vector division.

ridiculousfish commented 2 years ago

Thanks for tackling this, using unions here is the correct fix, nicely done. I didn't realize that gcc would complain about aliasing vectors through pointers-to-integers.

Regarding the second commit, I think there's no "array overrun" but instead what's happening is that our tester was dividing INT_MIN by -1 (or 64 bit analog) which is undefined behavior. Our tester generates random data and wasn't protecting against that possibility.

This is "fixed" by correctly observing strict aliasing, which implies that the INT_MIN came from the compiler reacting to the UB. The fix to use a union is good and correct, fixing the strict aliasing violations, but it does not address the underlying problem which is that we "test" -1 / INT_MIN and we should make sure that doesn't happen.

Anyways I have some nits with the comments but the overall approach of using unions is right. You can fix the -1/INT_MIN issue or leave it for later and I'll pick it up.

Please squash the two commits together (unless there really is an array overrun which I am missing).

adbancroft commented 2 years ago

The array overrun was caused by my first check in - I should have squashed that commit in my local repo. Fixed.

adbancroft commented 2 years ago

I didn't realize that gcc would complain about aliasing vectors through pointers-to-integers.

To be fair, I probably introduced this when I added 16-bit support and the SIMPLE_VECTOR_DIVISIONmacro.