Lokathor / wide

A crate to help you go wide. By which I mean use SIMD stuff.
https://docs.rs/wide
zlib License

Implement variable shift left/right and ease of test coverage with generic trait #157

Closed mcroomp closed 4 months ago

mcroomp commented 4 months ago

Functionality: Add new shl / shr implementations to u32x8 and u32x4 that shift each lane by the amount held in the corresponding lane of the right-hand SIMD operand. This is implemented efficiently in AVX2 and Neon. It is useful for dividing by constants via algorithms like those in libdivide. Added an example implementation of branch-free divide in t_usefulness.
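To illustrate why a per-lane shift helps with division by constants, here is a minimal scalar sketch in plain Rust. It does not use the wide crate: `shr_lanes`, `shl_lanes`, `magic`, and `div_lanes` are hypothetical names chosen for this example, arrays stand in for SIMD registers, and masking the shift count via `wrapping_shr` / `wrapping_shl` is an assumption about out-of-range behavior. The division uses the classic Granlund-Montgomery multiply-and-shift scheme, which may differ from the exact code in t_usefulness.

```rust
/// Scalar model of a per-lane shift right: lane i of `a` is shifted by lane i of `s`.
/// Assumption: shift counts are masked to 0..=31 (the behavior of `wrapping_shr`).
fn shr_lanes(a: [u32; 8], s: [u32; 8]) -> [u32; 8] {
    std::array::from_fn(|i| a[i].wrapping_shr(s[i]))
}

/// Scalar model of a per-lane shift left, with the same masking assumption.
fn shl_lanes(a: [u32; 8], s: [u32; 8]) -> [u32; 8] {
    std::array::from_fn(|i| a[i].wrapping_shl(s[i]))
}

/// Precompute (multiplier, sh1, sh2) for dividing by a nonzero `d`
/// using the Granlund-Montgomery scheme.
fn magic(d: u32) -> (u32, u32, u32) {
    assert!(d != 0, "divisor must be nonzero");
    let l = 32 - (d - 1).leading_zeros(); // ceil(log2(d))
    let m = ((1u64 << 32) * ((1u64 << l) - d as u64) / d as u64 + 1) as u32;
    (m, l.min(1), l.saturating_sub(1))
}

/// Divide every lane of `n` by the matching lane of `d` without branching
/// on the data. The shift amounts differ per lane, which is where a
/// variable (per-lane) shift is needed.
fn div_lanes(n: [u32; 8], d: [u32; 8]) -> [u32; 8] {
    let params: [(u32, u32, u32); 8] = std::array::from_fn(|i| magic(d[i]));
    let sh1: [u32; 8] = std::array::from_fn(|i| params[i].1);
    let sh2: [u32; 8] = std::array::from_fn(|i| params[i].2);

    // t = high 32 bits of the 64-bit product multiplier * n.
    let t: [u32; 8] =
        std::array::from_fn(|i| ((params[i].0 as u64 * n[i] as u64) >> 32) as u32);
    // t <= n always holds, so the subtraction cannot underflow.
    let diff: [u32; 8] = std::array::from_fn(|i| n[i] - t[i]);
    let half = shr_lanes(diff, sh1);
    // t + ((n - t) >> sh1) never exceeds n, so the addition cannot overflow.
    let sum: [u32; 8] = std::array::from_fn(|i| t[i] + half[i]);
    shr_lanes(sum, sh2)
}
```

In a real SIMD version the multiplier and shift lanes would be computed once per divisor and reused, so the hot loop contains only a multiply-high, a subtract, an add, and two variable shifts.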

Bug fixes: Better testing exposed a bug in u32x8::max and u32x8::min on AVX2, which were calling the signed versions of the operations instead of the unsigned ones.
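The effect of this kind of mix-up can be reproduced in scalar Rust. The sketch below is illustrative only: `max_u32_via_signed` is a hypothetical function that mimics comparing unsigned lanes with a signed comparison, which is presumably what happens when an intrinsic such as `_mm256_max_epi32` is used where `_mm256_max_epu32` was intended.

```rust
/// Mimics the bug: the bits of each u32 are reinterpreted as i32 before comparing,
/// so any value with the top bit set looks negative and loses to small values.
fn max_u32_via_signed(a: u32, b: u32) -> u32 {
    (a as i32).max(b as i32) as u32
}

/// The intended behavior: a plain unsigned comparison.
fn max_u32(a: u32, b: u32) -> u32 {
    a.max(b)
}
```

The two functions agree whenever both inputs are below 2^31, which is why tests that only use small values would not catch the problem.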

Testing improvements: Rather than manually calculating the correct scalar output to verify each SIMD operation, add a trait, similar to the one in the portable Simd library, that implements the basic to/from array conversions so that the test code can be written once as a generic instead of being copy/pasted.
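A rough sketch of what such a generic test helper could look like follows. The trait name `Lanes`, the helper `check_binary_op`, and the stand-in type `ToyU32x4` are all hypothetical and are not the names used in the PR; the toy type wraps a plain array so the example compiles without the wide crate.

```rust
use std::fmt::Debug;

/// Hypothetical trait: convert a SIMD-like value to and from an array of lanes.
trait Lanes<T, const N: usize>: Copy {
    fn from_array(a: [T; N]) -> Self;
    fn to_array(self) -> [T; N];
}

/// Stand-in for a real SIMD type, backed by an ordinary array.
#[derive(Clone, Copy)]
struct ToyU32x4([u32; 4]);

impl ToyU32x4 {
    /// Lane-wise unsigned maximum.
    fn max(self, rhs: Self) -> Self {
        ToyU32x4(std::array::from_fn(|i| self.0[i].max(rhs.0[i])))
    }
}

impl Lanes<u32, 4> for ToyU32x4 {
    fn from_array(a: [u32; 4]) -> Self {
        ToyU32x4(a)
    }
    fn to_array(self) -> [u32; 4] {
        self.0
    }
}

/// Generic check: apply the vector operation to whole vectors and the scalar
/// operation lane by lane, then require that the results match.
fn check_binary_op<V, T, const N: usize>(
    simd_op: impl Fn(V, V) -> V,
    scalar_op: impl Fn(T, T) -> T,
    a: [T; N],
    b: [T; N],
) where
    V: Lanes<T, N>,
    T: Copy + PartialEq + Debug,
{
    let got = simd_op(V::from_array(a), V::from_array(b)).to_array();
    let want: [T; N] = std::array::from_fn(|i| scalar_op(a[i], b[i]));
    assert_eq!(got, want);
}
```

A test for a new operation then reduces to one call, for example `check_binary_op(|x: ToyU32x4, y: ToyU32x4| x.max(y), |x: u32, y: u32| x.max(y), [1, 0x8000_0000, 3, 4], [2, 1, 3, 0])`, and the same helper can be reused for every type that implements the trait.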

I can add this to the other types if you think it is a useful enhancement, but I didn't want to do too much before getting your feedback.