Lokathor / wide

A crate to help you go wide. By which I mean use SIMD stuff.
https://docs.rs/wide
zlib License

u16x16 #59

Closed RazrFalcon closed 1 month ago

RazrFalcon commented 3 years ago

This one fits into 256 bits, so it can be implemented via AVX.

Lokathor commented 3 years ago

256-bit support has taken a back seat compared to 128-bit types simply for lack of time to do everything at once, but this would be welcomed as a PR.

ronniec95 commented 3 years ago

@RazrFalcon I might be able to do this, but where is something like this useful? I'm genuinely curious.

RazrFalcon commented 3 years ago

@ronniec95 I'm using it in my Skia port: https://github.com/RazrFalcon/tiny-skia/tree/master/src/wide

Right now, I have only a scalar implementation. And I would like to replace the custom SIMD implementation with an existing one, like this crate.

gilescope commented 3 years ago

I had a little look at this but was puzzled that the unsigned types seem to use signed add instructions: add_i16_m128i. I can see there are saturating unsigned adds we could use: _mm_adds_epu16. Is it that add_i16_m128i works for both? If so, why do they have signed / unsigned in their naming?

Lokathor commented 3 years ago

For wrapping addition, which is what the integer types in wide use, signed and unsigned use the same bit manipulation, and thus the same hardware instruction.
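
A minimal scalar sketch of that point, using only the Rust standard library (it does not touch the wide crate or any intrinsics, and the helper name add_via_signed is made up for illustration): reinterpreting u16 values as i16, doing a wrapping add, and reinterpreting the result back gives the same bits as an unsigned wrapping add. Saturating addition is where the signed and unsigned variants actually differ, which is presumably why the instruction names carry a signedness suffix.

```rust
// Illustration only: plain scalar Rust, not the `wide` crate's internals.

/// Hypothetical helper: add two u16 values by routing them through
/// i16 wrapping addition and reinterpreting the result as u16.
fn add_via_signed(a: u16, b: u16) -> u16 {
    (a as i16).wrapping_add(b as i16) as u16
}

fn main() {
    let samples: [u16; 6] = [0, 1, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF];
    for &a in &samples {
        for &b in &samples {
            // Wrapping: the signed and unsigned results agree bit for bit.
            assert_eq!(a.wrapping_add(b), add_via_signed(a, b));
        }
    }

    // Saturating: the two interpretations diverge.
    let (a, b) = (0xFFFFu16, 1u16);
    // Unsigned view: 65535 + 1 clamps at u16::MAX.
    let unsigned_sat = a.saturating_add(b);
    // Signed view: the same bits mean -1, and -1 + 1 = 0 without clamping.
    let signed_sat = (a as i16).saturating_add(b as i16) as u16;
    assert_eq!(unsigned_sat, 0xFFFF);
    assert_eq!(signed_sat, 0);

    println!(
        "wrapping adds agree; saturating adds differ: {:#06x} vs {:#06x}",
        unsigned_sat, signed_sat
    );
}
```

The same reasoning applies lane by lane to a SIMD register, so a single wrapping 16-bit add instruction can serve both the signed and the unsigned vector types.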

gilescope commented 2 years ago

Ah, this should not have been closed, as it is not yet implemented.

Lokathor commented 2 years ago

oops!

mcroomp commented 1 month ago

#155 fixes this