arduano / simdeez

easy simd
MIT License

more i16 and u16 operations #33

Open lovasoa opened 4 years ago

lovasoa commented 4 years ago

Hello, I was hoping I could use this crate for https://github.com/image-rs/jpeg-decoder/pull/146 , but it would require several functions that are not implemented:

jackmott commented 4 years ago

For the unsigned 16-bit values, do you need all the add/sub etc. functions for those too? There isn't any u16 support at the moment.

jackmott commented 4 years ago

I've added load_i16. For the cast, I need to do some research on how to do it efficiently in SSE2. PRs are welcome, by the way, if you want to add these yourself. I can help you get started if you don't know where to begin.
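For reference, one common way to widen 16-bit lanes to 32 bits using only SSE2 is to interleave the input with a second vector that supplies the upper halves: a sign mask for i16, zeros for u16. The sketch below uses the `std::arch` intrinsics directly. It is not simdeez code, and the function names `widen_i16x8` and `widen_u16x8` are hypothetical, chosen only for illustration. A scalar fallback is included so the snippet also compiles on targets other than x86_64.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: sign-extends eight i16 lanes into two groups of four i32 lanes.
#[cfg(target_arch = "x86_64")]
pub fn widen_i16x8(input: [i16; 8]) -> ([i32; 4], [i32; 4]) {
    let mut lo = [0i32; 4];
    let mut hi = [0i32; 4];
    // SAFETY: SSE2 is part of the x86_64 baseline, and the unaligned
    // load/store intrinsics touch exactly 16 bytes of each array.
    unsafe {
        let v = _mm_loadu_si128(input.as_ptr() as *const __m128i);
        // 0xFFFF in every lane where the input is negative, 0 elsewhere.
        let sign = _mm_cmpgt_epi16(_mm_setzero_si128(), v);
        // Interleaving value and sign mask yields sign-extended 32-bit lanes.
        _mm_storeu_si128(lo.as_mut_ptr() as *mut __m128i, _mm_unpacklo_epi16(v, sign));
        _mm_storeu_si128(hi.as_mut_ptr() as *mut __m128i, _mm_unpackhi_epi16(v, sign));
    }
    (lo, hi)
}

/// Hypothetical helper: zero-extends eight u16 lanes into two groups of four i32 lanes.
#[cfg(target_arch = "x86_64")]
pub fn widen_u16x8(input: [u16; 8]) -> ([i32; 4], [i32; 4]) {
    let mut lo = [0i32; 4];
    let mut hi = [0i32; 4];
    // SAFETY: same reasoning as in `widen_i16x8`.
    unsafe {
        let v = _mm_loadu_si128(input.as_ptr() as *const __m128i);
        let zero = _mm_setzero_si128();
        // Interleaving with zeros fills the upper 16 bits of each lane with 0.
        _mm_storeu_si128(lo.as_mut_ptr() as *mut __m128i, _mm_unpacklo_epi16(v, zero));
        _mm_storeu_si128(hi.as_mut_ptr() as *mut __m128i, _mm_unpackhi_epi16(v, zero));
    }
    (lo, hi)
}

/// Scalar fallback so the sketch also builds on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn widen_i16x8(input: [i16; 8]) -> ([i32; 4], [i32; 4]) {
    let (mut lo, mut hi) = ([0i32; 4], [0i32; 4]);
    for i in 0..4 {
        lo[i] = i32::from(input[i]);
        hi[i] = i32::from(input[i + 4]);
    }
    (lo, hi)
}

/// Scalar fallback so the sketch also builds on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn widen_u16x8(input: [u16; 8]) -> ([i32; 4], [i32; 4]) {
    let (mut lo, mut hi) = ([0i32; 4], [0i32; 4]);
    for i in 0..4 {
        lo[i] = i32::from(input[i]);
        hi[i] = i32::from(input[i + 4]);
    }
    (lo, hi)
}
```

On SSE4.1 and later, the same widening is available as single instructions (`_mm_cvtepi16_epi32` and `_mm_cvtepu16_epi32`), so the interleave trick is only needed for the SSE2 path.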

lovasoa commented 4 years ago

Thanks for implementing load_i16! The code I'm working on (JPEG IDCT) just loads i16 and u16 values, casts them to i32, and then performs arithmetic, so it doesn't need any operations on i16 and u16...
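A minimal scalar sketch of that load, widen, then compute pattern, assuming a row of eight coefficients and a matching row of the quantization table. The function name `dequantize_row` is hypothetical and is not taken from jpeg-decoder or simdeez.

```rust
/// Hypothetical helper: widens i16 coefficients and u16 quantization values
/// to i32 before multiplying, so no 16-bit arithmetic is needed.
pub fn dequantize_row(coeffs: &[i16; 8], quant: &[u16; 8]) -> [i32; 8] {
    let mut out = [0i32; 8];
    for i in 0..8 {
        // i16 -> i32 sign-extends, u16 -> i32 zero-extends; both are lossless.
        out[i] = i32::from(coeffs[i]) * i32::from(quant[i]);
    }
    out
}
```

The u16 values have to be zero-extended rather than reinterpreted as i16, because a table entry above 32767 would otherwise turn negative. That is why a separate u16 load and cast is needed alongside the i16 one.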

Yes, I realized there aren't any u16 functions at the moment after I created the issue. It's too bad, because the JPEG quantization table is made of u16s...

I may try to open a PR if I find the time (and the motivation 😬 ).