gnuradio / volk

The Vector Optimized Library of Kernels
http://libvolk.org
GNU Lesser General Public License v3.0

Add 4ic deinterleave to 8i x2 #398

Open dkozel opened 4 years ago

dkozel commented 4 years ago

I think this can be done pretty efficiently if bytewise arithmetic shift operators exist. Otherwise it's probably just loop unrolling? The volk_8ic_deinterleave_16i_x2 kernels are much more complicated than I expected, though, so I'm probably not aware of many of the nuances of the available SIMD operations.

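uint8_t input[size];  // packed input: two 4-bit values per byte
uint8_t out_1[size];  // receives the low nibble of each input byte
uint8_t out_2[size];  // receives the high nibble of each input byte

for (int i = 0; i < size; i++) {
    out_1[i] = input[i] << 4;  // move the low nibble up; the uint8_t store drops the old high nibble
    out_1[i] = out_1[i] >> 4;  // shift back down, leaving the low nibble in bits 0-3
    out_2[i] = input[i] >> 4;  // move the high nibble into bits 0-3
}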
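
For reference, a minimal sketch of how the loop above might look once the nibbles are treated as signed two's-complement values. The function names are hypothetical and not part of VOLK, and the nibble order (low nibble to the first output, high nibble to the second) simply follows the loop above. It uses an XOR-and-subtract sign extension instead of an arithmetic shift, because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a VOLK function): sign-extend the low 4 bits of v.
 * (x ^ 8) - 8 maps 0..7 to 0..7 and 8..15 to -8..-1. */
static inline int8_t sign_extend_4bit(uint8_t v)
{
    return (int8_t)(((v & 0x0F) ^ 0x08) - 0x08);
}

/* Hypothetical generic kernel: each input byte holds two signed 4-bit values.
 * Assumption: low nibble goes to out_lo, high nibble goes to out_hi. */
static void deinterleave_4ic_generic(const uint8_t* in,
                                     int8_t* out_lo,
                                     int8_t* out_hi,
                                     size_t num_points)
{
    for (size_t i = 0; i < num_points; i++) {
        out_lo[i] = sign_extend_4bit(in[i]);
        out_hi[i] = sign_extend_4bit((uint8_t)(in[i] >> 4));
    }
}
```

With this mapping, the packed byte 0x8F would yield -1 from the low nibble and -8 from the high nibble.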

jdemel commented 4 years ago

So let's see, you propose a new kernel volk_4ic_deinterleave_8i_x2? Do you have a use case? I have an idea of how to use such low-resolution values, but I'd suggest a lookup table (LUT) instead of shifts. Are you willing to implement a first kernel?
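
To illustrate the LUT suggestion, here is one possible sketch, not taken from VOLK: two 256-entry tables indexed by the packed byte, so each output sample costs a single table read. All names are hypothetical, and the samples are assumed to be signed two's-complement nibbles with the low nibble feeding the first output.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tables (not part of VOLK), indexed by the packed input byte. */
static int8_t lut_lo[256];
static int8_t lut_hi[256];

/* Fill both tables once before the first call to deinterleave_4ic_lut. */
static void init_4ic_luts(void)
{
    for (int b = 0; b < 256; b++) {
        int lo = b & 0x0F;
        int hi = (b >> 4) & 0x0F;
        /* Nibble values 8..15 represent -8..-1 in two's complement. */
        lut_lo[b] = (int8_t)(lo >= 8 ? lo - 16 : lo);
        lut_hi[b] = (int8_t)(hi >= 8 ? hi - 16 : hi);
    }
}

/* Hypothetical LUT-based kernel: one table read per output sample. */
static void deinterleave_4ic_lut(const uint8_t* in,
                                 int8_t* out_lo,
                                 int8_t* out_hi,
                                 size_t num_points)
{
    for (size_t i = 0; i < num_points; i++) {
        out_lo[i] = lut_lo[in[i]];
        out_hi[i] = lut_hi[in[i]];
    }
}
```

The two tables take 512 bytes in total. Whether this beats shifts and masks depends on the target, so it would need benchmarking.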

dkozel commented 4 years ago

@jdemel Yes, I'm writing blocks for a radio astronomy acquisition board that streams packed signed 4-bit IQ data. And yes, I'll put up a PR shortly with nearly complete generic and SSE2 kernels, though I have some uncertainty about dispatchers and input datatypes, as there isn't a native 4-bit type in C++.
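
As a rough illustration of what an SSE2 variant could look like (this is not the kernel from the PR mentioned above, and every name here is hypothetical): SSE2 has no 8-bit shift instructions, so the sketch shifts 16-bit lanes and masks off the bits that leak in from the neighboring byte. The packed input is passed as plain `uint8_t`, which is one way to work around the missing 4-bit type. The nibble order and two's-complement encoding are assumptions, and the `__SSE2__` guard falls back to the scalar loop on other targets.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical kernel (not part of VOLK): low nibble -> out_lo, high nibble -> out_hi. */
static void deinterleave_4ic_sse2(const uint8_t* in,
                                  int8_t* out_lo,
                                  int8_t* out_hi,
                                  size_t num_points)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i nibble_mask = _mm_set1_epi8(0x0F);
    const __m128i sign_bit = _mm_set1_epi8(0x08);
    for (; i + 16 <= num_points; i += 16) {
        __m128i packed = _mm_loadu_si128((const __m128i*)(in + i));
        __m128i lo = _mm_and_si128(packed, nibble_mask);
        /* 16-bit logical shift; the mask removes bits from the adjacent byte. */
        __m128i hi = _mm_and_si128(_mm_srli_epi16(packed, 4), nibble_mask);
        /* Sign-extend each nibble per byte: (x ^ 8) - 8. */
        lo = _mm_sub_epi8(_mm_xor_si128(lo, sign_bit), sign_bit);
        hi = _mm_sub_epi8(_mm_xor_si128(hi, sign_bit), sign_bit);
        _mm_storeu_si128((__m128i*)(out_lo + i), lo);
        _mm_storeu_si128((__m128i*)(out_hi + i), hi);
    }
#endif
    /* Scalar tail, and the whole loop when SSE2 is unavailable. */
    for (; i < num_points; i++) {
        out_lo[i] = (int8_t)(((in[i] & 0x0F) ^ 0x08) - 0x08);
        out_hi[i] = (int8_t)((((in[i] >> 4) & 0x0F) ^ 0x08) - 0x08);
    }
}
```

Unaligned loads and stores keep the sketch simple. An aligned variant would be a separate function with stricter buffer requirements.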