Open krosenfeld opened 9 years ago
Can we re-order the B-frame data before dequantizing? That would reduce the number of bits being shuffled around.
I believe we can; the reordering boundaries are conveniently located on (four?)-byte boundaries.
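A minimal sketch of why this should work, assuming 2-bit samples packed 16 to a 32-bit word and a reordering that only moves whole words. The permutation, the LSB-first bit order, and the level mapping (`code - 1.5`) are placeholders for illustration, not the actual B-frame layout:

```python
import numpy as np

SAMPLES_PER_WORD = 16  # 32 bits / 2 bits per sample


def unpack_words(words):
    """Unpack each uint32 into 16 floats (LSB-first order is an assumption)."""
    words = np.asarray(words, dtype=np.uint32)
    shifts = np.arange(SAMPLES_PER_WORD, dtype=np.uint32) * 2
    codes = (words[:, None] >> shifts[None, :]) & np.uint32(3)
    # Hypothetical dequantization: codes 0..3 -> -1.5, -0.5, 0.5, 1.5
    return codes.astype(np.float32) - 1.5


rng = np.random.default_rng(0)
words = rng.integers(0, 2**32, size=8, dtype=np.uint64).astype(np.uint32)
perm = rng.permutation(words.size)  # placeholder for the real reordering

# Moving 8 packed words ...
reorder_then_unpack = unpack_words(words[perm])
# ... gives the same result as moving 8 * 16 unpacked floats.
unpack_then_reorder = unpack_words(words)[perm]

print(np.array_equal(reorder_then_unpack, unpack_then_reorder))
```

If the reordering boundaries do fall on word boundaries, the two orders are equivalent, and reordering first moves 4 bytes per word instead of 64 bytes of floats.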
@ruriktherus pointed out that cuFFT callbacks would be useful here: http://devblogs.nvidia.com/parallelforall/cuda-pro-tip-use-cufft-callbacks-custom-data-processing/
This does not appear to be (easily) supported by scikits.cuda or pycuda, but the examples from the CUDA toolkit and parallelforall repo do run on hamster.
Do the callbacks work element by element? We may be working on 32 bits (16 samples) at a time and unpacking them into 16 floats.
I believe that callbacks are called on a per-element basis.
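If callbacks are indeed per-element, each call would have to pull a single 2-bit sample out of its 32-bit word rather than unpack all 16 at once. A rough Python sketch of that indexing follows; `load_sample` is a hypothetical stand-in for a load callback, and the LSB-first bit order and `code - 1.5` level mapping are assumptions:

```python
import numpy as np

SAMPLES_PER_WORD = 16
BITS_PER_SAMPLE = 2


def load_sample(packed, offset):
    """Mimic a per-element load callback: return sample `offset` as a float."""
    word = int(packed[offset // SAMPLES_PER_WORD])
    shift = BITS_PER_SAMPLE * (offset % SAMPLES_PER_WORD)
    code = (word >> shift) & 0b11
    return float(code) - 1.5  # hypothetical level mapping


def unpack_all(packed):
    """Reference: unpack every word into 16 floats in one pass."""
    shifts = np.arange(SAMPLES_PER_WORD, dtype=np.uint32) * BITS_PER_SAMPLE
    codes = (packed[:, None] >> shifts) & np.uint32(3)
    return (codes.astype(np.float32) - 1.5).ravel()


packed = np.array([0xE4E4E4E4, 0x0000001B], dtype=np.uint32)
n_samples = packed.size * SAMPLES_PER_WORD
per_element = np.array(
    [load_sample(packed, i) for i in range(n_samples)], dtype=np.float32
)

print(np.array_equal(per_element, unpack_all(packed)))
```

The per-element version reads the same word 16 times, so whether it beats a separate unpacking kernel would need to be measured.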
Transform SWARM rate frequency domain data to R2DBE rate time domain data.