Closed d-meiser closed 11 years ago
Hello Dominic,
The FFT functionality was contributed by @neapel, so I'll address this question to him.
The interface supports multidimensional transforms, so I think adding batch transforms should be relatively easy. But I don't have much experience with FFT, so I am not sure if this is true.
Cheers, Denis.
It looks like multidimensional FFT does row-wise 1D FFTs with consequent transpose. I guess batch FFT should do the same thing with transposition omitted. Should be easy to add the functionality.
Currently you can only specify one transformation direction to be used for all dimensions, I'll add a way to use a list of directions instead, so that FFT({128, 32}, {forward, none})
calculates 32 transformations of length 128, for each row, while FFT({128, 32}, forward)
is the same as FFT({128, 32}, {forward, forward})
i.e. a 2D forward FFT.
That's fantastic. Let me know if I can help. Somewhat related: Can FFTs currently be applied to vex::multivector objects? That might be another mechanism for achieving batching.
I've added the none
kind in 34f565b0edd32514392fced3ce053aa7a69b9b28, how data in the batch is ordered is determined by the position of the none
, i.e. in FFT({b,n,n}, {none, forward, forward})
uses n * n * batch_index + elem_index
, with FFT({n,n,b},{forward,forward,none})
data is interleaved, i.e. batch_index + b * elem_index
. With interleaved data, processing one batch_index per thread should be optimal, but it's currently implemented the other way around, and using interleaved data will transpose the array twice. I'll fix this next month.
The FFT doesn't support multivector at the moment, I was planning to support it as a way to split real/imag components instead of interleaving them using cl_double2
, but not for batching, I don't think that's very useful since the number of components is determined at compile-time.
I plan to implement a boost::multi_array-like type though, which should be useful for batching because it's easy to create views of sub-arrays, i.e. with ndvector<3> batch({b, n, n}); fft(batch);
using batch[0]
would return a ndvector<2>
containing the result of the first 2D-FFT, batch[1]
contains the next etc, no need to manage the offset manually.
Looks great. I'll give it a try. Makes sense about the vexcl::multi_vector.
Hi Pascal,
That does look great! I've cherry-picked 0f3074638b and 34f565b0ed (I don't want to pick summed area tables algorithm yet, as its currently nvidia-specific).
re: multi_array
I've tried to come up with an interface for the multi_array type in #31. What do you think of it? I don't currently have time to actually implement this, but contributions are welcome :).
@neapel, is vex::slicer functionality enough for your fft needs?
Looks good, I'll try to look at it (and multivectors) for the FFT next week.
Hi,
In many applications one has to calculate FFTs in batches. A simple example would be if there are many data sets that have to be processed or if only certain dimensions in a multi-dimensional array are to be transformed. Especially for small FFT sizes it can be beneficial to execute the batched FFTs in one call.
As far as I can tell this capability is not present in vexcl/fft at the moment or at least it is not exposed in the public API. Any recommendations as to how to add this capability? I could contribute.
Cheers, Dominic