Open TheWhiteAmbit opened 3 years ago
This does not occur when using cufftType.C2C. Maybe I can just format the input data differently as a workaround, but this still seems to be a problem for cufftType.R2C.
Can you provide a minimal example showing the problem? Is it a 1D, 2D or 3D transform? Are the input data padded?
Because you mention DirectX, I assume you are using 2D transforms. Are you sure the array sizes are correct? "Output arrays twice the size of input arrays" is not correct: for 2D R2C transforms, the output array must be of size (width / 2 + 1) x height and of datatype cuFloatComplex or float2, i.e. each element is twice the size of a float.
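To make the size rule above concrete, here is a minimal sketch in plain C# that only does the arithmetic. The names `R2COutputSize`, `ComplexWidth` and `OutputBytes` and the 512 x 256 dimensions are made up for illustration and are not part of ManagedCuda or cuFFT. It assumes tightly packed rows; with pitched allocations the row pitch has to be taken into account separately.

```csharp
using System;

static class R2COutputSize
{
    // An R2C transform of `width` real samples yields width / 2 + 1 complex values;
    // the remaining values are redundant because of Hermitian symmetry.
    public static int ComplexWidth(int width) => width / 2 + 1;

    // Bytes needed for the complex output of a width x height real input,
    // assuming tightly packed rows (no pitch padding).
    public static long OutputBytes(int width, int height)
    {
        const int bytesPerComplex = 2 * sizeof(float); // float2 / cuFloatComplex
        return (long)ComplexWidth(width) * height * bytesPerComplex;
    }

    static void Main()
    {
        int width = 512, height = 256; // hypothetical texture size
        long inputBytes = (long)width * height * sizeof(float);

        Console.WriteLine($"complex elements per row: {ComplexWidth(width)}");       // 257
        Console.WriteLine($"input bytes:  {inputBytes}");                            // 524288
        Console.WriteLine($"output bytes: {OutputBytes(width, height)}");            // 526336
    }
}
```

So the output needs slightly more bytes than the input, but fewer complex elements per row than there are real samples. A buffer "twice the size" in elements is therefore larger than required, while the row length that matters for the layout is width / 2 + 1.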
First, thank you for the great work! I made a workaround with C2C and don't have the original code anymore. I use a 1D transform with plan many and stride on a Texture2D and it works like a charm now :) So this is my redone sample. It should reproduce the problem :) I hope I did not miss any edits:
```csharp
void CudaFFTPlanManyOnMappedResource(Texture1D inputTexture, Texture2D outputTexture, uint startIndexOffset = 0)
{
    try
    {
        if (cudaContext == null)
            cudaContext = new CudaContext();
        cudaContext.SetCurrent();

        // Alternative repro with plain device variables instead of textures:
        //int elementCount = inputTexture.Description.Width * inputTexture.Description.Height;
        //float[] floatArrayInput = new float[elementCount];
        //for (int i = 0; i < elementCount; i++) {
        //    floatArrayInput[i] = rand.Next(0, 65535);
        //}
        //float[] floatArrayOutput = new float[elementCount * 2];
        //CudaDeviceVariable<float> cudaDeviceInput = new CudaDeviceVariable<float>(elementCount);
        //cudaDeviceInput.CopyToDevice(floatArrayInput);
        //CudaDeviceVariable<float> cudaDeviceOutput = new CudaDeviceVariable<float>(elementCount * 2);

        // Pitched device buffers: real input, complex (float2) output.
        CudaPitchedDeviceVariable<float> cudaPitchedDeviceInput = new CudaPitchedDeviceVariable<float>(inputTexture.Description.Width, inputTexture.Description.Height);
        CudaPitchedDeviceVariable<ManagedCuda.VectorTypes.float2> cudaPitchedDeviceOutput = new CudaPitchedDeviceVariable<ManagedCuda.VectorTypes.float2>(outputTexture.Description.Width, outputTexture.Description.Height);

        // Copy the input texture into the pitched input buffer.
        using (CudaDirectXInteropResource resourceInput = new CudaDirectXInteropResource(inputTexture.NativePointer, CUGraphicsRegisterFlags.None, CudaContext.DirectXVersion.D3D11, CUGraphicsMapResourceFlags.None))
        {
            resourceInput.Map();
            using (var dataInput = resourceInput.GetMappedArray2D(startIndexOffset, 0))
            {
                dataInput.CopyFromThisToDevice(cudaPitchedDeviceInput);
            }
            resourceInput.UnMap();
        }

        // Create the batched 1D R2C plan once and reuse it.
        if (cudaFftPlanMany == null)
        {
            var cudaFftPlanManyWidth = inputTexture.Description.Width;
            var cudaFftPlanSizeHeight = inputTexture.Description.Height;
            // Duplicate declarations (would not compile), apparently left over from editing:
            //var cudaFftPlanManyWidth = outputTexture.Description.Width;
            //var cudaFftPlanSizeHeight = outputTexture.Description.Height;
            int[] inembed = { 0 };
            int istride = 1;
            int idist = cudaPitchedDeviceInput.Pitch / cudaPitchedDeviceInput.TypeSize;
            int[] onembed = { 0 };
            int ostride = 1;
            int odist = cudaPitchedDeviceOutput.Pitch / cudaPitchedDeviceOutput.TypeSize;
            cudaFftPlanMany = new CudaFFTPlanMany(1, new int[] { cudaFftPlanSizeHeight }, cudaFftPlanManyWidth, cufftType.R2C, inembed, istride, idist, onembed, ostride, odist);
        }

        // Out-of-place forward transform: separate input and output buffers.
        cudaFftPlanMany.Exec(cudaPitchedDeviceInput.DevicePointer, cudaPitchedDeviceOutput.DevicePointer, TransformDirection.Forward);
        cudaContext.Synchronize();

        // Copy the result into the output texture.
        using (CudaDirectXInteropResource resourceOutput = new CudaDirectXInteropResource(outputTexture.NativePointer, CUGraphicsRegisterFlags.None, CudaContext.DirectXVersion.D3D11, CUGraphicsMapResourceFlags.None))
        {
            resourceOutput.Map();
            using (var dataOutput = resourceOutput.GetMappedArray2D(0, 0))
            {
                dataOutput.CopyFromDeviceToThis(cudaPitchedDeviceOutput);
            }
            resourceOutput.UnMap();
        }

        //cudaDeviceOutput.CopyToHost(floatArrayOutput);
        cudaPitchedDeviceInput.Dispose();
        cudaPitchedDeviceOutput.Dispose();
        //cudaDeviceInput.Dispose();
        //cudaDeviceOutput.Dispose();
    }
    catch (ManagedCuda.CudaException)
    {
        // Note: exceptions are silently swallowed here, so a failing CUDA call goes unnoticed.
    }
}
```
It works with neither the Texture2D nor the commented-out CudaDeviceVariable.
When calling my CUDA plan with only one parameter, I find the transformed array at the original position (in place). But whenever I call one of the methods with separate input and output parameters, the resulting array is always filled with just zeros. I have this problem with CudaDeviceVariable as well as with CudaPitchedDeviceVariable (mapped from a texture as CudaDirectXInteropResource). The array size should be correct: I am using cufftType.R2C with output arrays twice the size of the input arrays.