vincefn / pyvkfft

Python interface to VkFFT
MIT License

FFT of transposed array throws exception #31

Status: Closed (jimmyjamison closed this issue 9 months ago)

jimmyjamison commented 1 year ago

I found that I get a floating point exception when computing a 1D FFT on a transposed multidimensional array. This can be useful since it's faster to compute 1D FFTs along axis=-1.

This code reproduces the issue on my machine:

import cupy as cp
import pyvkfft.fft
import pyvkfft.version

print(f"version: {pyvkfft.version.vkfft_version()}")

x = cp.ones((1,100,1400), dtype=cp.complex64)
print(f"x shape: {x.shape}")

# transpose and copy data to new array, works fine
x_t1 = x.transpose((0,2,1)).copy()
print(f"this should work (shape: {x_t1.shape})")
pyvkfft.fft.fftn(x_t1, ndim=1, axes=-1)

# transpose w/out a copy, throws exception
x_t2 = x.transpose((0,2,1))
print(f"this should throw an exception (shape: {x_t2.shape})")
pyvkfft.fft.fftn(x_t2, ndim=1, axes=-1)

When I run this script, I get:

(venv3) [leolabs@hqsr-worker-r1-1 jimmy]$ python fft_transpose_issue.py 
version: 1.3.0
x shape: (1, 100, 1400)
this should work (shape: (1, 1400, 100))
caching VkFFTApp with:  None
this should throw an exception (shape: (1, 1400, 100))
caching VkFFTApp with:  None
Floating point exception (core dumped)

I'm using pyvkfft version 1.3.0 with CUDA version 12.2.

Thanks for the help!

vincefn commented 1 year ago

Thanks for the report.

> I found that I get a floating point exception when computing a 1D FFT on a transposed multidimensional array. This can be useful since it's faster to compute 1D FFTs along axis=-1.

Note: if you transpose without copy(), it makes no difference in speed that the transform is along the last axis. What can influence the speed are the strides of the transform, and transposing does not change them in memory (the fast axis is unchanged).
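To illustrate the point about strides, here is a minimal sketch that uses NumPy as a CPU stand-in for CuPy (an assumption made so it runs without a GPU; CuPy arrays expose the same `strides` attribute). It compares the byte step along the last axis for a transposed view and for a transposed copy of an array shaped like the one in the report.

```python
import numpy as np

# Same shape and dtype as in the report; complex64 items are 8 bytes each.
x = np.ones((1, 100, 1400), dtype=np.complex64)

# Transposed view: only the shape/strides metadata is permuted, no data moves.
x_view = x.transpose((0, 2, 1))

# Transposed copy: the data is rewritten into a new C-contiguous buffer.
x_copy = x.transpose((0, 2, 1)).copy()

# Byte step between consecutive elements along the last axis.
view_last_stride = x_view.strides[-1]
copy_last_stride = x_copy.strides[-1]

print("view shares memory with x:", np.shares_memory(x, x_view))
print("last-axis stride of view (bytes):", view_last_stride)
print("last-axis stride of copy (bytes):", copy_last_stride)
```

For the view, elements along the last axis are still 1400 items apart in memory, so the transform walks the same strided data as a transform along axis 1 of the original array. Only the copy makes the last axis the fast axis.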

But still it should work. I just tested the same code using OpenCL and it passed.
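As a reference for what "should work" means here, the following sketch checks on the CPU that a 1D FFT along the last axis gives the same result for the transposed view and for its contiguous copy. It uses `numpy.fft.fft` rather than pyvkfft, which is an assumption made purely so the check can run without a GPU; the random test data and variable names are illustrative only.

```python
import numpy as np

# Random complex data with the shape used in the report.
rng = np.random.default_rng(0)
shape = (1, 100, 1400)
x = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)).astype(np.complex64)

x_view = x.transpose((0, 2, 1))   # strided view, no copy
x_copy = x_view.copy()            # contiguous copy of the same values

# 1D FFT along the last axis for both layouts.
fft_view = np.fft.fft(x_view, axis=-1)
fft_copy = np.fft.fft(x_copy, axis=-1)

# The memory layout must not affect the numerical result.
same = np.allclose(fft_view, fft_copy)
print("view and copy agree:", same)
```

A GPU result from either layout could be compared against such a reference in the same way.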

Can you update to the latest release? Your output above says VkFFT 1.3.0, but the release version (for pyvkfft 2023.2) includes VkFFT 1.3.1. There were a number of fixes before the release.

vincefn commented 1 year ago

I just tried your script on an x86 machine with CUDA and CuPy, and it passed without issue. So I guess this was due to VkFFT 1.3.0.

There's always a long release process to validate (py)vkfft before an official release (see #25), so it's much safer to stick with these releases.

jimmyjamison commented 1 year ago

> Note: if you transpose without copy(), it makes no difference in speed that the transform is along the last axis. What can influence the speed are the strides of the transform, and transposing does not change them in memory (the fast axis is unchanged).

Ah duh, that makes sense. Thanks for explaining that!

I'll see if updating to the latest release fixes this issue.