gpu-fftw / gpu_fftw

Run FFTW3 programs with Raspberry Pi GPU - fast ffts!

Override FFT3W... **** FAILED **** #1

Open colin4124 opened 7 years ago

colin4124 commented 7 years ago

I ran it on an RPi 3 and got the output below. How can I solve this?

gpu_fftw - Version 0.1.3-2-g3380

GPU FFT forward/reverse error = 0.147825ppm (nrms error)
GPU_FFTW/FFTW difference = 0ppm (nrms error)
GPU FFTW 1.56045 times faster (21267.1 ffts/sec, 47.021 usec/fft, fftw3: 13628.8 ffts/sec)

Override FFT3W... **** FAILED ****
Test suite ***** FAILED

sergey-suloev commented 7 years ago

gpu_fftw - Version 0.1.3-2-g3380

GPU FFT forward/reverse error = 0.140472ppm (nrms error)
GPU_FFTW/FFTW difference = 0ppm (nrms error)
GPU FFTW 0.972877 times faster (31611.6 ffts/sec, 31.634 usec/fft, fftw3: 32492.9 ffts/sec)

Override FFT3W... FAILED
Test suite ***** FAILED

gtoal commented 7 years ago

Same problem. I tried substituting armv7 for armv6 (and armv7l for armv6l) in the makefile but that didn't help (wouldn't compile - 'target CPU does not support ARM mode')

It looks to me as though the code, when compiled for armv6 and run on an armv7, falls back to fftw3 mode, which would explain why the runtimes are about the same for both calls.
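As a quick check of that reading, the "times faster" figures in the two reports above can be recomputed from the ffts/sec numbers. This is only a minimal Python sketch: the function names are illustrative and are not part of gpu_fftw, and the input values are copied from the posted output.

```python
def speedup(gpu_ffts_per_sec, fftw_ffts_per_sec):
    """Ratio of GPU throughput to plain FFTW3 throughput."""
    return gpu_ffts_per_sec / fftw_ffts_per_sec


def usec_per_fft(ffts_per_sec):
    """Average time per transform in microseconds."""
    return 1e6 / ffts_per_sec


# (GPU ffts/sec, fftw3 ffts/sec) as posted in the two reports above.
reports = {
    "colin4124": (21267.1, 13628.8),
    "sergey-suloev": (31611.6, 32492.9),
}

for user, (gpu, cpu) in reports.items():
    print(f"{user}: {speedup(gpu, cpu):.3f}x, {usec_per_fft(gpu):.3f} usec/fft")
```

A ratio close to 1.0, as in the second report, would be consistent with both calls going through the same FFTW3 code path. The first report's ratio of about 1.56 does not fit that pattern as neatly, so the fallback explanation may not cover every case.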

gcc -march=armv7 -mfloat-abi=hard -mfpu=vfp -O3 -ffast-math -pipe -mtune=arm1176jzf-s -fstack-protector --param=ssp-buffer-size=4 -std=c99 -Wall -g -fvisibility=hidden -shared -fpic -Wl,-soname,libgpufftw.so -o libgpufftw.so.1 gpu_fftw.c gpu_fftw_util.c hello_fft/mailbox.c hello_fft/gpu_fft.c hello_fft/gpu_fft_base.c hello_fft/gpu_fft_twiddles.c hello_fft/gpu_fft_shaders.c -ldl -lfftw3
gcc -march=armv7 -mfloat-abi=hard -mfpu=vfp -O3 -ffast-math -pipe -mtune=arm1176jzf-s -fstack-protector --param=ssp-buffer-size=4 -std=c99 -Wall -g -fvisibility=hidden -shared -fpic -Wl,-soname,libgpufftwf.so -o libgpufftwf.so.1 gpu_fftwf.c gpu_fftw_util.c hello_fft/mailbox.c hello_fft/gpu_fft.c hello_fft/gpu_fft_base.c hello_fft/gpu_fft_twiddles.c hello_fft/gpu_fft_shaders.c -ldl -lfftw3f
VSN="git describe --always --tags --abbrev=1 | sed 's/^v//'"; \
echo "#define GPU_FFTW_VSN \"$VSN\"" > vsn.h
gpu_fftwf.c:1:0: error: target CPU does not support ARM mode

CharlieAt commented 6 years ago

There appear to be two problems.

sergey-suloev commented 6 years ago

I am not interested in this anymore; I am using classical fftw. Can we close this?

Dobbley2 commented 6 years ago

Hi CharlieAt, I'm trying to get this working on a Pi Zero. What changes did you make?

CharlieAt commented 6 years ago

Hi Dobbley2,

Copy the current hello_fft (/opt/vc/src/hello_pi/hello_fft/) over the hello_fft in the project. Then, as per the two patches, adapt the template file to account for the changed complex structure, and make the LD_PRELOAD paths absolute (the included patch assumes the .so files are in the current working directory).

Attachments: 0001-add-abs-path-for-preload.patch.txt, 0001-fix-for-updated-hello_fft.patch.txt
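The absolute-path point can be illustrated independently of the patch. The following is a minimal Python sketch and not the content of the attached patch: the helper name `preload_env` is hypothetical, and the library file names are an assumption taken from the `-o` arguments in the build log earlier in this thread.

```python
import os


def preload_env(lib_dir,
                libs=("libgpufftw.so.1", "libgpufftwf.so.1"),
                base_env=None):
    """Return a copy of the environment with LD_PRELOAD set to absolute paths.

    lib_dir is the directory holding the .so files (hypothetical layout);
    libs are the assumed library file names from the build log.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Resolve each library against lib_dir so the result no longer depends
    # on the working directory of the process that is launched later.
    paths = [os.path.abspath(os.path.join(lib_dir, name)) for name in libs]
    # The dynamic linker accepts a colon-separated list in LD_PRELOAD.
    env["LD_PRELOAD"] = ":".join(paths)
    return env


env = preload_env(".")
print(env["LD_PRELOAD"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the FFTW3 program. Whether both the double- and single-precision libraries should be preloaded together depends on the program, so adjust `libs` as needed.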

Dan-escu commented 5 years ago

Is there a solution to this for the Pi 3B? I'm attempting to run fftw in real time with minimal delay, and this may be the key! Thanks