Closed — edouardoyallon closed this issue 9 years ago
Thank you for investigating that. A minimal (non-)working example on a gist would be great. This non-deterministic behavior looks a lot like a multi-threading problem in shared-memory parallelism.
Fair enough, next time it crashes I'll post it here. I think a thorough understanding of https://github.com/torch/torch7/blob/master/doc/tensor.md is required.
Hey, you can check the unit test of my FFT... Actually, it's not even computing the correct FFT (the ifft and the original signal are not equal), and it crashes. Clearly the issue is a pointer going wrong somewhere.
Though I'm not sure I correctly understand the difference between an n-dimensional tensor and a 2D view of it, it seems I have to use the "guru" FFT interface > https://groups.google.com/forum/#!topic/comp.lang.fortran/YNzV4dSPHpM
In the doc: http://www.fftw.org/fftw3.pdf > 4.5.2 Guru FFT
Let's go for that! Btw, it handles multi-dimensional transforms, and there is an equivalent in the NVIDIA library: http://docs.nvidia.com/cuda/cufft/#axzz3fUy94g9b
On branch "FFT/CUDA_FFT", I tried to get the "GURU_FFT" working, yet it returns a NULL plan. Looking for some help, since the documentation is not clear. I wonder if MATLAB's wrapper of fft is available somewhere..
Bug is fixed via the guru FFT. Here is the intuition why: assume you have a 2x3x2 tensor and you reshape it into 2x6, so that its storage reads 1 1 2 2 3 3 4 4 5 5 6 6, where each pair n n belongs to row sequence n, 1<=n<=6. A 2x2x3 tensor, by contrast, would have been stored so that the sequence reads
1 2 3 1 2 3 4 5 6 4 5 6. As you can see, the two layouts have different strides, and fortunately the guru FFT can handle those different strides. The CUDA guru interface works exactly the same way. Oof, I'm merging the branch!
Well done!!
On 12 Jul 2015, at 12:21, Edouard Oyallon notifications@github.com wrote:
Hi,
The FFT sometimes crashes with a segfault; it'd be great to investigate. I'm pretty sure it is a pointer problem, since it sometimes crashes and sometimes doesn't, even in a simple for loop over 4x4 signals.