Closed crisluengo closed 4 months ago
I totally agree that we should guard against overflow here, but I wonder whether we can't do this by simply rejecting FFT lengths larger than, say, 2**19
. I'd argue that for carrying out an FFT of length N
, at least 8N
bytes of memory are required: 4N
for the input array (assuming 32bit floats), 4N
for the internal copy, and then we also have twiddle factors etc. On a 32bit system a single process can see 4GB of memory at most (or is this a wrong assumption these days?), so any FFT longer than 2**19 will definitely crash by running out of memory.
Am I missing something?
That is a really good point.
I guess you meant 2**29
, which is 2**32 / 8
.
The algorithm should never overflow, if I’m not mistaken, for numbers up to 2**32 / 11 / 2 - 1
(but I need to examine the code again to be sure, I’m doing this from memory right now). The real version doesn’t multiply by 11, so should have a larger limit.
Sorry, must have slipped on the keyboard when typing the 2**19
:-)
How about a pragmatic solution: we turn the good_size
functions into templates that work for uint32_t
and uint64_t
(should cover size_t
for 32bit and 64bit systems). Whenever the 32bit version notices that it can't cope with the input, it will let the 64bit version do the job, and when the returned result is too large for a 32bit machine, an exception is thrown.
I think that’s the ticket.
I assume 64-bit math is more expensive than 32-bit math on a 32-bit architecture, so it’s worth while using the 32-bit version of the code when possible. This is a really good solution!
fixed via #17
Now, this is unlikely to be a problem on 64-bit architectures, because nobody is going to compute an FFT anywhere near the limit for those lengths. But I've run into this on a 32-bit machine.
If the input
n
is larger thanSIZE_MAX / 2
, thenbestfac=2*n
will overflow and the output will be a very small number.If the input
n
is smaller thanSIZE_MAX / 2
, but sufficiently large thatf11*=11
can overflow, then the loopfor (size_t f11=1; f11<bestfac; f11*=11)
will never end. Likewise for the other 4 loops in that function. The other functions, including the newprev_good_size_cmplx()
andprev_good_size_real()
have a similar issue.I've been struggling to write a correct implementation. I still see a different output on 32-bit systems and 64-bit systems for an input such as
n = 2109375001
, which indicates there's still an overflow issue in the 32-bit case. You can watch my progress here: https://github.com/DIPlib/diplib/blob/master/src/transform/dft_optimal_size.cpp Once I figure out how to do this correctly, I will submit a PR to fix the code here too.