ejmahler / RustFFT

RustFFT is a high-performance FFT library written in pure Rust.
Apache License 2.0
678 stars 47 forks source link

AVX Performance Tips: choosing size N #49

Closed nilgoyette closed 3 years ago

nilgoyette commented 3 years ago

You wrote a detailled text in your documentation explaining that choosing the right size will make rustfft much faster. I'm interested (!) but I don't understand how I'm supposed to choose the size.

I'm calculating the 2D FFT of thousands of image slices of variable sizes. One image can be 112x112, another 97x128, etc. But lets keep it simple. I had the results I wanted when planing and allocating for 112. Good. Now I read your doc and try to allocate for 128 (nearest 2^N * 3^M).

// Using ndarray, `data` is the 2D image
// 2D FFT was working perfectly without the size optim
for &d in &[0, 1] {
    let real_size = data.shape()[d];
    for mut data_1d in data.axis_iter_mut(Axis(if d == 0 { 1 } else { 0 })) {
        buffer.slice_mut(s![..real_size]).assign(&data_1d);
        buffer.slice_mut(s![real_size..]).fill(Complex::default());
        fft.process_with_scratch(buffer.as_slice_mut().unwrap(), &mut scratch);
        data_1d.assign(&buffer.slice(s![..real_size]));
    }
}

but filling the buffer with 0.0 (in fact, Complex::default()) does change the results. So how can I choose the buffer size if it changes the results? Disclaimer: I'm not a mathematician :)

ejmahler commented 3 years ago

Yeah, unfortunately changing the size does change the results -- which is why that section recommends starting with the size that's most convenient, and only switching if that doesn't work for you.

Especially if all your sizes are below 1000, I don't think you will literally ever notice the performance hit form a "slow" FFT size.

As for how to handle the changed outputs, it heavily depends on your application. I can't say much more, because I don't know too much more.

nilgoyette commented 3 years ago

Oh, ok. There's no magic trick :)

You may want to edit your documentation then. It gives the impression that there's no problem changing the size of a fft, that it will simply be faster. Yes, you "recommend" starting with the exact size, but then you recommend trying something else if it's too slow. Of course I tried it ^^ Again, I'm not a mathematician, so maybe what I'm suggesting is obvious.

Especially if all your sizes are below 1000, I don't think you will literally ever notice the performance hit form a "slow" FFT size.

Yes, that's the case. I was just checking because I'm calling process hundreds of times in a program that will be called thousands of times. Anyway, rustfft5 is already fast enough. A big thank you for your incredible work!

ejmahler commented 3 years ago

You have a good point. I'll see if i can reword it to qualify choosing a size on your application supporting it.