Closed tthsqe12 closed 4 weeks ago
Are there any performance hits or improvements to be expected with this PR?
And nice to hear from you again, Dan! I hope you are doing well!
Ready to merge? Really nice cleanup!
If this ^^ one works, it should be good to go. No difference in performance, @albinahlback, but if you have Intel, I have found out why it is unexpectedly slower than AMD.
It's due to vroundpd, right?
sd stands for scalar double, and now, true to its name, the fft data is just a plain array of doubles.
By the way, the sd_fft_ctx_fit_depth function was previously called in the now-defunct sd_fft_lctx constructor. This fit_depth function was not thread safe before, and maybe its uses didn't demand thread safety. Now, it is moved to a place where it definitely needs to be thread safe. Everything is a lot simpler now.
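For anyone following along, here is a rough sketch of what a thread-safe "fit depth" step could look like. It is not the code in this PR: demo_ctx, demo_ctx_fit_depth, DEMO_MAX_DEPTH and the table contents are all made up for illustration, and a plain pthread mutex is assumed as the locking mechanism. The idea is only that lazily growing shared tables is serialized, and that existing tables never move once published.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_MAX_DEPTH 20   /* hypothetical cap, not a FLINT constant */

/* Hypothetical context; NOT the real sd_fft_ctx layout. */
typedef struct {
    pthread_mutex_t lock;
    int depth;                        /* tables[0 .. depth-1] are valid */
    double *tables[DEMO_MAX_DEPTH];   /* tables[k] holds 2^k plain doubles */
} demo_ctx;

void demo_ctx_init(demo_ctx *ctx)
{
    pthread_mutex_init(&ctx->lock, NULL);
    ctx->depth = 0;
    for (int k = 0; k < DEMO_MAX_DEPTH; k++)
        ctx->tables[k] = NULL;
}

/* Make sure tables exist for all levels below `depth`.
   Returns 0 on success, -1 if depth is too large or allocation fails. */
int demo_ctx_fit_depth(demo_ctx *ctx, int depth)
{
    int status = 0;

    if (depth > DEMO_MAX_DEPTH)
        return -1;

    pthread_mutex_lock(&ctx->lock);
    while (ctx->depth < depth) {
        size_t n = (size_t) 1 << ctx->depth;
        double *t = malloc(n * sizeof(double));
        if (t == NULL) {
            status = -1;
            break;
        }
        /* placeholder contents; a real table would hold twiddle factors */
        for (size_t i = 0; i < n; i++)
            t[i] = (double) i / (double) n;
        ctx->tables[ctx->depth] = t;  /* earlier tables are never moved */
        ctx->depth++;
    }
    pthread_mutex_unlock(&ctx->lock);
    return status;
}

void demo_ctx_clear(demo_ctx *ctx)
{
    for (int k = 0; k < ctx->depth; k++)
        free(ctx->tables[k]);
    pthread_mutex_destroy(&ctx->lock);
}
```

In this sketch every thread calls demo_ctx_fit_depth before touching the tables it needs, so it only reads levels that were published before it released the lock. Calling it with a depth that is already covered is a cheap no-op apart from the lock.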