flintlib / flint

FLINT (Fast Library for Number Theory)
http://www.flintlib.org
GNU Lesser General Public License v3.0

sd fft no blocks #2005

Closed: tthsqe12 closed this 4 weeks ago

tthsqe12 commented 1 month ago

sd stands for scalar double, and now, true to its name, the fft data is just a plain array of doubles.

By the way, the sd_fft_ctx_fit_depth function was previously called in the now-defunct sd_fft_lctx constructor. It was not thread safe before, and perhaps its uses did not demand thread safety. It has now moved to a place where it definitely needs to be thread safe. Everything is a lot simpler now.
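To illustrate what thread safety means for a "fit depth" style function, here is a minimal sketch in C. It is not FLINT's implementation: the type demo_ctx, the functions demo_ctx_init and demo_ctx_fit_depth, the constant DEMO_MAX_DEPTH, and the table layout are all hypothetical, and a pthread mutex is only one possible way to serialize the lazy growth of shared tables.

```c
#include <pthread.h>
#include <stdlib.h>

#define DEMO_MAX_DEPTH 20   /* hypothetical cap, chosen for the sketch */

/* Hypothetical shared context: one lazily built table per depth. */
typedef struct {
    double *table[DEMO_MAX_DEPTH];
    int depth;               /* tables 0..depth-1 have been built */
    pthread_mutex_t lock;    /* serializes growth of the tables */
} demo_ctx;

void demo_ctx_init(demo_ctx *Q)
{
    Q->depth = 0;
    pthread_mutex_init(&Q->lock, NULL);
}

/* Make sure tables 0..depth-1 exist. Safe to call from several threads
   at once: only one thread at a time can extend the tables.
   Returns 0 on success, -1 on failure. */
int demo_ctx_fit_depth(demo_ctx *Q, int depth)
{
    int status = 0;

    if (depth > DEMO_MAX_DEPTH)
        return -1;

    pthread_mutex_lock(&Q->lock);
    while (Q->depth < depth) {
        size_t len = (size_t)1 << Q->depth;
        double *t = malloc(len * sizeof(double));
        if (t == NULL) {
            status = -1;
            break;
        }
        /* Placeholder contents; a real FFT context would store
           precomputed roots of unity here. */
        for (size_t i = 0; i < len; i++)
            t[i] = 0.0;
        Q->table[Q->depth] = t;
        Q->depth++;          /* publish only after the table is filled */
    }
    pthread_mutex_unlock(&Q->lock);
    return status;
}
```

In this sketch a caller would invoke demo_ctx_fit_depth(&Q, d) before reading any of the first d tables. Because existing tables are never moved or freed while the context is live, a thread that has successfully called the function can read those tables without holding the lock.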

albinahlback commented 1 month ago

Are there any performance hits/increases to be expected with this PR?

albinahlback commented 1 month ago

And nice to hear from you again, Dan! I hope you are doing well!

fredrik-johansson commented 1 month ago

Ready to merge? Really nice cleanup!

tthsqe12 commented 1 month ago

If this ^^ one works, it should be good to go. No difference in performance, @albinahlback, but if you have an Intel machine, I have found out why it is unexpectedly slower than on AMD.

albinahlback commented 4 weeks ago

> If this ^^ one works, it should be good to go. No difference in performance, @albinahlback, but if you have an Intel machine, I have found out why it is unexpectedly slower than on AMD.

It's due to vroundpd, right?
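For context, vroundpd is the AVX instruction that rounds packed doubles. The thread does not confirm that it is the cause, nor where the rounding occurs. As a hedged illustration only, the sketch below assumes the rounding in question is the round-to-nearest step of a floating-point modular reduction, written here in scalar C. The helper names round_nearest and reduce_pm_half are hypothetical, and the add-and-subtract rounding trick is a well-known alternative to a dedicated rounding instruction, not something this PR is stated to use.

```c
/* Round to the nearest integer without a rounding instruction or libm.
   Valid for |x| < 2^51 in the default round-to-nearest mode.
   The volatile store keeps the compiler from simplifying the
   add/subtract pair away. */
static double round_nearest(double x)
{
    volatile double t = x + 6755399441055744.0;  /* 1.5 * 2^52 */
    return t - 6755399441055744.0;
}

/* Hypothetical helper: reduce a modulo n to a representative of
   magnitude about n/2 or less, given ninv = 1/n precomputed.
   The only rounding step is the call to round_nearest. */
static double reduce_pm_half(double a, double n, double ninv)
{
    double q = round_nearest(a * ninv);
    return a - q * n;
}
```

For example, reduce_pm_half(17.0, 5.0, 1.0 / 5.0) gives 2.0 and reduce_pm_half(18.0, 5.0, 1.0 / 5.0) gives -2.0. A vectorized version would apply the same arithmetic to several doubles at once, which is where an instruction such as vroundpd would normally appear.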