Technical issues found during the switch to templates

mreineck commented 1 month ago

While introducing the templates I encountered a few strange things I'd like a second opinion on...

in spreadinterp.h, a function spreadinterp is declared, but it is not used anywhere in the library. But there are some test codes (perftest/spreadtestnd.cpp and perftest/spreadtestndall.cpp) which are still calling this routine. Is this obsolete or should it be kept?
some test codes call FFTW-specific functions for forgetting wisdom and cleaning up internal state. I gather that this is mostly used during benchmarking, when FFTW should be forced to forget its internal state completely. I could provide this by adding a function like forgetInternalState() to the FFT interface. Would that be a useful change? Leaving things as they are now feels a bit awkward.
spreadinterp.cpp has a prologue of about 100 lines that forward-declare things which are defined later on anyway. Would you be OK with me bringing the contents of the file in the right order, such that all these declarations are no longer necessary? I'm asking because the introduction of the templates will make this rather uglier than nicer.

Apart from this, I think I see a clear way towards the introduction of templates. The individual steps would be:

macros -> constexpr (#482)
switch to C++ random numbers (#483) (BTW this should allow running some tests on Windows which didn't compile up to now.)
templatizing the FFTW interface (more or less done)
templatizing the content of spreadinterp.h/cpp and utils.h/cpp
templatizing the plan type and finufft.cpp and adjusting simple_interfaces.cpp (this is a fairly big task, but hard to subdivide).

DiamonDinoia commented 1 month ago

Hi Martin,

Thanks for looking into this. Answers:

we usespreadinterp, at some point we might provide FINUFFT_MEASURE (or similar) where we'll be likely to use that to gather data.
I think is a good idea to wrap FFT specific functionality and abstract them away
I do not have any opinion on forward declaring I think the code is understandable in any order. If you believe is better to rid of those go ahead. The file also got way too big we should consider splitting it into 3 files: kernel, spreader and interp as compilation now takes way too long (no parallelism...)

mreineck commented 1 month ago

Thanks for the answers!

Concerning 3), I fully agree that this should be split at some point. If that is planned anyway, then I won't reorder the file now; all the cosmetics can be done at a later point.

DiamonDinoia commented 1 month ago

Maybe worth splitting it before reordering? My issue on splitting is the makefile... Any change has to be propagate there

mreineck commented 1 month ago

I agree with splitting before reordering. My point was that this should probably be postponed until after templatization, to limit the amount of potential chaos :-)

ahbarnett commented 1 month ago

Hi Martin & Marco,

Good to e-meet you yesterday, Martin.

Good catch. As you can probably guess, spreadinterp() was written first, and became unused by finufft.cpp when we switched to batched transforms around 2019. It is so short that I see no harm in keeping it around as the reference of how to do a batch-size-1 spread/interp. There is a risk of it drifting out of date relative to finufft.cpp:execute(). But for now I would like still to be able to call it, rather than move it to spreadtestnd.cpp
Agree re fft state abstraction to fft.cpp. You could make it a separate issue.

Random numbers are only used in test codes; I feel that is a distraction from the main goal. We should resist the temptation to try and clean up all aspects of the code at once. Gradual change is better!

I agree with the overall plan. Following the fixing of the simple interface is the final step to change (c)make to remove the twice-complation (-DSINGLE). I don't know templating enough to help much with that. Then there is the test codes to remove FLT from; I could help with that.

I agree with split spreadinterp.cpp into evalkernel.cpp, spread.cpp and interp.cpp (and possibly simdutils.cpp which are used in all three). I think they should share a spreadinterp:: namespace though, as now.

Re template / split / reorder: shouldn' the reordering (& removal of declarations) be done as part of the splitting? (easiest for human to do in 1-pass). And then templating done afterwards (to avoid repetition of changing both declarations and definitions)?

Best, Alex

On Wed, Jul 17, 2024 at 11:06 AM Marco Barbone @.***> wrote:

Maybe worth splitting it before reordering?

— Reply to this email directly, view it on GitHub https://github.com/flatironinstitute/finufft/issues/484#issuecomment-2233550045, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSTVZEFCD3EJ2DCN7VDZM2B55AVCNFSM6AAAAABLAQ6KTKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZTGU2TAMBUGU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- *-------------------------------------------------------------------~^`^~._.~' |\ Alex Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

mreineck commented 1 month ago

Random numbers are only used in test codes; I feel that is a distraction from the main goal. We should resist the temptation to try and clean up all aspects of the code at once. Gradual change is better!

I agree in principle, but in this particular case I had to do something. By switching from a macro containing a call to rand_r to an inline function with a call to rand_r, the compiler got to actually see it, and on Windows rand_r doesn't exist any more, so I got CI failures. Rather than patching around the issue I thought I'd do the proper switch right away, especially since there was a comment in the sources that this should happen at some point.

ahbarnett commented 1 month ago

Ah, but this makes it a Windows-fixing issue, of secondary importance after we get the code looking how we want.

I just had a realization about how to tidy this up: Since rand #s are not part of finufft (they are only part of the test codes), then the rand macros in defs.h, if they are now becoming functions, should be moved to a new module test_utils.cpp which is only used by test codes (not finufft itself).

That is the logical solution if we are going to start changing rand stuff. (it's true that Wenda did push for change to other rand # gens, and it made it onto some to-do list, but it was not a priority...)

On Wed, Jul 17, 2024 at 12:06 PM mreineck @.***> wrote:

Random numbers are only used in test codes; I feel that is a distraction from the main goal. We should resist the temptation to try and clean up all aspects of the code at once. Gradual change is better!

I agree in principle, but in this particular case I had to do something. By switching from a macro containing a call to rand_r to an inline function with a call to rand_r, the compiler got to actually see it, and on Windows rand_r doesn't exist any more, so I got CI failures. Rather than patching around the issue I thought I'd do the proper switch right away, especially since there was a comment in the sources that this should happen at some point.

— Reply to this email directly, view it on GitHub https://github.com/flatironinstitute/finufft/issues/484#issuecomment-2233680912, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSSJPHN56SAMPX7GZK3ZM2JBFAVCNFSM6AAAAABLAQ6KTKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZTGY4DAOJRGI . You are receiving this because you commented.Message ID: @.***>

-- *-------------------------------------------------------------------~^`^~._.~' |\ Alex Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

DiamonDinoia commented 1 month ago

Wenda suggested using random123 #200 which is an external dependency. I do not think is a bad idea to use a faster generator for the tests. cmake can pull it for test only. I personally implemented xoshiro in Google Highway at some point which I think is still the fastest rng on CPU: https://github.com/google/highway/tree/master/hwy/contrib/random

But before that, std::minstd_rand might be fast enough. https://en.cppreference.com/w/cpp/numeric/random

mreineck commented 1 month ago

I'd not use anything complicated for just running unit tests; std::minstd_rand should do the trick just fine.

But anyway, let's skip the RNG pull request; once the 1/n PR is in, I'll prepare follow-ups without the RNG changes.

ahbarnett commented 1 month ago

I will emphasize that multithreaded rand is needed for the tests, since single-thread rand is very slow (you'll see my openmp rand hacks... would love to get rid of them from test codes :).

On Wed, Jul 17, 2024 at 2:21 PM mreineck @.***> wrote:

I'd not use anything complicated for just running unit tests; std::minstd_rand should do the trick just fine.

But anyway, let's skip the RNG pull request; once the 1/n PR is in, I'll prepare follow-ups without the RNG changes.

— Reply to this email directly, view it on GitHub https://github.com/flatironinstitute/finufft/issues/484#issuecomment-2233963875, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACNZRSTTR7CBF22KC6RFSDDZM2YX3AVCNFSM6AAAAABLAQ6KTKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZTHE3DGOBXGU . You are receiving this because you commented.Message ID: @.***>

-- *-------------------------------------------------------------------~^`^~._.~' |\ Alex Barnett Center for Computational Mathematics, Flatiron Institute | \ http://users.flatironinstitute.org/~ahb 646-876-5942

mreineck commented 1 month ago

The multithreaded random number generation can stay in any case and is used with the standard C++ RNGs in the changes I made. I don't think it's possible to hide that layer of complexity fro the users ... they have to know how exactly the random numbers are generated.

mreineck commented 1 month ago

Adjusting finufft.cpp and simple_interfaces.cpp turns out to be a deep rabbit hole. Complications are caused for example by

having two slightly different sets of macro magic (one in finufft.h and friends, one in defs.h)
having precision-independent and precision-dependent declarations mixed in defs.h
practically all tests relying on knowledge of implementation details by including defs.h

So the last step of this adjustment will touch very many files, and potentially change them a lot. Some of the header files will have to be splt. Also, as a byproduct of the change, we get a "natural" C++ interface to the library, and should probably think how it should look like exactly. The rough outline will be something like

Finufft_plan<double> plan(opts);
plan.setpts(x,y,z); // arguments can be pointers or vectors
plan.execute(uni, nonuni); // arguments can be pointers or vectors
// no need to destroy the plan, this is done by the destructor

Anyway this is not urgent, since 2.3 has to be released first. I just wanted to point this out as I found it.

DiamonDinoia commented 1 month ago

Having a nice c++ interface would be nice. We can always use it internally until it is stable enough and release it in a future maybe major version.

We should consider that CPU/GPU API should be consistent so these changes needs to be propagated before release.

I like the way it looks, then we can also allow mixed precision with can help speedup especially GPU code.

DiamonDinoia commented 1 month ago

@mreineck, I would like to provide attention to is that we use -fPIC in finufft quite often. This causes the compiler not to inline some functions causing a performance overhead source. When changing let's make symbols static of in anonymous namespaces so that does not happen. Not sure if there are any compilation tricks that we can pull here, like building all the internal functionality in a separate lib with no -fPIC and static link against it.

flatironinstitute / finufft

Technical issues found during the switch to templates #484