On some platforms, such as JUWELS Booster, memory passed to MPI must first be registered with the network card, which can take a long time. With this option enabled, all ranks perform this registration simultaneously during initialization, instead of one after another as part of the communication pipeline.
[ ] Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
[ ] Tested (describe the tests in the PR description)
[ ] Runs on GPU (basic: the code compiles and runs well with the new module)
[ ] Contains an automated test (checksum and/or comparison with theory)
[ ] Documented: all elements (classes and their members, functions, namespaces, etc.) are documented