Closed franzpoeschel closed 4 months ago
Regarding the
ws2_32
lib forgethostname
- there is a more portable way to achieve this with standard MPI calls: canMPI_Get_processor_name
in chapter 9 of the MPI standard:https://www.mpi-forum.org/docs/mpi-4.0/mpi40-report.pdf
Note that this sometimes appends and sometimes does not append the CPU id as well.
Using the MPI call is not ideal either, since MPI is not always available and the call suffers from exactly the same trouble:
*** The MPI_Get_processor_name() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
This is in practise a lesser trouble since most applications that use MPI also have it initialized, but I think the better solution is to make the used implementation explicit: I was intending for this call to have multiple implementations anyway:
// include/openPMD/ChunkInfo.hpp
namespace host_info
{
enum class Method
{
HOSTNAME
};
std::string byMethod(Method);
#if openPMD_HAVE_MPI
chunk_assignment::RankMeta byMethodCollective(MPI_Comm, Method);
#endif
std::string hostname();
} // namespace host_info
// src/Series.cpp
if (viaJson.value == "hostname")
{
return host_info::hostname();
}
else
{
throw error::WrongAPIUsage(
"[Series] Wrong value for JSON option 'rank_table': '" +
viaJson.value + "'.");
};
The thought behind this is: In some setups, you want writer and reader to match by hostname, in other cases by NUMA node, in other cases by CPU id, you might even want to group multiple hosts into one group. Since the chunk distribution algorithms in #824 use this rank table as a basis for distributing chunks, this keeps that choice flexible for the user.
My suggestion hence: Split this into POSIX_HOSTNAME, WINSOCKS_HOSTNAME, MPI_HOSTNAME
. Users would then need to inquire which options are available on their systems. The enum class Method
would always contain all options without any #ifdef
s, and we could add a call host_info::method_available(Method) -> bool
.
This way, users would explicitly need to state that they want to use the Winsocks implementation, along with all implications that that has.
This is the first logical part of #824 which I am now splitting in two separate PRs:
The idea is that a writer code can either explicitly use:
.. or alternatively initialize the Series with JSON/TOML parameter
rank_table
:(The second option is useful as it requires no rewriting of existing code.)
A 2D char dataset is then created, encoding the per-rank information line by line, e.g.:
A reader code can then access this information via:
And compare that information against
unsigned int WrittenChunkInfo::sourceID
as is returned byavailableChunks()
: