Open bangerth opened 3 years ago
Follow-up: The good people from VeloC have spent a considerable amount of time measuring the efficiency of various serialization libraries. As these things go, the one we're using (Boost) came out at the bottom: it's about 10x slower than the best libraries.
If it turns out that serialization is ever a bottleneck, that's where we ought to look.
See also the table here: https://github.com/fraillt/bitsery
I'm listening to the annual reviews of the Exascale Computing Project and learned about the VeloC project (https://www.anl.gov/mcs/veloc-very-low-overhead-transparent-multilevel-checkpointrestart), which provides checkpointing services that, for example, do the actual I/O in the background.
For those of you who have run computations on 10,000 or more processors, is checkpointing a concern on large machines? The amounts of data that need to be written are certainly huge, but I don't know whether it is something that needs to be addressed.