Global indices should support peta-scale parallelization

wdeconinck commented 12 years ago

Not that we will get there anytime soon, but our data structure should allow vertex and element GLOBAL indices with values beyond the 32-bit limit of unsigned int (around 4 billion).

Second reason: node-matching algorithms use higher precision than 32 bits. If we could use 64-bit global indices, renumbering algorithms might not be necessary for load balancing and mesh partitioning.

Concrete:

List should be used for storing global indices.
PE::CommPattern should be aware that these global-indices can be enormous.

I foresee an issue mainly with PE::CommPattern. An alternative way has to be found to map global to local indices. Obviously we cannot store a std::vector with more than a billion entries just to map a global index to a local one. In any case, what we have now is not a good solution for the current case either: even within the 32-bit limit, a vector of that magnitude is already too large to store on every rank.
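As a rough illustration of one possible alternative (not what PE::CommPattern does today), each rank could keep a hash map holding only the global indices it owns or ghosts, so memory scales with the local mesh size rather than with the largest global index. The names `GlobalToLocal` and `Uint64` below are hypothetical, and the sketch assumes C++11 and that a node's position in the input vector is its local index:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical names: GlobalToLocal and Uint64 are illustrative only
// and are not part of PE::CommPattern.
typedef std::uint64_t Uint64;

class GlobalToLocal
{
public:
  // Build the lookup from the global indices held on this rank.
  // Assumption: the position in the vector is the local index.
  explicit GlobalToLocal(const std::vector<Uint64>& local_gids)
  {
    m_map.reserve(local_gids.size());
    for (std::size_t loc = 0; loc < local_gids.size(); ++loc)
      m_map[local_gids[loc]] = loc;
  }

  // Returns true and sets loc if gid is stored on this rank,
  // returns false and leaves loc untouched otherwise.
  bool find(const Uint64 gid, std::size_t& loc) const
  {
    const auto it = m_map.find(gid);
    if (it == m_map.end())
      return false;
    loc = it->second;
    return true;
  }

  // Number of distinct global indices known on this rank.
  std::size_t size() const { return m_map.size(); }

private:
  std::unordered_map<Uint64, std::size_t> m_map;
};
```

A rank holding the global indices 5000000000 and 7 would then store two map entries, where a dense lookup vector would need more than five billion.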

Could I assign @tbanyai to the CommPattern? Once that is done, I will take on the mesh part myself.

tlmquintino commented 12 years ago

Nice to be thinking ahead. I think the second reason is more pressing than getting to peta-scale computing: for unstructured meshes, on which machine would you generate such a mesh? Anyway, solving this will be a nice step forward.

wdeconinck commented 12 years ago

The node-matching algorithms do indeed need higher precision than 32 bits, which is a good reason for these glb_idx to be 64-bit.

A current workaround is that I do the node matching with 64 bits, store the result temporarily, and then renumber the global indices to be contiguous in space again, counting on the fact that we are not doing anything peta-scale yet :p
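The renumbering step of such a workaround could look roughly like the sketch below. It is serial only and is not the actual coolfluid3 code: `compress_to_contiguous` is a hypothetical helper that maps sparse 64-bit matching keys onto contiguous 32-bit indices, giving equal keys the same index. A distributed version would also need ranks to agree on the numbering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper (serial): replace each sparse 64-bit key by its
// rank among the distinct keys, so the result is contiguous from 0.
std::vector<std::uint32_t> compress_to_contiguous(const std::vector<std::uint64_t>& keys)
{
  // Sorted list of the distinct keys.
  std::vector<std::uint64_t> sorted(keys);
  std::sort(sorted.begin(), sorted.end());
  sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

  // This is where the 32-bit limit bites: more distinct keys than
  // a 32-bit index can represent cannot be renumbered this way.
  if (sorted.size() > std::numeric_limits<std::uint32_t>::max())
    throw std::overflow_error("too many distinct keys for 32-bit indices");

  // Binary search for each key's position among the distinct keys.
  std::vector<std::uint32_t> compact(keys.size());
  for (std::size_t i = 0; i < keys.size(); ++i)
  {
    const auto pos = std::lower_bound(sorted.begin(), sorted.end(), keys[i]);
    compact[i] = static_cast<std::uint32_t>(pos - sorted.begin());
  }
  return compact;
}
```

With native 64-bit global indices the matching keys could be stored directly and this extra pass would only be needed when contiguous numbering is wanted for its own sake.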

tbanyai commented 12 years ago

There are some limitations on the MPI side: item counts are taken as ints by the MPI_* calls. However, the immediate bottleneck is storing gids in a large enough format; they are currently uints, and the rest of the data is process-local sizes. Changing gids to long uint is work but not a fundamental change.
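To make the count limitation concrete, here is a minimal sketch of a guard that could sit in front of MPI calls. `to_mpi_count` is a hypothetical helper, not an existing coolfluid3 or MPI function, and the block deliberately avoids including mpi.h so that it stands alone:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: convert a process-local item count to the int
// that MPI count arguments expect, and fail loudly if it does not fit.
int to_mpi_count(const std::size_t n)
{
  if (n > static_cast<std::size_t>(INT_MAX))
    throw std::length_error("item count exceeds INT_MAX, split the message");
  return static_cast<int>(n);
}
```

The point is that only the count has to fit in an int. The values being sent can still be 64-bit gids, since widening the element type does not change how many items a rank sends.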

wdeconinck commented 12 years ago

It would be a nice feature to have peta-scale parallelization, and it would be nice to test distributed mesh partitioning of this kind. I am not yet sure how much memory a machine would need to perform such a task.

I did a check, and the number of files needing global indices is not substantial yet. This change should be made rather soon, before other algorithms use global indices and are restricted by the 32-bit limitation. I already ran into it myself in the coordinate-matching algorithms and had to create workarounds.

barche commented 12 years ago

Just one thought about this coordinate matching: we should be careful to only use this mechanism when there is no alternative. Ideally, index calculation should be based on existing ghost information and topology only, not coordinates. Some time (after the presentation and after the physics design meeting), we should sit down and look in detail at the use cases and make sure this mechanism is only used when necessary.

wdeconinck commented 12 years ago

Coordinate-matching algorithms are only necessary to match nodes or elements that exist on different processors but don't have a global index assigned yet. If global indices are present, they are already the best way to match nodes. The gmsh reader, for example, takes the global index directly from file. Other mesh generators or mesh readers should also provide this information. If not, the (not so expensive) current labeling procedure can be used, which is already based on ghost information. The MPI communication of the current algorithm can still be improved.

Nodes that typically don't have global indices are the ones created from a new high-order space. It is always good practice that every vertex and element have a unique index. Hence, at space-creation, this labeling procedure is used.

This issue is however not about how global indices are assigned, only about their number of bits.

tlmquintino commented 12 years ago

Indeed, we should rely on the existing global indices when available, but this algorithm is paramount for HO. I cannot stress that enough: before this we had to implement much more complicated HO mesh enrichment procedures.

coolfluid / coolfluid3

Global indices should support peta-scale parallelization #196