Closed francoishamon closed 2 years ago
We are using metis as a partitioner? I experienced empty partitions with metis in other codes; we used to fix it by moving one element into each empty domain. We also used Scotch (https://gitlab.inria.fr/faverge/scotch/-/tree/master), which is less prone to creating empty partitions. The last releases were on the Inria gforge, which is not available anymore; I can ask the team members where they put their new releases. There are also tags in the git repo.
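The "move one element to empty domains" workaround can be sketched in a few lines. This is a minimal illustration, not code from any of the projects mentioned; the function name and the cell-to-rank array `part` are hypothetical:

```python
from collections import Counter

def fix_empty_partitions(part, n_parts):
    """Repair a METIS-style partition vector by donating one cell
    to each empty partition (the workaround described above).
    part[c] is the rank assigned to cell c (hypothetical input)."""
    sizes = Counter(part)
    for p in range(n_parts):
        if sizes[p] == 0:
            # donate a cell from the currently largest partition
            donor_part = max(sizes, key=lambda q: sizes[q])
            donor_cell = next(c for c, q in enumerate(part) if q == donor_part)
            part[donor_cell] = p
            sizes[donor_part] -= 1
            sizes[p] += 1
    return part
```

A more careful version would pick the donor cell on the partition boundary to avoid hurting locality, but even this naive repair guarantees every rank gets at least one cell.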
I've been thinking about this myself recently. VTK uses a simple octree aligned with global axis to split the mesh, which should produce reasonably load-balanced partitions for "normal"-looking meshes (e.g. those resembling a box), but can produce somewhat unexpected results for certain configurations (e.g. the "staircase" tetrahedral mesh). Empty partitions seem entirely possible, for example with slanted/dipping domains.
My own problems with it come from the fact that it can sometimes produce partitions that are not contiguous in the TPFA sense, i.e. have cells that are not connected to the rest through a face. This leads to some problems when creating an agglomeration-based coarse mesh for the multiscale method. I can "downgrade" it to using node-based connectivity to define local agglomerates and it works, but the quality of basis functions and convergence of the method take a hit. I'm sticking to PAMELA import for that reason.
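The TPFA-sense contiguity problem above amounts to a partition whose cells do not form a single face-connected component. A small breadth-first check can flag such partitions; the inputs (`part`, `face_neighbors`) are hypothetical stand-ins for whatever the mesh data structure provides:

```python
from collections import defaultdict, deque

def non_contiguous_partitions(part, face_neighbors):
    """Flag partitions that are not contiguous in the TPFA sense,
    i.e. whose cells do not form one face-connected component.
    part[c] is the rank of cell c; face_neighbors[c] lists the
    cells sharing a face with c (both hypothetical inputs)."""
    cells_by_part = defaultdict(list)
    for c, p in enumerate(part):
        cells_by_part[p].append(c)
    bad = []
    for p, cells in cells_by_part.items():
        # BFS within the partition, crossing only face connections
        seen = {cells[0]}
        queue = deque([cells[0]])
        while queue:
            c = queue.popleft()
            for n in face_neighbors[c]:
                if part[n] == p and n not in seen:
                    seen.add(n)
                    queue.append(n)
        if len(seen) < len(cells):  # more than one component
            bad.append(p)
    return bad
```

Running this with node-based instead of face-based neighbor lists reproduces the "downgrade" mentioned above: more partitions pass the check, at the cost of weaker agglomerates.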
It should also be noted that this type of partitioning, while fast and useful for visualization and data processing, is a poor fit for dynamic simulation, since it does not minimize MPI communication (edge-cut).
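Edge-cut here is just the number of faces whose two adjacent cells land on different ranks, which is a proxy for the per-step MPI communication volume. A toy counter makes the metric concrete (same hypothetical `part`/`face_neighbors` inputs as above):

```python
def edge_cut(part, face_neighbors):
    """Count faces whose two adjacent cells live on different
    ranks - a simple proxy for MPI communication volume.
    part[c] is the rank of cell c; face_neighbors[c] lists the
    cells sharing a face with c (hypothetical inputs)."""
    cut = 0
    for c, neighbors in enumerate(face_neighbors):
        for n in neighbors:
            if n > c and part[n] != part[c]:  # count each face once
                cut += 1
    return cut
```

Graph partitioners like METIS/Scotch minimize this quantity directly, while a geometric octree split only controls it indirectly through cell counts.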
I think a possible long-term solution is to use a graph-based algorithm (ParMETIS?) to refine the initial partitioning and redistribute the mesh one more time. This can happen through a contribution to VTK or fully on GEOSX side, with the latter probably being a more realistic scenario. This is not an easy task, but VTK has some built-in tools for redistribution. Internally it uses a library called diy2 for this and provides compatible facilities like data serialization for unstructured grid datasets. We might be able to take advantage of it. The closest example is here.
Ok, it seems I was a bit off-topic. I didn't know how VTK partitioning worked.
> Ok it seems I was a bit off-topic.
You were not :) Your comment might be extremely relevant if we still have problems like empty ranks after repartitioning attempt. At the moment we just need to come up with a plan to even get to that point.
> I think a possible long-term solution is to use a graph-based algorithm (ParMETIS?) to refine the initial partitioning and redistribute the mesh one more time. This can happen through a contribution to VTK or fully on GEOSX side, with the latter probably being a more realistic scenario. This is not an easy task, but VTK has some built-in tools for redistribution. Internally it uses a library called diy2 for this and provides compatible facilities like data serialization for unstructured grid datasets. We might be able to take advantage of it. The closest example is here.
To be more concrete, I think the implementation could be:

1. Split the mesh into a vtkPartitionedDataSet (a collection of sub-meshes) based on the target rank vector - there doesn't seem to be an existing filter for that (or I can't find it), but it can be done using a series of vtkExtractCells applications.
2. Use diy::all_to_all (similar to this) to redistribute the partitions to target ranks.

@AntoineMazuyer if you have any contacts on the VTK team that could take a quick peek and either greenlight this plan or tell us reasons why it won't work, that would be great.
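The two steps above can be mocked up serially to show the data flow. This is only a sketch of the idea, not the vtkExtractCells or diy APIs; all names and data shapes here are illustrative:

```python
def redistribute(local_cells, target_rank, n_ranks):
    """Serial mock-up of the plan above: each rank splits its cells
    into per-destination buckets (the vtkExtractCells role), then
    every rank gathers the buckets addressed to it (the
    diy::all_to_all role). local_cells[r] holds rank r's cells;
    target_rank[r][i] is the destination of the i-th cell on rank r.
    All names are illustrative, not real VTK/diy calls."""
    # step 1: bucket cells by destination rank
    outboxes = [
        {dest: [c for c, d in zip(cells, dests) if d == dest]
         for dest in range(n_ranks)}
        for cells, dests in zip(local_cells, target_rank)
    ]
    # step 2: "all-to-all" exchange - gather what was sent to each rank
    return [
        [c for src in range(n_ranks) for c in outboxes[src][r]]
        for r in range(n_ranks)
    ]
```

In the real implementation, step 1 would produce sub-meshes rather than cell lists, and step 2 would rely on VTK's unstructured-grid serialization to move them between MPI ranks.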
Hi @klevzoff,
Sure thing.
One question: why don't we use ParMETIS at step one?
@AntoineMazuyer sure, we probably could do that. My impression was that in order to use ParMETIS, you need to give it a distributed graph, so we need to split the mesh anyway (unless we read in a pre-partitioned mesh from a .pvtu), although we don't have to physically redistribute the entire dataset. If VTK exposes the tools to octree-partition the mesh without distributing it, that would probably be optimal.
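Computing a geometric partition without moving any data only requires cell centroids. The toy recursive coordinate bisection below stands in for the octree-style split discussed here (it is not VTK's actual algorithm); every name is illustrative:

```python
def bisect_partition(centroids, ids, depth, axis=0):
    """Toy recursive coordinate bisection over cell centroids,
    standing in for the geometric (octree-style) split discussed
    above - not VTK's actual algorithm. Returns one list of cell
    ids per leaf; 2**depth leaves total, cycling the split axis."""
    if depth == 0:
        return [ids]
    # median split along the current axis keeps leaves balanced
    order = sorted(ids, key=lambda i: centroids[i][axis])
    half = len(order) // 2
    next_axis = (axis + 1) % len(centroids[0])
    return (bisect_partition(centroids, order[:half], depth - 1, next_axis)
            + bisect_partition(centroids, order[half:], depth - 1, next_axis))
```

Because the split only reads centroids, each rank could evaluate it on its locally read cells and obtain target ranks without redistributing the dataset first.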
Hello, I need help with a simulation on an unstructured mesh with 1,915,000 cells (prisms) that I load into GEOSX using the VTKMeshGenerator. The mesh looks good and the simulation goes well when I use a small number of MPI ranks, but I am struggling with the load balancing between MPI partitions with develop. With feature/klevzoff/vtk-mesh-update, I can get past the mesh generation step, which is great, but GEOSX does not like empty MPI partitions.

The domain/mesh/cell shapes are not unreasonable, so I am looking for ideas to overcome this issue. Is there a way to control MPI partitioning inside the VTK framework that I can try/implement?
I can send an image of the MPI partitions on Slack to anyone interested in this issue.