SCOREC / EnGPar

dynamic load balancing
http://scorec.github.io/EnGPar/
BSD 3-Clause "New" or "Revised" License

support split then balance workflow without requiring user data migration #14

Closed: cwsmith closed this issue 6 years ago

cwsmith commented 6 years ago

We want to run an engpar splitting function (increasing the part count from s to t, where s < t and s*splitFactor = t) followed by an engpar balancer, without requiring the user/application to

  1. run their own (expensive) migration routines and
  2. build the engpar (hyper)graph twice.

So, we will need to

  1. call the splitting function on an s sized communicator
  2. switch to the t sized MPI communicator
  3. create 'empty' parts on the new t-s processes
  4. call engpar migrate
  5. call the balancer
  6. revert back to the s communicator to return control back to the application

This should be done in the user's application code, similar to what is done here: https://github.com/SCOREC/core/blob/8c462070cc75172b931cef0ee08079762b01babe/test/repartition.cc
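As a rough illustration, here is a minimal MPI driver for steps 1-6, assuming splitFactor = 2 and that the application is launched on all t ranks; the engpar calls are left as comments because the exact API is not spelled out in this issue.

```cpp
#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank, t;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &t);
  const int s = t / 2; // original part count; assumes s*splitFactor == t

  // Step 1: carve the s-sized communicator out of the t launched ranks.
  MPI_Comm sComm;
  const int color = (rank < s) ? 0 : MPI_UNDEFINED;
  MPI_Comm_split(MPI_COMM_WORLD, color, rank, &sComm);

  if (sComm != MPI_COMM_NULL) {
    // Step 1 (cont.): call the engpar splitting function on sComm,
    // targeting t parts.
  }

  // Steps 2-3: switch engpar over to MPI_COMM_WORLD (size t); the t-s
  // ranks that were idle above now own 'empty' parts.
  // Step 4: call engpar migrate to populate the empty parts.
  // Step 5: call the engpar balancer on all t parts.

  // Step 6: revert to the s-sized communicator before handing control
  // back to the application.
  if (sComm != MPI_COMM_NULL) MPI_Comm_free(&sComm);
  MPI_Finalize();
  return 0;
}
```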

cwsmith commented 6 years ago

This commit contains a split + balance test: https://github.com/SCOREC/EnGPar/commit/8226ddd42aff325081005c88641fe21547ae685e

How do the following mesh partitioning workflows compare in total time to create the balanced partition, and in partition quality?

  1. engpar calling parmetis splitting followed by engpar balancing
  2. pumi calling parmetis splitting followed by engpar balancing

diamog commented 6 years ago

For the small test (4 -> 8 processes):

Initial quality:

  * mesh element imbalance: 1.46
  * mesh vertex imbalance: 1.41
  * ~16k mesh elements
  * ParMETIS target element imbalance: 1.1
  * EnGPar target element/vertex imbalance: 1.05
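For reference, these figures follow the usual max-over-average imbalance convention (an assumption; the issue does not define them). The sketch below reproduces the 1.46 element imbalance from a hypothetical 4-part distribution of the ~16k elements; the individual part loads are illustrative, not taken from the logs.

```cpp
#include <algorithm>
#include <cstdio>
#include <numeric>
#include <vector>

// Imbalance = max part load / average part load (assumed convention).
double imbalance(const std::vector<double>& partLoads) {
  const double max = *std::max_element(partLoads.begin(), partLoads.end());
  const double avg =
      std::accumulate(partLoads.begin(), partLoads.end(), 0.0) /
      partLoads.size();
  return max / avg;
}

int main() {
  // Hypothetical element counts for the 4 initial parts; they sum to
  // 16k and yield the reported 1.46 imbalance (5840 / 4000).
  const std::vector<double> elems = {5840, 3600, 3200, 3360};
  std::printf("mesh element imb: %.2f\n", imbalance(elems));
  return 0;
}
```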

Total time is about the same. EnGPar's split tends to run faster than PUMI's, but the partition produced by EnGPar requires balancing while the PUMI one does not.

Partition quality:

  * Entity imbalance is roughly the same.
  * The graph metric for duplicated graph vertices (ghosts) is better when using EnGPar for splitting.
  * The mesh metric for duplicated mesh vertices (shared vertices) is better when using PUMI for splitting.

I think the last two points are interesting and may explain some of the metric problems we see when comparing EnGPar balancing to PUMI balancing. EnGPar may be better at satisfying its own structure (ghosts) than the mesh remote copies, as we see in this case.

diamog commented 6 years ago

Here are the logs for the above runs: engpar_parmetis.txt, pumi_parmetis.txt

cwsmith commented 6 years ago

I will refer to parmetis called by pumi as 'pumi-parmetis' and parmetis called by engpar as 'engpar-parmetis'.

Overall these results look good. As you said, the average vertex counts in the final partitions are within 1% of each other; pumi-parmetis followed by engpar balancing has the lower average. Likewise, the entity imbalances are within one percentage point of each other.

The surprising result is parmetis coming up with a different answer for the global split. Can we write the mesh PARMA_STATUS partition stats after the pumi-parmetis run? I assume it is reaching the specified target imbalance.
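A minimal sketch of one way to do that, assuming the Parma_PrintPtnStats helper from SCOREC/core's parma.h is available in the test; the key string simply prefixes the output lines so runs can be grepped apart.

```cpp
#include <apfMesh2.h>
#include <parma.h>

// Print PARMA partition stats right after the pumi-parmetis split.
void reportSplitStats(apf::Mesh2* mesh) {
  Parma_PrintPtnStats(mesh, "post-pumi-parmetis-split");
}
```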

For this test, how is the engpar graph constructed? mesh elements -> graph vtx and mesh faces -> graph edges?

diamog commented 6 years ago

To add to what we discussed:

The graph is constructed with mesh elements -> graph vtx and mesh vertices -> graph edges.
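For illustration, here is a minimal sketch of that construction (hypothetical names, not the EnGPar API): each mesh element becomes a graph vertex, and each mesh vertex becomes a (hyper)edge pinning together the elements that share it.

```cpp
#include <cstdio>
#include <map>
#include <vector>

// Hypothetical hypergraph: hyperedge id (mesh vertex) -> pinned graph
// vertices (mesh elements). This is not the EnGPar data structure.
struct HyperGraph {
  std::map<long, std::vector<long>> pins;
};

HyperGraph buildGraph(const std::vector<std::vector<long>>& elemToVtx) {
  HyperGraph g;
  for (long elem = 0; elem < (long)elemToVtx.size(); ++elem)
    for (long v : elemToVtx[elem])
      g.pins[v].push_back(elem); // element is pinned to hyperedge v
  return g;
}

int main() {
  // Two triangles sharing an edge: elements {0,1}, mesh vertices 0..3.
  const std::vector<std::vector<long>> elems = {{0, 1, 2}, {1, 2, 3}};
  const HyperGraph g = buildGraph(elems);
  for (const auto& p : g.pins)
    std::printf("mesh vtx %ld pins %zu element(s)\n",
                p.first, p.second.size());
  return 0;
}
```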

Here is a log of the pumi_parmetis run with the added parma status after splitting: pumi_parmetis.txt

diamog commented 6 years ago

Here are logs for a 32 part mesh being split to 64 and 128 parts, with faces as the edge type used in engpar: engpar_parmetis_64.log, pumi_parmetis_64.log, engpar_parmetis_128.log, pumi_parmetis_128.log

EnGPar runs faster in both cases; the imbalances are comparable, with about a 2% increase in average shared vertices.