SCOREC / core

parallel finite element unstructured meshes

Pcu object #388

Open jacobmerson opened 1 year ago

jacobmerson commented 1 year ago

Thank you. Nice work.

I just noticed this was a draft. What is left to do? It looks like CI tests are passing.

I think it's good; I just figured we may want to discuss it and make sure we are comfortable before merging.

cwsmith commented 1 year ago

meeting notes

jacobmerson commented 7 months ago

@cwsmith Do you have a "large adapt case" that we can use for testing?

test large scale (100s) adapt case

cwsmith commented 7 months ago

@jacobmerson I have a small-to-medium-sized case here: /lore/cwsmith/projects/deltaWingAdapt/pumiAdapt that can be run on 16 and possibly 32 processes. It depends on the cws/deltaAdapt branch @ b461232.

In addition to this test, we could just partition a mesh to a large number of processes to stress PCU.

jacobmerson commented 7 months ago

Thanks Cameron. Ok, so the plan would be:

  1. Rebase cws/deltaAdapt onto pcu-object, then run the adapt case at /lore/cwsmith/projects/deltaWingAdapt/pumiAdapt.
  2. Run a mesh partitioning test. Do you have thoughts on how big we should go?

@flagdanger we will want to run these tests with the current version of the pcu-object branch. Then, as you update the Mesh with the pcu-object, you can keep running those tests (mostly the partitioning, since it will be easier).

cwsmith commented 7 months ago

Sounds good.

https://zenodo.org/records/1194576/files/uprightMeshes.tar.gz contains 28M0.smb, which is a serial (single-process) PUMI mesh with 28M tets. We could partition this using zsplit to a modest number of ranks (16 or 32), then run uniform once to increase the tet count by a factor of eight (it splits all edges of all tets), and then partition to 1024 ranks (again using zsplit). Both of these steps should take only a few minutes of wall time each.

Note that zsplit (and all the other partitioning tools) needs to be run on the target number of processes. So, to split from 16 to 1024 parts, 1024 processes are needed.

Also, if you don't want to be bothered with the Zoltan dependency needed for zsplit, you can run split instead; it uses our own recursive inertial bisection implementation.
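
For a rough sense of scale, the partitioning plan above (split to 16 parts, refine uniformly once, split to 1024 parts) can be sized with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function and key names are made up for illustration and are not part of PUMI or PCU, the 28M tet count is the approximate figure quoted above, and the check that the final part count is an integer multiple of the initial one is an assumption about how the split would be set up.

```python
# Back-of-the-envelope sizing for the partition stress test described above.
# All names here are illustrative; none of them come from PUMI or PCU.

SERIAL_TETS = 28_000_000   # approximate size of 28M0.smb (about 28M tets)
INITIAL_PARTS = 16         # first split target (the thread suggests 16 or 32)
FINAL_PARTS = 1024         # second split target
REFINE_FACTOR = 8          # one uniform pass multiplies the tet count by eight


def plan(serial_tets, initial_parts, final_parts, refine_factor=REFINE_FACTOR):
    """Return the sizes implied by split -> uniform refine -> split."""
    # Assumption: each existing part is divided into the same whole number
    # of new parts, so the final count must be a multiple of the initial one.
    if final_parts % initial_parts != 0:
        raise ValueError("final part count must be a multiple of the initial one")
    refined_tets = serial_tets * refine_factor
    return {
        "refined_tets": refined_tets,
        "parts_per_input_part": final_parts // initial_parts,
        # Per the note above, the split runs on the target number of processes.
        "processes_for_final_split": final_parts,
        "avg_tets_per_final_part": refined_tets / final_parts,
    }


if __name__ == "__main__":
    for key, value in plan(SERIAL_TETS, INITIAL_PARTS, FINAL_PARTS).items():
        print(f"{key}: {value:,}")
```

Swapping `INITIAL_PARTS` to 32 shows how the per-part ratio changes while the process count for the final split stays at the target part count.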