ecmwf / atlas-orca

atlas-orca plugin for Atlas, providing support for ORCA grids and mesh generation
Apache License 2.0

Add halo during mesh generation #4

Closed: twsearle closed this 1 year ago

twsearle commented 2 years ago

Description

The ORCA grids come with halo cells already included in the grid definition. This makes building halos in the usual way problematic, as the mesh already contains duplicate nodes.
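For reference, the size of this built-in halo can be queried from the grid itself. A minimal sketch, assuming the east/north accessors mirror the haloWest()/haloSouth() calls used in the testing section below (the grid name is just an example):

  #include "atlas/grid.h"
  #include "atlas/library.h"
  #include "atlas/runtime/Log.h"
  #include "atlas-orca/grid/OrcaGrid.h"

  int main( int argc, char** argv ) {
      atlas::initialize( argc, argv );
      atlas::OrcaGrid grid{ atlas::Grid{ "ORCA2_T" } };
      // the grid definition already carries halo rows/columns on each side
      atlas::Log::info() << "built-in halo (W E S N): "
                         << grid.haloWest() << " " << grid.haloEast() << " "
                         << grid.haloSouth() << " " << grid.haloNorth() << std::endl;
      atlas::finalize();
      return 0;
  }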

To get around this, I am attempting to construct the halo during mesh generation, and to set a "halo_locked" flag to prevent failures in BuildHalo.cc.
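The intent is roughly the following sketch; the exact check inside BuildHalo is part of this PR rather than released Atlas, so treat the details as an assumption:

  // the mesh generator marks the halo as already built ...
  mesh.metadata().set( "halo_locked", true );

  // ... and a BuildHalo-style routine can then refuse to rebuild it
  // ( throw_Exception / Here come from atlas/runtime/Exception.h )
  if ( mesh.metadata().getBool( "halo_locked", false ) ) {
      throw_Exception( "halo is locked: it was built during mesh generation", Here() );
  }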

The mesh generator creates a "surrounding rectangle" around all indices that should be on a given partition. I have extended this rectangle outwards by the halo width. Later, any node that lies within the halo width of the current partition is included as a node of the mesh.
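In index space the extension amounts to something like the sketch below; the names are hypothetical, not the actual generator code:

  #include <algorithm>

  // hypothetical sketch: grow the partition's bounding rectangle by the halo
  // width, then accept any (ix, iy) that falls inside the grown rectangle
  struct SurroundingRectangle {
      int ix_min, ix_max, iy_min, iy_max;
  };

  SurroundingRectangle grow_by_halo( SurroundingRectangle r, int halo, int ny ) {
      r.ix_min -= halo;  // i is periodic, so wrapping is handled at lookup time
      r.ix_max += halo;
      r.iy_min = std::max( r.iy_min - halo, 0 );       // clamp in j at the
      r.iy_max = std::min( r.iy_max + halo, ny - 1 );  // grid boundaries
      return r;
  }

  bool in_rectangle( const SurroundingRectangle& r, int ix, int iy ) {
      return ix >= r.ix_min && ix <= r.ix_max && iy >= r.iy_min && iy <= r.iy_max;
  }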

Testing

When I attempt to plot the nodes used in a halo by other partitions, there seems to be a band around the perimeter that is not included. For example, partition 1 with 6 processors and a halo of 2 gives me the following image:

I would expect the edge nodes to have a halo of 1 and the next nodes in to have a halo of 2; instead there is a band of halo 0 around the outside edge. I have compared these images with the halo variable included in the gmsh info file output, and I am not exactly sure whether this is an issue! It is possible everything is working correctly, but if there is a better way to test, please let me know.
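For what it's worth, the comparison I am doing boils down to a per-level node count. A sketch, assuming the variable in the gmsh output is the same "halo" integer field attached to the mesh nodes:

  #include <map>
  #include "atlas/array.h"
  #include "atlas/mesh.h"
  #include "atlas/runtime/Log.h"

  // histogram of nodes per halo level on this partition; with a halo of 2 I
  // would expect non-empty counts for levels 0, 1 and 2
  void print_halo_histogram( atlas::Mesh mesh ) {
      auto halo = atlas::array::make_view<int, 1>( mesh.nodes().field( "halo" ) );
      std::map<int, int> counts;
      for ( atlas::idx_t n = 0; n < mesh.nodes().size(); ++n ) {
          ++counts[halo( n )];
      }
      for ( const auto& [level, count] : counts ) {
          atlas::Log::info() << "halo " << level << ": " << count << " nodes" << std::endl;
      }
  }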

I have also tested this by reading and writing ORCA data from file into and out of atlas, running on 4 and 2 processors with the equal-regions partitioner. I do this by converting the ij indices back to the native ORCA indices (including the ORCA halo points, because these are also included in the output files):

  // shift back by the grid's built-in halo offsets to recover file indices
  const int glb_i = ij(inode, 0) + orcaGrid_.haloWest();
  const int glb_j = ij(inode, 1) + orcaGrid_.haloSouth();
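For context, the loop around those two lines looks roughly like this; the nodes' "ij" field name and its integer type are assumptions about the generator output:

  auto ij = atlas::array::make_view<atlas::idx_t, 2>( mesh.nodes().field( "ij" ) );
  for ( atlas::idx_t inode = 0; inode < mesh.nodes().size(); ++inode ) {
      // recover the native ORCA indices, built-in halo points included
      const int glb_i = ij(inode, 0) + orcaGrid_.haloWest();
      const int glb_j = ij(inode, 1) + orcaGrid_.haloSouth();
      atlas::Log::info() << "glb_i " << glb_i << " glb_j " << glb_j << std::endl;
  }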

This reveals an additional discrepancy at some of the grid fold/seam points near the north pole, I think. For example, for ORCA2 the relevant points are:

 glb_i 89  glb_j 146  lon 260 lat 69.7133
 glb_i 90  glb_j 146  lon 260 lat 69.7133
 glb_i 181 glb_j 146  lon 440 lat 49.9789
 glb_i 0   glb_j 147  lon 80 lat 50.5255
 glb_i 89  glb_j 147  lon 260 lat 70
 glb_i 90  glb_j 147  lon 260 lat 70
 glb_i 92  glb_j 147  lon 260 lat 70
 glb_i 93  glb_j 147  lon 260 lat 70
 glb_i 94  glb_j 147  lon 260 lat 70
 glb_i 95  glb_j 147  lon 260 lat 70
 glb_i 96  glb_j 147  lon 260 lat 70
 glb_i 97  glb_j 147  lon 260 lat 70
 glb_i 98  glb_j 147  lon 260 lat 70
 glb_i 99  glb_j 147  lon 260 lat 70
 glb_i 100 glb_j 147  lon 260 lat 70
 glb_i 101 glb_j 147  lon 260 lat 70
 glb_i 0   glb_j 148  lon 80.221 lat 50.5161
 glb_i 1   glb_j 148  lon 80 lat 49.9789
 glb_i 89  glb_j 148  lon 260 lat 69.7133
 glb_i 90  glb_j 148  lon 260 lat 69.7133
 glb_i 91  glb_j 148  lon 260 lat 69.7133
 glb_i 92  glb_j 148  lon 260 lat 69.7133
 glb_i 93  glb_j 148  lon 260 lat 69.7133
 glb_i 180 glb_j 148  lon 440.221 lat 50.5161
 glb_i 181 glb_j 148  lon 440 lat 49.9789

These points result in missing data that was not being transferred by the halo swap:

(image: orca2_using_buildParallelFields)

I can resolve these points using the changes prototyped in test_atlas_orca_mesh::build_periodic_boundaries, and by turning off the generic buildParallelFields and buildPeriodicBoundaries steps with mesh.nodes().metadata().set( "parallel", true ); and mesh.metadata().set( "periodic", true ); (see the sketch after the image below):

(image: orca2_no_parafields)
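For clarity, the opt-out amounts to the two flags below; that every relevant code path honours them is an assumption worth double-checking:

  // mark the parallel fields and periodic boundaries as already handled by
  // the mesh generator, so the generic build steps become no-ops
  mesh.nodes().metadata().set( "parallel", true );  // skips buildParallelFields
  mesh.metadata().set( "periodic", true );          // skips buildPeriodicBoundaries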

Remaining issues

twsearle commented 1 year ago

Apologies for abandoning this attempt to fix this issue! I have had another go in the last few weeks at addressing our small MPI issues. I have a much smaller PR that I think will help a bit, so I will close this one. If it goes well, I will add another PR to handle the no-halo checkerboard/equal_regions case, and then a final PR for the halo size > 1 issues. https://github.com/ecmwf/atlas-orca/pull/15