unifhy-org / unifhy

A Unified Framework for Hydrology
https://unifhy-org.github.io/unifhy
BSD 3-Clause "New" or "Revised" License

Find solution to use Components both on structured and unstructured grids #29

Open ThibHlln opened 3 years ago

ThibHlln commented 3 years ago

[Related to #12]

At the moment, only structured grids are implemented as SpaceDomain, so effectively they are stored as 2D arrays. This means that a science Component knows the rank of the array it is going to receive, and it knows where its neighbours are (e.g. useful for stencils, or for routing).

Moreover, for models like the current implementation of JULES, 2D grids can be vectorised easily, since JULES works on vertical columns with no lateral flow (hence no need to know where neighbours are). JULES is therefore readily usable on unstructured grids (e.g. the cubed-sphere of LFRic). It would be a shame to have to convert 2D arrays to vectors (1D arrays) inside the Component itself; it would be better if the framework could provide the data as vectors directly.

So this issue is to suggest that we may want to consider/store all SpaceDomain objects as vectors, so that any geometry can be supported without the Component having to adjust to it: a Component would always receive a vector alongside information about where its neighbours are.

ThibHlln commented 3 years ago

While structured grids are most likely going to be stored as 2D arrays (e.g. Y, X in CF-conventions), unstructured grids can be stored as vectors, as an array with FillValue, or as ragged arrays (e.g. UGRID-conventions). It may therefore be a good idea to use a UGRID vector approach internally in the framework for both structured and unstructured grids, for future-proofing (i.e. future support for unstructured grids).
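
To make the difference between the "array with FillValue" and "ragged array" storage options more concrete, here is a minimal NumPy sketch. It is a generic illustration only, not the exact UGRID or CF encoding: the connectivity table, the `FILL` value and the helper `neighbours_of` are all hypothetical.

```python
import numpy as np

FILL = -1  # hypothetical fill value marking "no neighbour"

# Padded storage: one row per cell, padded with FILL up to the
# maximum number of neighbours (made-up connectivity for 4 cells).
padded = np.array([
    [1, 2, FILL],
    [0, 2, 3],
    [0, 1, FILL],
    [1, FILL, FILL],
])

# Ragged storage: all valid entries in one flat vector, plus the
# number of entries per cell to know where each cell starts and ends.
valid = padded != FILL
counts = valid.sum(axis=1)
ragged_values = padded[valid]  # boolean indexing keeps row-major order
offsets = np.concatenate(([0], np.cumsum(counts)))


def neighbours_of(cell):
    """Return the neighbour indices of `cell` from the ragged storage."""
    return ragged_values[offsets[cell]:offsets[cell + 1]]
```

The padded form is simpler to index but wastes space when the number of neighbours varies a lot; the ragged form is compact but needs the offsets to be carried around with it.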

For the variables themselves, a structured grid could be converted easily from a 2D array to a vector with e.g. numpy.ndarray.flatten, and brought back from a vector to a 2D array with e.g. numpy.reshape. Then, if neighbouring relationships are required when working on the vector, this could work as follows:
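
A minimal sketch of one possibility, not necessarily the approach intended in this comment: flatten a (Y, X) field in row-major order and build, for each cell of the vector, the flat indices of its four direct neighbours. The helper `flat_neighbours`, the choice of four neighbours and the fill value of -1 are assumptions made for illustration only.

```python
import numpy as np


def flat_neighbours(ny, nx, fill=-1):
    """Flat indices of the (y-1, y+1, x-1, x+1) neighbours of each cell.

    Returns an array of shape (ny * nx, 4); cells on the edge of the
    grid get `fill` where a neighbour does not exist.
    """
    idx = np.arange(ny * nx).reshape(ny, nx)
    # Surround the index grid with `fill` so that edge cells pick it up.
    padded = np.pad(idx, 1, mode="constant", constant_values=fill)
    y_minus = padded[:-2, 1:-1]
    y_plus = padded[2:, 1:-1]
    x_minus = padded[1:-1, :-2]
    x_plus = padded[1:-1, 2:]
    stacked = np.stack([y_minus, y_plus, x_minus, x_plus], axis=-1)
    return stacked.reshape(ny * nx, 4)


field_2d = np.arange(12, dtype=float).reshape(3, 4)  # (Y, X)
vector = field_2d.flatten()                          # row-major copy, shape (12,)
neighbours = flat_neighbours(*field_2d.shape)        # shape (12, 4)

# Back to the structured layout once the Component is done.
restored = vector.reshape(field_2d.shape)
```

With such a table, a Component never needs to know whether the vector came from a structured or an unstructured grid; only the way the table is built differs.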

I think that if a Z dimension is required for the inputs of a component, it should be kept separate from Y/X: only Y/X should be vectorised.
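
As a rough sketch of what keeping Z separate could look like, assuming a (Z, Y, X) axis order (an assumption, not something stated in this issue), only the last two axes are collapsed:

```python
import numpy as np

nz, ny, nx = 2, 3, 4
field_3d = np.arange(nz * ny * nx, dtype=float).reshape(nz, ny, nx)  # (Z, Y, X)

# Collapse Y and X into a single axis; Z is left untouched.
field_zv = field_3d.reshape(nz, ny * nx)  # shape (Z, Y * X)

# Restore the structured layout.
restored = field_zv.reshape(nz, ny, nx)
```

Each row of `field_zv` is then one vertical level on the vectorised horizontal domain, so the same neighbour information can be reused for every level.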

ThibHlln commented 3 years ago

Another aspect we could consider as part of this issue is the fact that SpaceDomain.land_sea_mask is not used to subset the domain to land locations only: the framework sends the whole domain, and each component can access the land_sea_mask if it wants to do the subsetting itself. After all, cm4twc is about the Terrestrial water cycle, so excluding the sea points before giving the variables to the components should probably become the default approach.
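
A minimal sketch of what subsetting to land points could look like on the framework side, using plain NumPy arrays rather than the actual SpaceDomain API. The mask convention (True for land), the variable names and the use of NaN for sea points on the way back are assumptions made for illustration.

```python
import numpy as np

# Hypothetical mask and field on a small (Y, X) grid; True means land.
land_sea_mask = np.array([[1, 1, 0],
                          [0, 1, 0]], dtype=bool)
runoff_2d = np.array([[0.1, 0.2, 9.9],
                      [9.9, 0.3, 9.9]])

# Vectorise, then keep land points only before calling the component.
land = land_sea_mask.flatten()
land_values = runoff_2d.flatten()[land]  # shape (n_land,)

# Stand-in for whatever the component computes on land points.
result = land_values * 2.0

# Scatter the result back onto the full grid, leaving sea points as NaN.
out = np.full(land.shape, np.nan)
out[land] = result
out_2d = out.reshape(land_sea_mask.shape)
```

Note that if neighbour information is also passed to the component, its indices would need to be remapped to the land-only vector.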