noaa-ocs-modeling / OCSMesh

OCSMesh is a mesh preparation tool for coastal ocean modeling applications.
https://noaa-ocs-modeling.github.io/OCSMesh/
Creative Commons Zero v1.0 Universal

End-to-end Auto Mesh Gen #165

Open felicio93 opened 4 months ago

felicio93 commented 4 months ago

General Plan:

The plan is to create a fully automated mesh generation tool for the land-ocean continuum that takes advantage of SCHISM's flexibility. That includes the ability to create individual meshes for the floodplain, river, and ocean domains, each following its own mesh generation requirements, and to merge them accordingly.

Development Team: @felicio93, @SorooshMani-NOAA, @feiye-vims, @hyungjuyoo, @saeed-moghimi-noaa, @josephzhang8

Software Development

Testing

Documentation


Development Iterations

We organize the development in two-month iterations, and for each one we define a Minimum Viable Product. Each iteration is a separate ticket. Feel free to add your suggestions/priorities. Iteration 00 was completed with the successful validation of our first complete 2D mesh: https://github.com/SorooshMani-NOAA/river-in-mesh/issues/3

josephzhang8 commented 4 months ago

Thx @felicio93. I'd highly encourage you to write this up for Ocean Modelling Special Issue. It's highly relevant. Thx

SorooshMani-NOAA commented 4 months ago

@felicio93 the generic OCSMesh focused tasks I had in mind are:

hyungjuyoo commented 4 months ago

Hi, I am Hyung Ju Yoo, a post-doc from Joseph's group.

Thank you for inviting me.

felicio93 commented 4 months ago

tagging @janahaddad for reference

pbranson commented 3 months ago

Chiming in here: I work at CSIRO in Australia and have been testing out OCSMesh these past few weeks, with mostly positive results. Thanks for your work on this!

@felicio93 the generic OCSMesh focused tasks I had in mind are:

  • Parallelization of size function calculation (collector type). The issue is that the hfun object is not picklable because it holds a reference to file objects
  • Parallelizing using Dask; right now the parallelization is limited to a single node.

Can I suggest that you take a look at using fsspec for file handling? It provides a uniform interface across POSIX and cloud storage, and its file objects are generally lazy (they are not opened until used), which allows pickling of Python file-like objects. These objects are also supported by rasterio: see https://github.com/rasterio/rasterio/issues/2905. These aspects also help with Dask.
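To make the pickling point concrete, here is a minimal sketch using only the Python standard library. It is not OCSMesh or fsspec code: `LazyFile` and `demo` are hypothetical names invented for illustration. The idea is the same one lazy file objects rely on: serialize only the path and mode, and reopen the handle on demand in whichever worker receives the object.

```python
import os
import pickle
import tempfile


class LazyFile:
    """Hypothetical wrapper: stores only the path and opens on demand."""

    def __init__(self, path, mode="rb"):
        self.path = path
        self.mode = mode
        self._handle = None  # opened lazily, never pickled

    def open(self):
        if self._handle is None:
            self._handle = open(self.path, self.mode)
        return self._handle

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def __getstate__(self):
        # Drop the live handle so only path and mode are serialized.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state


def demo():
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
        tmp.write(b"elevation data")
        path = tmp.name
    try:
        # A live file handle cannot be pickled...
        with open(path, "rb") as raw:
            try:
                pickle.dumps(raw)
                raw_picklable = True
            except TypeError:
                raw_picklable = False

        # ...but the lazy wrapper can, even after it has been opened.
        lazy = LazyFile(path)
        lazy.open().read(1)
        clone = pickle.loads(pickle.dumps(lazy))
        data = clone.open().read()  # the clone reopens from the start
        lazy.close()
        clone.close()
        return raw_picklable, data
    finally:
        os.remove(path)
```

An object that holds such a wrapper instead of a raw handle can be sent to worker processes, provided every worker can reach the same path.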

On the Dask front, you could consider using rioxarray (https://corteva.github.io/rioxarray/html/rioxarray.html) to interface with rasterio. It provides an xarray container on top of rasterio that can leverage all of the xarray + Dask computational graph generation. Transitioning away from the current "windows" approach would likely take some effort, so you could perhaps keep it for multi-threaded processing within larger Dask chunks. This kind of approach can actually be beneficial, since it reduces the overall size of the Dask graph, which can become an overhead for very large datasets.
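The two-level idea (coarse chunks as scheduler tasks, fine windows handled by threads inside each chunk) can be sketched with the standard library alone. This is an illustration under stated assumptions, not Dask, rioxarray, or OCSMesh code: the grid is a plain list of lists, the per-window work is a simple minimum, and all function names are hypothetical. The plain loop over chunks stands in for what would be one graph task per chunk in a Dask-based version.

```python
from concurrent.futures import ThreadPoolExecutor


def split(length, size):
    """Yield (start, stop) pairs covering range(length) in steps of size."""
    for start in range(0, length, size):
        yield start, min(start + size, length)


def window_min(grid, rows, cols):
    """Stand-in for per-window work: minimum value inside one window."""
    (r0, r1), (c0, c1) = rows, cols
    return min(grid[r][c] for r in range(r0, r1) for c in range(c0, c1))


def chunk_min(grid, rows, cols, window, threads=4):
    """One coarse task: process all windows of a chunk with a thread pool."""
    (r0, r1), (c0, c1) = rows, cols
    windows = [
        ((r0 + a, r0 + b), (c0 + c, c0 + d))
        for a, b in split(r1 - r0, window)
        for c, d in split(c1 - c0, window)
    ]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return min(pool.map(lambda w: window_min(grid, *w), windows))


def grid_min(grid, chunk, window):
    """Return the global minimum and the number of coarse tasks scheduled."""
    nrows, ncols = len(grid), len(grid[0])
    chunks = [(r, c) for r in split(nrows, chunk) for c in split(ncols, chunk)]
    # In a Dask-based version each chunk would become one graph task.
    results = [chunk_min(grid, r, c, window) for r, c in chunks]
    return min(results), len(chunks)
```

For an 8x8 grid, `grid_min(grid, chunk=4, window=2)` schedules 4 coarse tasks that each cover 4 windows, whereas making every window its own task (`chunk=2`) would schedule 16. The same ratio is what keeps the task graph small on very large rasters.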