NOAA-OWP / ngen

Next Generation Water Modeling Engine and Framework Prototype
Other
83 stars 62 forks source link

Evaluating SCHISM BMI as a NextGen Formulation #547

Open jduckerOWP opened 1 year ago

jduckerOWP commented 1 year ago

Short description explaining the high-level reason for the new issue.

Steps to compile SCHISM model, BMI shared libraries, and BMI driver to test functionality (single thread)

  1. Please go ahead and download my latest forked version of SCHISM, which includes a BMI directory in the following pathway on GitHub: https://github.com/jduckerOWP/schism_NWM_BMI/tree/BMI_Branch
  2. Inside the forked version of my SCHISM branch above, please follow the instructions highlighted in the README.md file to compile the SCHISM mode, BMI shared libraries, and the SCHISM BMI driver.
  3. Once you have completed these steps, please go ahead and comment on this issue and I will provide a small SCHISM model setup to test the BMI driver for NextGen initiatives moving forward. We can go back and forth on requests needed by the model engine team to integrate the SCHISM BMI as a functioning model within the current NextGen model engine framework.
jduckerOWP commented 10 months ago

Updated documentation and Wiki pages have been created as a guide for compiling SCHISM source code and the BMI. Please refer to this moving forward SCHISM_BMI_wiki_pages

PhilMiller commented 10 months ago

Documenting where Jason and I have gotten with this:


I installed the necessary packages in a clean Ubuntu docker container

And linked

ln -s /usr/bin/python3 /usr/bin/python

because the SCHISM build system runspython in one or two spots to pre-process some things.


Enabled these variables in SCHISM to get behavior compatible with our intended BMI usage:


For library compilation to use -fPIC to appropriately produce code that can be loaded with dlopen(), rather than appending that to the CMAKE_{C,CXX,Fortran}_Flags variables (which produced build errors for me because of ; appearing in the resulting compilation command lines), I inserted

add_compile_options("-fPIC")

In the top-level SCHISM CMakeLists.txt


I posted one change to Jason's branch to replace literal directory paths with a CMake variable path pointing at the project source directory, https://github.com/jduckerOWP/schism_NWM_BMI/pull/1

PhilMiller commented 10 months ago

After clearing all of the above, Jason and I looked into the BMI implementation code and discussed initialization and coupling data flow. There's a challenge posed by the time integration scheme that SCHISM uses, in that it wants the atmospheric forcing values for a timestep ahead of its own execution. As initially written, the SCHISM BMI implementation takes input for variables with names Foo_t0 and Foo_t1 corresponding directly to its internal structure, and exposes those to the driver (eventually, the ngen model engine). My sense was that this would require an excess of special-case code in the model engine to properly feed SCHISM forcings data, so we instead decided to expose the relevant input variables without _t0/_t1 elaboration, and have the first step initialize both arrays. Jason is revising that now.

We also checked our mutual understanding of the coupling scheme between inland hydraulics and SCHISM. For now, we expect to have SCHISM strictly downstream of the coastal outflows from the inland hydraulic routing, with the routing unaffected by the water levels SCHISM predicts at the coastline. This will at least allow for a demonstration of the external data flows of the fully coupled system, with refinement to possible two-way coupling able to be developed later.

jduckerOWP commented 10 months ago

Hi Phil, I've gone ahead and finished all the required tasks on my end for you to move forward with the SCHISM BMI test case scenario as a NextGen formulation. The completed tasks are highlighted below:

  1. Revised the SCHISM BMI implementation to only advertise a single variable that will dynamically deal with internal SCHISM fields of "t0" and "t1". For the time being, the assumption here will be during the first hour of the model run, fields will remain statics (t0=t1) until NextGen model engine can advertise the following of hour of fields from other respective models. Revisions to the bmischism.f90 and the SCHISM BMI driver test (schism_BMI_driver_test.f90) were pushed into the development branch that you have previously compiled SCHISM_BMI_Branch
  2. A Lake Champlain BMI model setup (includes BMI configuration file) has been precompiled for your NextGen formulation test with the SCHISM model itself. I've zipped up the entire model setup and you can download it through my Google Driver here: SCHISM_Lake_Champlain_BMI_Model_Setup
  3. To support the offshore water levels for the SCHISM Lake Champlain model setup, we've had to develop a Python BMI application that will essentially advertise regridded water level fields to the boundary nodes of the SCHISM mesh. I've constructed this application essentially mimicking the ngen/extern/test_bmi_py directory within the ngen repository to create this required Python BMI to support realistic offshore water level fields for the SCHISM model. I've zipped up and uploaded the Python BMI application to the Google drive, which can be downloaded here for you: SCHISM_USGS_BMI_py_application

Once you've taken a look at all the products I've provided here, I would be happy to meet with you again at your convivence to game plan the NextGen formulation setup you intend to implement with the SCHISM Lake Champlain BMI model setup moving forward. Thanks again Phil for your work on this topic.

PhilMiller commented 9 months ago

Here are some notes on how the current code aligns with the conventions for BMI that ngen presently expects. These observations are not necessarily requests for or intentions to change, so much as discussion points.

https://github.com/NOAA-OWP/ngen/blob/master/doc/BMIconventions.md

Re "Time Control"

  1. The init config can specify a model_start_time other than 0, while the framework expects models to always treat that as 0. I expect we'll need to change this in the not-too-distant future, to align with time-stamped forcings, hot-starts, data assimilation, and so forth. Also, the variable isn't used for anything meaningful besides get_start_time
  2. The init config can specify model_end_time, which is also unused for anything meaningful except get_end_time
  3. There's both a dt from schism_glbl, and a time_step_size defined in the BMI interface. If SCHISM has an internal notion of its time step, it's probably preferable to maintain a single point of truth and just set/reference that
PhilMiller commented 9 months ago

Variable metadata: The set of variable covered by the various routines has some inconsistencies.

ETA2 ETA2_bnd RAINRATE SFCPRS SPFH2m TMP2m UU10m VV10m VX VY Q_bnd BEDLEVEL
var_grid y y y y y y y y y y y y
var_type y y y y y y y y y y y y
var_units y y y y y y y y y y y y
var_itemsize y y y y y y y y y y y y
var_nbytes
var_location y y y y y y y y y y y y
input_vars y y y y y y y y
set_double y y y y y y y y
get_double y y y y y
output_vars y y y y

The potentially problematic things I see in here are

  1. Missing var_type for ETA2_bnd and Q_bnd (implemented in https://github.com/jduckerOWP/schism_NWM_BMI/pull/4)

It's reasonable (for now) that only input variables are available to set, and only output variables are available to get. That may change down the line as we go to support serialization

PhilMiller commented 7 months ago

For the sake of guiding work on integrating SCHISM with NextGen (mostly stuff I expect to be doing), here's a broad list of things that I see needing to get done along the way

I'll open and link issues for some of these as I approach them.

PhilMiller commented 6 months ago

Based on the change in #712, SCHISM won't have to create any sub-communicator at all. It'll just get the handled passed by the framework, and use it directly, since it'll be in the form the MPI Fortran bindings expect.

PhilMiller commented 6 months ago

The SCHISM BMI code will eventually get merged to the schism-dev/schism-esmf. The blocking step is having our MPI code in the BMI code, so I'll prioritize that.

PhilMiller commented 5 months ago

After digging into the ESMF MPI_Finalize issues, I checked and SCHISM has a very similar issue - its parallel_finalize unconditionally calls MPI_Finalize. That's actually all it does, so we could just drop the call to parallel_finalize from the BMI finalize routine, but that would not necessarily be a durable solution.