Building Model_mod.F90 to glue NUOPC DART cap with DART software

Sumanshekhar17 commented 3 months ago

This module works with DART (Data Assimilation Research Testbed) and interfaces with the DART NUOPC cap. This issue documents the background and the approach for building this capability.

Sumanshekhar17 commented 3 months ago

Module Overview

The model_mod module is responsible for interfacing a specific model (such as MOM6) with DART. It includes routines to initialize the model, manage state variables, perform interpolation, and handle various operations required during data assimilation. Here, however, we are building it to interface with the DART NUOPC cap, which has access to the memory addresses of the model state variables.

Here is an example of the model_mod module for MOM6: https://github.com/DART-NUOPC/DART/blob/95795a4126a49a473b7fb458dd2579e3d861a8d3/models/MOM6/model_mod.f90#L11-L21C2

Key Subroutines and Functions

`subroutine static_init_model()`

Purpose: Performs one-time initialization of the model. This includes reading configuration files, setting up grid information, and preparing the model state structure.
Main Operations:

Reads the model configuration from input.nml.
Initializes time settings for data assimilation.
Verifies that state variables are correctly defined.
Reads grid information (horizontal and vertical) and ocean geometry (land vs. ocean).

`function get_model_size()`

Purpose: Returns the size of the model state vector. This function ensures the model is initialized and then retrieves the number of elements in the state vector.
Main Operation: Calls get_domain_size(dom_id) to get the size of the model's state vector.

`subroutine model_interpolate()`

Purpose: Interpolates state variables at a given location to estimate expected observation values. This is used during the assimilation process to compare model state with observations.
Main Operations:

Identifies which grid the requested quantity is on (e.g., U, V, or T grid).
Locates the 4 corners of the grid cell surrounding the observation point.
Performs bilinear interpolation on the grid to estimate the state at the observation location.
Handles cases where the observation is on land or outside the model domain.
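The bilinear step above can be sketched as follows. This is a minimal illustration for a single rectangular cell, not the DART implementation: the module name `bilinear_sketch`, the function `bilinear_interp`, and the corner ordering are all assumptions made for the example, and real ocean grid cells are generally irregular quadrilaterals that need more care.

```fortran
module bilinear_sketch
   implicit none
   private
   public :: r8, bilinear_interp

   ! Double precision kind, named r8 only for readability (assumption).
   integer, parameter :: r8 = selected_real_kind(12)

contains

   ! Bilinear interpolation inside one rectangular cell.
   ! Corner values are ordered counter-clockwise from the lower-left corner:
   !   q(1) at (x0,y0), q(2) at (x1,y0), q(3) at (x1,y1), q(4) at (x0,y1)
   pure function bilinear_interp(x0, x1, y0, y1, q, x, y) result(val)
      real(r8), intent(in) :: x0, x1, y0, y1
      real(r8), intent(in) :: q(4)
      real(r8), intent(in) :: x, y
      real(r8) :: val
      real(r8) :: s, t

      ! Fractional position of (x, y) within the cell, each in [0, 1].
      s = (x - x0) / (x1 - x0)
      t = (y - y0) / (y1 - y0)

      ! Weight each corner by the area of the opposite sub-rectangle.
      val = (1.0_r8 - s) * (1.0_r8 - t) * q(1) &
          +           s  * (1.0_r8 - t) * q(2) &
          +           s  *           t  * q(3) &
          + (1.0_r8 - s) *           t  * q(4)
   end function bilinear_interp

end module bilinear_sketch
```

At a corner the result equals that corner's value, and at the cell center it is the mean of the four corner values.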

`function shortest_time_between_assimilations()`

Purpose: Returns the shortest time increment the model can advance, which is used to set the assimilation window.
Main Operation: Simply returns the assimilation_time_step.

`subroutine get_state_meta_data()`

Purpose: Retrieves metadata (location and quantity type) for a given state vector index.
Main Operations:

Converts the state vector index to grid indices (longitude, latitude, and level).
Determines the geographical location (longitude, latitude) of the state.
Optionally returns the type of quantity (e.g., temperature, salinity) at that location.
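The index conversion in the first step can be sketched as below. This only shows the arithmetic for one variable stored in Fortran column-major order with shape (nx, ny, nz); the module name `index_sketch` and the routine `linear_to_ijk` are hypothetical, and in DART the conversion is handled by the state-structure library rather than written by hand.

```fortran
module index_sketch
   implicit none
   private
   public :: linear_to_ijk

contains

   ! Convert a 1-based linear offset within one variable into 1-based
   ! (i, j, k) indices, assuming column-major storage of shape (nx, ny, nz).
   pure subroutine linear_to_ijk(offset, nx, ny, i, j, k)
      integer, intent(in)  :: offset, nx, ny
      integer, intent(out) :: i, j, k
      integer :: rem

      rem = offset - 1            ! switch to 0-based arithmetic
      k   = rem / (nx * ny) + 1   ! whole horizontal slabs above this point
      rem = mod(rem, nx * ny)     ! position within the slab
      j   = rem / nx + 1          ! whole rows within the slab
      i   = mod(rem, nx) + 1      ! position within the row
   end subroutine linear_to_ijk

end module index_sketch
```

Once (i, j, k) are known, the longitude and latitude can be looked up in the grid arrays for whichever grid (T, U, or V) the variable lives on.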

`subroutine convert_vertical_state()`

Purpose: Converts vertical levels from the model's representation (e.g., layers) to physical heights.
Main Operations:

Retrieves the thickness of each model layer to compute the physical depth.
Converts the model’s vertical level to an actual height (in meters).
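As a rough sketch of the thickness-to-depth step, the depth of a layer's midpoint can be taken as the sum of the thicknesses above it plus half of its own thickness. The names `vertical_sketch` and `layer_mid_depth` are hypothetical, and the sign convention (depth positive downward, measured from the surface) is an assumption; the actual model_mod may report a signed height instead.

```fortran
module vertical_sketch
   implicit none
   private
   public :: r8, layer_mid_depth

   integer, parameter :: r8 = selected_real_kind(12)

contains

   ! Depth (positive downward, in the units of h) of the midpoint of layer k,
   ! given the thickness h of every layer in one water column.
   pure function layer_mid_depth(h, k) result(depth)
      real(r8), intent(in) :: h(:)
      integer,  intent(in) :: k
      real(r8) :: depth

      ! For k = 1 the slice h(1:0) is empty, so the sum is zero.
      depth = sum(h(1:k-1)) + 0.5_r8 * h(k)
   end function layer_mid_depth

end module vertical_sketch
```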

`subroutine get_close_obs()`

Purpose: Finds observations close to a given model state location. This is important for determining which observations to assimilate.
Main Operations:

Calls loc_get_close_obs, a utility that calculates distances between the model state and observations.
Applies additional logic to exclude observations that are on land or in inappropriate locations.

`subroutine get_close_state()`

Purpose: Similar to get_close_obs, but for finding model state locations close to a given observation location.
Main Operations:

Computes distances between a base location and state locations.
Excludes states that are on land or below the sea floor from being considered "close."

`subroutine end_model()`

Purpose: Handles any clean-up needed when the model run is finished. In this case, it’s a placeholder and does not perform any operations.

`subroutine nc_write_model_atts()`

Purpose: Writes additional model-specific attributes to a NetCDF file. This could include metadata like the model's source or version.
Main Operations:

Puts the NetCDF file into define mode.
Adds global attributes (e.g., model source) to the file.

`subroutine read_horizontal_grid()`

Purpose: Reads the longitude and latitude values for the model grids (T, U, V grids) from a static NetCDF file.
Main Operations:

Opens the NetCDF file and reads the grid information.
Wraps longitude values that fall outside the range [0, 360] back into that range.
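One way to express the longitude adjustment is with the `modulo` intrinsic, as in the sketch below. The module and function names are hypothetical. Note one difference from a simple "add or subtract 360" approach: `modulo` maps a value of exactly 360 to 0, so the result lies in [0, 360).

```fortran
module longitude_sketch
   implicit none
   private
   public :: r8, wrap_longitude

   integer, parameter :: r8 = selected_real_kind(12)

contains

   ! Map any longitude in degrees onto [0, 360).
   ! Elemental, so it can be applied to a whole grid array at once.
   elemental function wrap_longitude(lon) result(wrapped)
      real(r8), intent(in) :: lon
      real(r8) :: wrapped

      ! modulo (unlike mod) returns a result with the sign of the divisor,
      ! so negative longitudes come out positive.
      wrapped = modulo(lon, 360.0_r8)
   end function wrap_longitude

end module longitude_sketch
```

Because the function is elemental, a call such as `geolon = wrap_longitude(geolon)` would convert an entire 2D longitude array (here `geolon` is only an illustrative array name).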

`subroutine read_num_layers()`

Purpose: Reads the number of vertical layers in the model from a template NetCDF file.
Main Operation: Retrieves the size of the Layer dimension in the NetCDF file.

`subroutine read_ocean_geometry()`

Purpose: Reads static ocean geometry data (e.g., land/sea mask, basin depth) from a NetCDF file.
Main Operation: Allocates and reads the wet and basin_depth arrays, which indicate where the ocean is and how deep it is.

`function on_land_quad()`

Purpose: Checks whether any of the four corners of a grid cell is on land.
Main Operation: Returns .true. if any corner is on land; otherwise, returns .false..

`function on_land_point()`

Purpose: Checks if a specific point (grid cell) is on land.
Main Operation: Returns .true. if the point is on land; otherwise, returns .false..
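The two land checks can be sketched as below. This assumes the `wet` mask holds 1 for ocean cells and 0 for land cells, and it passes the mask as an argument so the example is self-contained; the module name `land_sketch` and the argument lists are assumptions and do not match the real routines, which use module-level data.

```fortran
module land_sketch
   implicit none
   private
   public :: r8, on_land_point, on_land_quad

   integer, parameter :: r8 = selected_real_kind(12)

contains

   ! True if grid cell (i, j) is land, assuming wet = 1 for ocean, 0 for land.
   pure function on_land_point(wet, i, j) result(on_land)
      real(r8), intent(in) :: wet(:,:)
      integer,  intent(in) :: i, j
      logical :: on_land

      on_land = wet(i, j) < 0.5_r8
   end function on_land_point

   ! True if ANY of the four corners of the enclosing quad is land.
   pure function on_land_quad(wet, ilon, ilat) result(on_land)
      real(r8), intent(in) :: wet(:,:)
      integer,  intent(in) :: ilon(4), ilat(4)
      logical :: on_land
      integer :: c

      on_land = .false.
      do c = 1, 4
         if (on_land_point(wet, ilon(c), ilat(c))) on_land = .true.
      end do
   end function on_land_quad

end module land_sketch
```

Rejecting a quad as soon as one corner is land keeps the interpolation from mixing ocean values with undefined land values.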

`function on_basin_edge()`

Purpose: Checks if a grid cell is near the edge of an ocean basin.
Main Operation: Compares the basin depth at each corner of a grid cell to the depth of the observation. If any point is deeper than the observation, the function returns .true..

`function below_sea_floor()`

Purpose: Checks if a given depth is below the sea floor.
Main Operation: Compares the provided depth with the basin depth at the grid cell and returns .true. if the depth is greater.

`subroutine setup_interpolation()`

Purpose: Initializes interpolation handles for T, U, and V grids, which will be used in interpolation calculations during assimilation.
Main Operations:

Calls init_quad_interp to set up interpolation handles for the T, U, and V grids.
Configures interpolation parameters, such as grid coordinates and whether the grid wraps around the poles.

`function get_interp_handle()`

Purpose: Returns the appropriate interpolation handle based on the quantity type (e.g., U, V, or T grid).
Main Operation: Determines which interpolation grid to use and returns the corresponding handle.

`subroutine verify_state_variables()`

Purpose: Ensures that the state variables specified in the namelist are valid and properly configured.
Main Operations:

Checks that all entries in the namelist are fully specified.
Verifies that each state variable corresponds to a valid quantity.
Ensures that the update mode (e.g., UPDATE, NO_COPY_BACK) is correctly specified.

`function read_model_time()`

Purpose: Reads the model's current time from a NetCDF file and converts it to the DART time format.
Main Operation: Reads the Time variable from the NetCDF file, converts it from the model's time base to DART's time base, and returns it as a time_type.

Sumanshekhar17 commented 3 months ago

DART interacts with NetCDF files primarily for:

Reading static model configuration data: Such as grid information and ocean geometry.
Reading and writing model state information: Such as the number of layers, model time, and possibly other state-related variables.
Writing model metadata: Such as global attributes related to the model source and configuration.

The nc_open_file_readonly and nc_get_variable calls are key functions where DART is interacting with NetCDF files to retrieve or store data, making them crucial points of access in the code for handling model state and configuration.

Sumanshekhar17 commented 2 months ago

Get the local domain size of each state variable. Note that localDe=0 selects the first local DE (decomposition element) on the calling PET, not processor ID 0:

```fortran
! retrieve the Fortran data pointer from the Field
call ESMF_FieldGet(field=field, localDe=0, farrayPtr=farray1, rc=rc)
if (rc /= ESMF_SUCCESS) call ESMF_Finalize(endflag=ESMF_END_ABORT)

! retrieve the Fortran data pointer from the Field, together with the
! computational, exclusive, and total bounds and counts
call ESMF_FieldGet(field=field, localDe=0, farrayPtr=farray1, &
    computationalLBound=compLBnd, computationalUBound=compUBnd, &
    exclusiveLBound=exclLBnd, exclusiveUBound=exclUBnd, &
    totalLBound=totalLBnd, totalUBound=totalUBnd, &
    computationalCount=comp_count, &
    exclusiveCount=excl_count, &
    totalCount=total_count, &
    rc=rc)
if (rc /= ESMF_SUCCESS) call ESMF_Finalize(endflag=ESMF_END_ABORT)
```

This gives the extent of the computational, exclusive, and total regions of the field held by local DE 0 on the calling PET.
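To turn the returned bounds into a number of elements, the per-dimension counts can be multiplied together, as in this minimal sketch. It is plain Fortran with no ESMF dependency; the module name `extent_sketch` and the function `local_size` are hypothetical, and whether the exclusive or the computational bounds should define the local piece of the DART state vector is a design choice left open here.

```fortran
module extent_sketch
   implicit none
   private
   public :: local_size

contains

   ! Number of elements in a local region described by inclusive
   ! lower and upper index bounds, one entry per dimension.
   pure function local_size(lbnd, ubnd) result(n)
      integer, intent(in) :: lbnd(:), ubnd(:)
      integer :: n

      ! A dimension with ubnd < lbnd is treated as empty (count of 0).
      n = product(max(ubnd - lbnd + 1, 0))
   end function local_size

end module extent_sketch
```

For example, bounds of (1, 1) to (10, 5) describe 50 elements. The count arrays returned by ESMF_FieldGet already hold the per-dimension values, so taking `product` of one of them would give the same total.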

Get the geographical location (longitude, latitude) of the state:
