dtcenter / METplus

Python scripting infrastructure for MET tools.
https://metplus.readthedocs.io
Apache License 2.0

New Use Case: StatAnalysis Python Embedding to read native grid (u-grid) #1561

Closed JohnHalleyGotway closed 1 year ago

JohnHalleyGotway commented 4 years ago

Describe the New Use Case

Note that this was originally a MET issue, with the plan of adding support directly to the C++ tools. However, as of 4/6/2022, the plan has changed to leveraging existing Python functionality for an initial implementation. Along with that, we'll need a new use case to demonstrate that functionality. In the long run, enhancements could be added directly to MET to support unstructured grids, but those would be described in a different issue.

Here's the original text of this issue for background:

This issue is the result of a meeting with Mark Miesch about JEDI on 1/23/2020.

Consider enhancing MET to leverage the JEDI C++ interface for reading data from LFRic, FV3, MPAS, Neptune, WRF, and SOCA. The geometry object in JEDI can instantiate a grid.

JEDI Source Code (not public): https://github.com/JCSDA/oops

State.h header file defines the State class. Each model instantiates a State in a different way, including grid information and parallel distribution information. The State object will retrieve a model value at a given location by calling getValues(). This returns a GeoVaLs object for the model values at the requested location(s). The "State::read(const eckit::Configuration &)" member function can read data from a model output file and populates the state object. The "State::geometry()" member function contains the grid information.

Dependencies: ecmwf/eckit, oops, and FV3 JEDI. oops depends on boost headers (not actually compiled). JEDI is built using ecbuild. ecmwf/eckit does all the MPI handling.

Some functional steps...

Here's a specific idea: (1) Work on Hera. (2) Enhance MET to support a new configurable option to interface with JEDI from FV3. (3) Specifically, enhance Point-Stat to call State::getValues() for each observation location and generate matched pairs. (4) Compute the resulting statistics.
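
Steps (3) and (4) can be sketched roughly as follows. This is only an illustration in plain Python, not the MET or JEDI API: `match_pairs`, `continuous_stats`, and `toy_get_values` are hypothetical names, and `toy_get_values` merely stands in for whatever call would return a model value at an observation location.

```python
import math


def match_pairs(observations, get_values):
    """Build (forecast, observation) pairs.

    observations: iterable of (lat, lon, obs_value) tuples.
    get_values:   callable (lat, lon) -> forecast value at that location.
                  Hypothetical stand-in for the model-state lookup.
    """
    return [(get_values(lat, lon), obs) for lat, lon, obs in observations]


def continuous_stats(pairs):
    """Compute a few continuous statistics from matched pairs."""
    if not pairs:
        raise ValueError("no matched pairs")
    errors = [fcst - obs for fcst, obs in pairs]
    n = len(errors)
    return {
        "N": n,
        "ME": sum(errors) / n,                       # mean error (bias)
        "MAE": sum(abs(e) for e in errors) / n,      # mean absolute error
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
    }


def toy_get_values(lat, lon):
    # Hypothetical "model": temperature depends only on latitude.
    return 280.0 + 0.5 * lat


# Made-up observations: (lat, lon, observed value)
obs = [(0.0, 0.0, 281.0), (10.0, 20.0, 284.0), (20.0, 40.0, 291.0)]
pairs = match_pairs(obs, toy_get_values)
stats = continuous_stats(pairs)
print(stats)
```

Passing the lookup in as a callable keeps the pairing and statistics code independent of where the forecast values come from, which is the sense in which "JEDI" would just be another input type.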

So really "JEDI" is a new input "file type"... from which you extract forecast values.

This is needed by September 30, 2020 (end of Q4).

Use Case Name and Category

Provide use case name, following Contributor's Guide naming template, and list which category the use case will reside in. If a new category is needed for this use case, provide its name and brief justification

Input Data

List input data types and sources. Provide a total input file size, keeping necessary data to a minimum.

Acceptance Testing

Describe tests required for new functionality. As use case develops, provide a run time here

Time Estimate

Estimate the amount of work required here. Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the new feature down into sub-issues.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

Labels

Projects and Milestone

Define Related Issue(s)

Consider the impact to the other METplus components.

New Use Case Checklist

See the METplus Workflow for details.

dwfncar commented 4 years ago

See dakota:/d3/projects/JEDI

JohnHalleyGotway commented 3 years ago

On 6/2/2021, @j-opatz @TaraJensen and @JohnHalleyGotway met with Mark Miesch to discuss MET interfacing with JEDI.

Could be worthwhile to attend a JEDI training academy. JEDI documentation: https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/

BUMP and SABER would be good libraries with which to interface. Deadline: we need the work done by the end of September. Links to recorded video lectures from prior academies: https://www.jcsda.org/jedi-academies

JEDI is really spread over many repositories... 8 or 9 of them. New release JEDI-FV3 version 1.0 and IODA 2.0 coming mid-June. IODA 2.0 does not yet fully support Met Office ODB but there's been a lot of progress. Getting closer. An internal repository, IODA-Converters, can read ODB files, but they'll eventually be readable directly.

Questions.

Recommend that we move development up to Cheyenne, run JEDI there, and demonstrate that we can interface with it there. The h-of-x application writes out h-of-x files. The h-of-x application is part of oops. It writes output to a file. MET could be enhanced to read matched pairs from those files. Eventually, MET could potentially read this data from memory rather than via a temporary file.

The Atlas package from ECMWF is being incorporated into JEDI. It can be used to create grids and grid meshes for subsetting tasks for MPI: https://github.com/ecmwf/atlas. Work is under way to get cubed-sphere incorporated into Atlas, but it is not yet fully integrated because it does not yet fully support interpolating between all grid types, masking, and adjoints.

mpm-meto commented 3 years ago

Some sample LFRic NetCDF files are attached for testing. This is the data originally sent in March 2021. I have asked if there is something more up to date. I do not know whether this has been tested with the JEDI libraries. Awaiting further info. sample_data.zip

mpm-meto commented 3 years ago

Further info/update wrt use of JEDI-BUMP from someone working on JEDI in Met Office.

The good news. There currently is a LFRic model interface that uses BUMP for interpolation to observation locations.

The bad news. 1) The API to BUMP keeps on changing! The owner of BUMP keeps applying large changes that can involve code changes on the order of 200 or more files, which makes it very hard to check that BUMP is doing what it is supposed to be doing and to maintain the interface. 2) As far as I know, unless it has changed, it assumes that all data is collocated horizontally and vertically. It will not deal with the vertical stagger. 3) There is no NetCDF reader of LFRic files within SABER/BUMP. Some of the other models do have this.

TaraJensen commented 3 years ago

Here's a new sample of LFRic - obtained in September. Philip Gill also says: On the LFRic grid side I’m attaching some updated canned metadata. Stuart Whitehouse is the best Met Office contact for this and again is happy to talk to your developers.

canned_metadata_lfric_r30711.zip

mpm-meto commented 2 years ago

Some clarification: the term "native" can refer to any kind of grid, whether an unstructured mesh (ugrid) or a structured one (lat-lon); it simply refers to the mesh that the model is integrated on. Often we do not verify our forecasts on the native (structured) grid, e.g. WMO CBS statistics are calculated on a 1.5 degree regular lat-lon grid. This is not the native grid for any of the global NWP models that submit/exchange statistics.

As far as this issue goes, I think it will soon be time to break it into sub-issues and treat this issue as the overarching one. I see at least two sub-issues (for now) addressing distinct tasks that can be completed independently of each other:

1) The ugrid-to-ugrid comparison: having both the model analyses and the forecasts on the same native grid (ugrid) and comparing the two. This does NOT require ANY interpolation. It does require the ability to read the ugrid NetCDF. These files are effectively vectors of lat, lon, value and are more like a station list for Point-Stat than anything we associate with a regular grid. Work is needed to see how far we can get by treating these ugrids like a very dense observing network, to understand the computational performance of following such a path, and to see what optimisations may be possible.

2) Internal interpolation of ugrid to a regular grid. Internalising the ability to regrid ugrid to another (any other?) regular grid is probably desirable. There are Python libraries (ESMF) that can do the regridding for you before you feed your fields into MET, so if fields are needed on a regular grid it is possible to do this before leveraging the stats in MET, i.e. it won't preclude the use of MET if an internal conversion is not possible in the short term. It may also be worth exploring whether this could work in a Python embedding sense, though the computational costs of e.g. stand-alone vs embedding would need to be explored alongside the value of keeping regridded fields on disk for other uses/users (as most downstream applications will be using ugrid data in a regridded format).

Either way, 1) above would appear to address the issue of reading in a ugrid file, which is the first hurdle to overcome, and then figuring out what to do with it within MET.
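
To make the "station list" view of a ugrid concrete, here is a minimal sketch that turns vectors of lat, lon, value into point-observation records. It assumes the 11-column layout that MET's point-observation Python embedding expects in a list named `point_data` (message type, station ID, valid time, lat, lon, elevation, variable name, level, height, QC string, value); that column order should be checked against the MET User's Guide before relying on it. The function name, the "UGRID" message type, the `CELL…` station IDs, and the sample values are all made up for illustration, and reading the vectors from the NetCDF file is not shown.

```python
from datetime import datetime


def ugrid_to_point_data(lats, lons, values, var_name, valid_time,
                        msg_type="UGRID", level=0.0, height=0.0):
    """Treat each ugrid cell as a pseudo-station.

    Returns a list of 11-element records. The column order is an
    assumption based on MET's point-observation Python embedding.
    """
    if not (len(lats) == len(lons) == len(values)):
        raise ValueError("lats, lons and values must have the same length")

    time_str = valid_time.strftime("%Y%m%d_%H%M%S")
    records = []
    for i, (lat, lon, val) in enumerate(zip(lats, lons, values)):
        records.append([
            msg_type,            # message type (hypothetical label)
            f"CELL{i:07d}",      # station ID derived from the cell index
            time_str,            # valid time, YYYYMMDD_HHMMSS
            float(lat),
            float(lon),
            0.0,                 # elevation (unknown here, set to 0)
            var_name,
            float(level),
            float(height),
            "NA",                # QC string
            float(val),
        ])
    return records


# Made-up sample vectors standing in for data read from a ugrid file.
lats = [10.0, 10.5, 11.0]
lons = [-5.0, -4.5, -4.0]
temps = [285.2, 284.9, 284.1]

point_data = ugrid_to_point_data(
    lats, lons, temps, "TMP", datetime(2021, 3, 1, 0, 0, 0)
)
print(len(point_data), point_data[0])
```

Because the analysis and forecast share the same mesh in the ugrid-to-ugrid case, the cell index doubles as a stable station ID, so pairing the two needs no interpolation.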

TaraJensen commented 2 years ago

ESMPy might be a possible solution for reading the u-grid.

TaraJensen commented 2 years ago

Marion sent scripts for vinterp and hinterp to Will and Tara. She's uncertain if they should be attached to the open GitHub issue. Will attach here if it is found worthwhile and cleared for doing so.