dtcenter / METplus

Python scripting infrastructure for MET tools.
https://metplus.readthedocs.io
Apache License 2.0

New Use Case: Verify Total Column Ozone against NASA's OMI dataset #1989

Closed AliciaBentley-NOAA closed 6 months ago

AliciaBentley-NOAA commented 2 years ago

Describe the New Use Case

Create a new use case to verify Total Column Ozone against NASA's OMI dataset. The OMI data are in HDF-EOS5 (.he5) format. Use python embedding to ingest these gridded observations and run GridStat to verify against GFS and/or GEFS model output.
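For orientation, a MET Python embedding script is expected to expose a 2D NumPy array named met_data and a metadata dictionary named attrs. The following is a minimal sketch of the attrs half only. The function name, field name, level string, units, and the 0.25-degree global grid numbers are all illustrative assumptions, not values taken from the eventual use case script.

```python
from datetime import datetime


def build_met_attrs(valid, name="TotalColumnOzone", units="DU"):
    """Return a metadata dictionary in the layout MET Python embedding reads.

    All defaults here are assumptions made for illustration.
    """
    valid_str = valid.strftime("%Y%m%d_%H%M%S")
    return {
        "valid": valid_str,
        "init": valid_str,      # observations: init time set equal to valid time
        "lead": "000000",       # HHMMSS
        "accum": "000000",      # HHMMSS
        "name": name,
        "long_name": "OMI total column ozone",
        "level": "Column",
        "units": units,
        "grid": {               # assumed 0.25-degree global lat/lon grid
            "type": "LatLon",
            "name": "OMI_0p25deg",
            "lat_ll": -89.875,
            "lon_ll": -179.875,
            "delta_lat": 0.25,
            "delta_lon": 0.25,
            "Nlat": 720,
            "Nlon": 1440,
        },
    }


attrs = build_met_attrs(datetime(2023, 12, 4, 12))
```

In a real script, met_data would be filled from the .he5 file (for example with h5py, if that package is approved on the target system) and must match the Nlat by Nlon shape declared in the grid.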

Use Case Name and Category

Provide use case name, following Contributor's Guide naming template, and list which category the use case will reside in. If a new category is needed for this use case, provide its name and brief justification

Input Data

Data on WCOSS2 are downloaded from NASA, so the data are the same.

(On WCOSS2) /lfs/h1/ops/dev/dcom/YYYYMMDD/validation_data/aq/omi/omiaura${YYYY}m${mmdd}_v883-${YYYY}m${mmdd}t${HHMMSS}.he5

(From NASA) https://omisips1.omisips.eosdis.nasa.gov/outgoing/OMTO3e/OMI-Aura_L3-OMTO3e_${YYYY}m${mmdd}_v883-${YYYY}m${mmdd}t${HHMMSS}.he5

For example: https://omisips1.omisips.eosdis.nasa.gov/outgoing/OMTO3e/OMI-Aura_L3-OMTO3e_2023m0720_v883-2023m0725t160312.he5
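The naming template above can be unpacked with a small parser. This is a sketch only: the function name and the returned keys are hypothetical, and reading the first date as the data date and the second timestamp as a production time is an assumption based on the template, not something stated in this issue.

```python
import re
from datetime import datetime

# Matches the NASA-style name, e.g.
# OMI-Aura_L3-OMTO3e_2023m0720_v883-2023m0725t160312.he5
OMI_NAME = re.compile(
    r"OMI-Aura_L3-OMTO3e_(?P<data>\d{4}m\d{4})_v(?P<version>\d+)"
    r"-(?P<stamp>\d{4}m\d{4}t\d{6})\.he5$"
)


def parse_omi_filename(path_or_url):
    """Split an OMTO3e file name into its date, version, and timestamp parts."""
    match = OMI_NAME.search(path_or_url)
    if match is None:
        raise ValueError(f"not an OMTO3e file name: {path_or_url}")
    return {
        # Assumed meaning: the day the gridded data are valid for.
        "data_date": datetime.strptime(match["data"], "%Ym%m%d").date(),
        "version": match["version"],
        # Assumed meaning: when the file was produced.
        "stamp": datetime.strptime(match["stamp"], "%Ym%m%dt%H%M%S"),
    }


info = parse_omi_filename(
    "https://omisips1.omisips.eosdis.nasa.gov/outgoing/OMTO3e/"
    "OMI-Aura_L3-OMTO3e_2023m0720_v883-2023m0725t160312.he5"
)
```

The WCOSS2 copies use a different prefix (omiaura...), so the pattern would need adjusting before being applied to those paths.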

A test of functionality could involve a small use case using the OMI dataset for Total Column Ozone. This parameter can be found in GFS and GEFS forecasts.

Note that the Python embedding solution must work on WCOSS2. The Python packages it uses must either already be approved to run there, or approval must be sought for any new one(s).

Acceptance Testing

Describe tests required for new functionality. As use case develops, provide a run time here

Time Estimate

Unsure. Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the new feature down into sub-issues.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

Labels

Projects and Milestone

Define Related Issue(s)

Consider the impact to the other METplus components.

New Use Case Checklist

See the METplus Workflow for details.

JohnHalleyGotway commented 2 years ago

Recommend considering this feature request in the larger context of the libraries upon which MET already depends.

As of version 11.0.0, MET depends on:

The lidar2nc tool requires the HDF4 library. The modis_regrid tool requires the very old hdfeos version 2 library.

Adding support for this HDF-EOS5 data will likely cause confusion/conflicts with the earlier HDF library versions used by these tools. Recommend that part of this work be to revisit these tools and consider:

malloryprow commented 11 months ago

I have a script started to read the data at /scratch1/NCEPDEV/global/Mallory.Row/VRFY/METplus_OMI/read_omi-aura_l3-omto3e.py. I tried basing it off of read_rtofs_smap_woa.py.

I haven't set up or tried running it with python embedding.

malloryprow commented 11 months ago

Should I try to get a use case set up with this?

georgemccabe commented 11 months ago

@malloryprow, yes, that would be great! Please let me know if you have any questions/issues getting the use case set up.

malloryprow commented 11 months ago

I got the basics of a use case working. There are some details to hammer out, but wanted to mention this first.

I'm running on Hera. When I use MET 12.0.0-beta1 I see this in my METplus log

DEBUG 1: Start grid_stat by Mallory.Row(20074) at 2023-12-13 17:07:49Z cmd: /contrib/met/12.0.0-beta1/bin/grid_stat -v 2 /scratch1/NCEPDEV/global/Mallory.Row/VRFY/METplus_OMI/testing/input/model_applications/medium_range/grid_to_grid/gfs/fcst/pgbf24.gfs.2023120312 PYTHON_NUMPY /scratch1/NCEPDEV/global/Mallory.Row/VRFY/METplus_OMI/METplus/parm/met_config/GridStatConfig_wrapped -outdir /scratch1/NCEPDEV/global/Mallory.Row/VRFY/METplus_OMI/testing/output/met_out/GFS/ozone/202312041200/grid_stat
DEBUG 2: OpenMP running on 1 thread(s).
DEBUG 1: Default Config File: /contrib/met/12.0.0-beta1/share/met/config/GridStatConfig_default
DEBUG 1: User Config File: /scratch1/NCEPDEV/global/Mallory.Row/VRFY/METplus_OMI/METplus/parm/met_config/GridStatConfig_wrapped
Traceback (most recent call last):
  File "/contrib/met/12.0.0-beta1/share/met/python/met/dataplane.py", line 5, in <module>
    import xarray as xr
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/xarray/__init__.py", line 1, in <module>
    from xarray import testing, tutorial
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/xarray/testing.py", line 7, in <module>
    import pandas as pd
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/__init__.py", line 48, in <module>
    from pandas.core.api import (
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/api.py", line 47, in <module>
    from pandas.core.groupby import (
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/groupby/__init__.py", line 1, in <module>
    from pandas.core.groupby.generic import (
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/groupby/generic.py", line 76, in <module>
    from pandas.core.frame import DataFrame
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 171, in <module>
    from pandas.core.generic import NDFrame
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/generic.py", line 169, in <module>
    from pandas.core.window import (
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/window/__init__.py", line 1, in <module>
    from pandas.core.window.ewm import (
  File "/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/core/window/ewm.py", line 15, in <module>
    import pandas._libs.window.aggregations as window_aggregations
ImportError: /apps/spack/linux-centos7-x86_64/gcc-4.8.5/gcc-9.2.0-wqdecm4rkyyhejagxwmnabt6lscgm45d/lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/lib/python3.10/site-packages/pandas/_libs/window/aggregations.cpython-310-x86_64-linux-gnu.so)
WARNING:
WARNING: straight_python_dataplane() -> an error occurred importing module "/scratch1/NCEPDEV/global/Mallory.Row/VRFY/METplus_OMI/METplus/parm/use_cases/model_applications/medium_range/GridStat_fcstGFS_obsOMI_TotalColumnOzone/read_omi-aura_l3-omto3e"
WARNING:
ERROR :
ERROR : Trouble reading data from observation file "PYTHON_NUMPY"
ERROR :

When I switch to using MET 11.1.0 I don't have this issue.
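The ImportError in the log says the libstdc++.so.6 found at run time (under the gcc-9.2.0 spack path) does not define GLIBCXX_3.4.29, which the pandas extension in the conda environment requires. One way to check which version tags a given library defines is sketched below, in the spirit of running strings on the library and filtering for GLIBCXX. The function names are hypothetical, and this is a diagnostic aid under those assumptions, not a fix.

```python
import re


def glibcxx_versions(blob):
    """Return the sorted GLIBCXX version tags found in raw library bytes."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", blob))
    return sorted(tuple(int(part) for part in tag.decode().split(".")) for tag in tags)


def provides_glibcxx(blob, required):
    """Report whether the library bytes define the required version tag."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(blob)


# Synthetic stand-in for the bytes of a libstdc++.so.6 file.
sample = b"\x00GLIBCXX_3.4\x00GLIBCXX_3.4.28\x00GLIBCXX_3.4.9\x00GLIBCXX_FORCE_NEW\x00"
found = provides_glibcxx(sample, "3.4.29")
```

Against a real system, the bytes would come from opening the libstdc++.so.6 path named in the error message in binary mode and reading it in full.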

malloryprow commented 11 months ago

I have the use case set up. I have been working on the documentation, but I'm not sure there is a way on Hera to check out the documentation changes I have made. Is there a python installation or conda environment I can load to do so?

jprestop commented 11 months ago

@malloryprow, we use the following conda environment on Hera for METplus:

/scratch1/BMC/dtc/miniconda/miniconda3/envs/metplus_v5.1_py3.10/bin

malloryprow commented 11 months ago

Thank you @jprestop!

georgemccabe commented 11 months ago

I don't know if there is a Python environment on Hera that has the required Sphinx packages to build the documentation, but you can commit your changes to a branch whose name starts with feature_ and push it to GitHub. That will automatically build the documentation on ReadTheDocs, which you can view at https://metplus.readthedocs.io/en/branch_name, replacing branch_name with the name of your branch in all lower-case.
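The branch-to-URL rule described here is simple enough to write down. A minimal sketch, with a hypothetical helper name, assuming the only transformation is lower-casing the branch name:

```python
def rtd_branch_url(branch, base="https://metplus.readthedocs.io/en"):
    """Build the ReadTheDocs URL for a branch by lower-casing its name."""
    return f"{base}/{branch.lower()}/"


url = rtd_branch_url("feature_1989_OMIUseCase")
```

For the branch feature_1989_OMIUseCase this gives https://metplus.readthedocs.io/en/feature_1989_omiusecase/.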

jprestop commented 11 months ago

Thanks @georgemccabe. I totally missed that she wanted to check the documentation and read only this: "Is there a python installation or conda environment I can load?" Apologies, Mallory.

Please follow George's instructions. :)

malloryprow commented 11 months ago

I did a fork of METplus for my work. Will it still work? (https://github.com/malloryprow/METplus/tree/feature_1989_OMIUseCase)

georgemccabe commented 11 months ago

Ah, it will only automatically generate the docs if you are using dtcenter/METplus. You may be able to turn it on by going to readthedocs.org, logging in with your GitHub credentials, then click Import a Project to add your forked repo. It may be easier to just push your changes to the dtcenter/METplus repo and let it run automatically.

malloryprow commented 11 months ago

Hmmmm, okay I'll try that! I may try to see if I can get the conda environment working on my workstation (I'm in the office today and was at home on Friday).

malloryprow commented 11 months ago

Ended up putting the branch in dtcenter/METplus (https://github.com/dtcenter/METplus/tree/feature_1989_OMIUseCase).

Documentation built successfully (https://metplus.readthedocs.io/en/feature_1989_omiusecase/).

malloryprow commented 11 months ago

@georgemccabe @jprestop I think someone will need to complete the "Adding new data to full sample data tarfile" step for me. I put GridStat_fcstGFS_obsOMI_TotalColumnOzone.tgz and feature_1989_OMIUseCase.bash on ftp.rap.ucar.edu in /incoming/irap/met_help/row_data.

georgemccabe commented 11 months ago

Hi @malloryprow, I can take care of that step for you.

malloryprow commented 11 months ago

Thanks! Let me know if you have any trouble. Once that is done, I think I will be good to submit a PR for this? That is the next section it looks like.

georgemccabe commented 11 months ago

The data is staged for the automated tests. It would be a good idea to make sure the use case runs without issue in the automated tests using the staged data. If you haven't already added the use case to the test suite, the instructions for that start here: https://metplus.readthedocs.io/en/latest/Contributors_Guide/add_use_case.html#add-use-case-to-the-test-suite

If you have already added those files, you can push any change to your branch in the dtcenter/METplus repo and it should kick off your use case in GitHub Actions. Once you have confirmed that the use case runs without any issues, you can create a pull request.

malloryprow commented 11 months ago

It looks like all is well?

https://github.com/dtcenter/METplus/actions/runs/7252874897

georgemccabe commented 11 months ago

Yes, looks good to me!

malloryprow commented 6 months ago

Reopening to add documentation for OMI to Verification Datasets Guide