dtcenter / MET

Model Evaluation Tools
https://dtcenter.org/community-code/model-evaluation-tools-met
Apache License 2.0

Add support for ECMWF BUFR data using external tables. #926

Open dwfncar opened 6 years ago

dwfncar commented 6 years ago

Based upon meetings with ECMWF staff, the ECMWF BUFR tables are external to the BUFR data files. This task is to enhance MET as follows:
(1) Enable pb2nc to read ECMWF BUFR files.
(2) Determine which BUFR table is specified for this data.
(3) Check if the specified BUFR table is included with the BUFR file.
(4) If so, use it.
(5) If not, look for the MET_BUFR_TABLES environment variable which specifies a directory containing BUFR tables.
(6) Read the specified BUFR table and use it to interpret the data.
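
The lookup order in steps (3) to (6) could be sketched roughly as follows. This is only an illustration in Python, not MET code: the function name `resolve_bufr_table`, its parameters, and the wording of the error messages are hypothetical, and the sketch assumes the table is identified by a plain file name inside the MET_BUFR_TABLES directory.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_bufr_table(table_name: str, embedded_table: Optional[bytes] = None) -> bytes:
    """Return the raw contents of the BUFR table needed to decode a file.

    Hypothetical helper: `embedded_table` stands in for a table that was
    found inside the BUFR file itself (None when the file carries no table).
    """
    # Steps (3) and (4): if the table ships with the data, use it.
    if embedded_table is not None:
        return embedded_table

    # Step (5): otherwise look in the directory named by MET_BUFR_TABLES.
    table_dir = os.environ.get("MET_BUFR_TABLES")
    if not table_dir:
        raise FileNotFoundError(
            f"BUFR table '{table_name}' is not included with the data and "
            "MET_BUFR_TABLES is not set. Set it to a directory of BUFR tables."
        )

    # Step (6): read the external table so the caller can interpret the data.
    table_path = Path(table_dir) / table_name
    if not table_path.is_file():
        raise FileNotFoundError(
            f"BUFR table '{table_name}' is not included with the data and "
            f"was not found in MET_BUFR_TABLES directory '{table_dir}'."
        )
    return table_path.read_bytes()
```

Raising one descriptive exception for each failure case keeps the "table missing" condition distinct from a generic read error.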


The ECMWF BUFR tables are available here:
   https://software.ecmwf.int/wiki/display/BUFR/BUFRDC+Home


That website contains a tarball of BUFR tables used by ECMWF along with many other institutions around the world. Rather than redistributing the BUFRDC tables ourselves, just point users to where they can download them.


Also, suggest enhancing pb2nc so that, if the BUFR table is not included with the data and an external table can't be located in the MET_BUFR_TABLES directory, it prints a useful error message. [MET-926] created by johnhg

Charge Key: 2799991

dwfncar commented 6 years ago

Moving down to major because I think it can wait until June 2018. (by jensen)

TaraJensen commented 5 years ago

Charge 2799991

JohnHalleyGotway commented 3 years ago

Moved to 10.1 since this is not needed by UK Met Office.

hsoh-u commented 1 year ago

This can be done using python embedding. There is a python package for this: PyBufrKit & Documentation.

John-Sharples commented 1 year ago

@hsoh-u do you have an example of how to implement python embedding for pb2nc? Specifically, what object is it expecting the python script to create? e.g. python embedding for grid_stat needs a numpy array named met_data plus an attrs dict. What does pb2nc expect?

Apologies if this is already in the docs somewhere. I can't seem to find it.

hsoh-u commented 1 year ago

Unlike ascii2nc, pb2nc does not support python embedding. pb2nc was implemented with APIs from the PREPBUFR library. A python embedding script for ECMWF BUFR would need to do in python what pb2nc does for PREPBUFR. Two options are available for passing MET point observation data from python.

  1. Make CSV output with 'typ', 'sid', 'vld', 'lat', 'lon', 'elv', 'var', 'lvl', 'hgt', 'qc', 'obs' columns (string columns: 'typ', 'sid', 'vld', 'var', 'qc') and call ["MET_BASE/python/read_ascii_point.py"](https://github.com/dtcenter/MET/blob/main_v11.0/scripts/python/read_ascii_point.py)
    • read_ascii_point.py supports white-space separated values
    • create your own python script from read_ascii_point.py to support comma separated values
  2. Build a python dictionary named "met_point_data"
    • 1 dimensional arrays for lat, lon, elv, msg_type_idx, station_id_idx, valid_time_idx
    • 1 dimensional arrays for header_id, variable_id_idx, height, vlevel, qc_idx, obs_value
    • 1 dimensional string arrays for msg_types, station_ids, valid_times, QC_strings, variable_names
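
As a rough illustration of the two options, here is a minimal sketch that starts from already-decoded observations. It is not taken from MET. The key names are copied verbatim from the list above and may not match what MET actually expects. The sketch also makes three assumptions: 'lvl' maps to vlevel and 'hgt' maps to height, the first six arrays hold one entry per unique header while the rest hold one entry per observation, and plain Python lists stand in for the 1 dimensional arrays. The helper names `to_ascii_lines`, `to_met_point_data`, and `_index` are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

# One observation, in the column order of option 1:
# typ, sid, vld, lat, lon, elv, var, lvl, hgt, qc, obs
Obs = Tuple[str, str, str, float, float, float, str, float, float, str, float]


def to_ascii_lines(observations: Sequence[Obs]) -> List[str]:
    """Option 1: one white-space separated line per observation.

    Assumes no string field contains white space.
    """
    return [" ".join(str(value) for value in ob) for ob in observations]


def _index(values: Sequence[Hashable]) -> Tuple[list, List[int]]:
    """Return (unique values in first-seen order, index of each input value)."""
    table: Dict[Hashable, int] = {}
    indices = [table.setdefault(v, len(table)) for v in values]
    return list(table), indices


def to_met_point_data(observations: Sequence[Obs]) -> dict:
    """Option 2: build the dictionary from the same observations."""
    # A header is the unique (typ, sid, vld, lat, lon, elv) combination.
    headers, header_id = _index([ob[:6] for ob in observations])
    msg_types, msg_type_idx = _index([h[0] for h in headers])
    station_ids, station_id_idx = _index([h[1] for h in headers])
    valid_times, valid_time_idx = _index([h[2] for h in headers])
    variable_names, variable_id_idx = _index([ob[6] for ob in observations])
    qc_strings, qc_idx = _index([ob[9] for ob in observations])
    return {
        # Per-header arrays
        "lat": [h[3] for h in headers],
        "lon": [h[4] for h in headers],
        "elv": [h[5] for h in headers],
        "msg_type_idx": msg_type_idx,
        "station_id_idx": station_id_idx,
        "valid_time_idx": valid_time_idx,
        # Per-observation arrays
        "header_id": header_id,
        "variable_id_idx": variable_id_idx,
        "vlevel": [ob[7] for ob in observations],   # assumed to be 'lvl'
        "height": [ob[8] for ob in observations],   # assumed to be 'hgt'
        "qc_idx": qc_idx,
        "obs_value": [ob[10] for ob in observations],
        # String lookup tables
        "msg_types": msg_types,
        "station_ids": station_ids,
        "valid_times": valid_times,
        "QC_strings": qc_strings,
        "variable_names": variable_names,
    }
```

A script following option 2 would end with something like `met_point_data = to_met_point_data(observations)`, where `observations` is whatever the BUFR decoding step produced.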

We are working to simplify method 2.