wmo-im / monitoring


Monitor correctness of messages #17

Closed kaiwirt closed 2 years ago

kaiwirt commented 3 years ago

The monitoring centers need to check the correctness of messages in terms of file format, coding, and quality.

kaiwirt commented 3 years ago

One way to assess quality is to look at the difference between the raw data and NWP data.
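A minimal sketch of such a comparison, assuming the observed values and the matching NWP values are already paired up as plain numbers. The function name `departure_flags` and the threshold are hypothetical; a real threshold would have to be chosen per variable.

```python
def departure_flags(observed, background, threshold):
    """Compare observations with NWP values.

    Returns one (departure, is_suspect) tuple per pair, where departure is
    observed minus background. Pairs with a missing value (None) yield
    (None, False) because no comparison is possible.
    """
    results = []
    for obs, bkg in zip(observed, background):
        if obs is None or bkg is None:
            results.append((None, False))
            continue
        departure = obs - bkg
        # Hypothetical rule: flag the value when it is far from the NWP value.
        results.append((departure, abs(departure) > threshold))
    return results
```

For example, `departure_flags([1000.0, 990.0], [1001.0, 1000.0], 5.0)` flags only the second pair.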

kaiwirt commented 3 years ago

Comment from @EgawaTakumu: I think it's important to check the data format, but it is sometimes difficult to keep up with new BUFR master table versions. We are using a Java library, but it seems to have some bugs.

czanna commented 3 years ago

Within the scope of the TAC-to-BUFR migration, one of the checks we tried to automate was comparing the station metadata available in BUFR messages against the OSCAR/Surface metadata (assuming the catalogue is the more reliable source in case of large differences).
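A rough sketch of what such a comparison could look like, assuming the station position and height have already been extracted from the BUFR message and from OSCAR/Surface into plain dicts. The dict keys (`lat`, `lon`, `height`), the function names and the tolerances are all hypothetical, not the actual implementation.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def compare_station(bufr_meta, oscar_meta, max_km=1.0, max_height_m=10.0):
    """Return a list of differences between BUFR and OSCAR/Surface metadata.

    Both arguments are dicts with 'lat', 'lon' (degrees) and 'height' (m).
    The tolerances are illustrative defaults only.
    """
    issues = []
    dist = haversine_km(bufr_meta["lat"], bufr_meta["lon"],
                        oscar_meta["lat"], oscar_meta["lon"])
    if dist > max_km:
        issues.append(f"position differs by {dist:.1f} km")
    dheight = abs(bufr_meta["height"] - oscar_meta["height"])
    if dheight > max_height_m:
        issues.append(f"height differs by {dheight:.1f} m")
    return issues
```

An empty list means the two sources agree within the tolerances; anything else would be reported, with the OSCAR/Surface value treated as the reference.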

czanna commented 3 years ago

Another set of events we try to monitor relates to changes in BUFR section 1/2/3 keys (for a given observation type or GTS AHL), e.g. tracking occurrences of the ecCodes keys masterTablesVersionNumber, internationalDataSubCategory and unexpandedDescriptors. These events can be associated with new data and/or preprocessing issues.
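To illustrate the idea, here is a small sketch that remembers which values have been seen per AHL and key, and reports a change event when a new value shows up. It assumes the key values have already been read from each message into a dict; the class `KeyChangeTracker` and its interface are hypothetical.

```python
from collections import defaultdict

# ecCodes key names mentioned above.
TRACKED_KEYS = (
    "masterTablesVersionNumber",
    "internationalDataSubCategory",
    "unexpandedDescriptors",
)


class KeyChangeTracker:
    """Track the values seen for each (AHL, key) pair."""

    def __init__(self):
        self._seen = defaultdict(set)

    def update(self, ahl, header):
        """Record the header of one message and return change events.

        An event (ahl, key, value) is emitted when a value appears that
        was not seen before for this AHL. The first message for an AHL
        only sets the baseline and produces no events.
        """
        events = []
        for key in TRACKED_KEYS:
            if key not in header:
                continue
            value = header[key]
            if isinstance(value, list):
                value = tuple(value)  # lists are not hashable
            seen = self._seen[(ahl, key)]
            if value not in seen:
                if seen:
                    events.append((ahl, key, value))
                seen.add(value)
        return events
```

Whether an event means new data or a preprocessing issue still has to be decided by a person; the sketch only detects that something changed.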

EgawaTakumu commented 3 years ago

Sometimes, even if we can read the masterTablesVersionNumber or internationalDataSubCategory, we may fail to dump the message. For example, we can use the bufr_ls command to get an overview of the BUFR data in the GISC-Tokyo 24h cache below (which will be deleted soon), but we cannot dump it with bufr_dump. Since the dump failed and the Satellite_identifier could not be obtained, the file is classified in the directory named "Satellite (Empty_or_Invalid)".

https://www.wis-jma.go.jp/d/o/DEMS/BUFR/Satellite(Empty_or_Invalid)/Upper_air/20210313/030100/A_IUCN45DEMS130301_C_RJTD_20210313034008_28.bufr

In my opinion, the format of BUFR files is complicated, so it is difficult to determine immediately whether the problem is with the BUFR decoder or with the file.
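As a sketch of how the two stages could be separated in a monitoring script: the function below takes two callables, one that reads only the header keys and one that decodes the data section, and records which stage failed. The function name and the status labels are hypothetical.

```python
def classify_message(read_header, decode_data):
    """Return (status, detail) for one BUFR message.

    read_header: callable that reads only the header keys.
    decode_data: callable that fully decodes the data section.
    Both are expected to raise an exception on failure.
    """
    try:
        read_header()
    except Exception as exc:
        return "header_unreadable", str(exc)
    try:
        decode_data()
    except Exception as exc:
        # Header is fine but the data section is not: the bufr_ls works,
        # bufr_dump fails situation described above.
        return "data_undecodable", str(exc)
    return "ok", ""
```

With the ecCodes Python bindings, `read_header` could for example wrap `codes_get` on a header key, and `decode_data` could wrap `codes_set(handle, "unpack", 1)`. This only tells us at which stage decoding failed, not whether the decoder or the file is at fault.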

kaiwirt commented 3 years ago

It should also be checked whether WSI data uses the correct template and also includes the TSI (WMO station number or missing value). An additional check: if the TSI is present, then the WSI should be in the range 0-2XXXX-
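A possible sketch of the second check, assuming the WSI is available as a string of four dash-separated blocks and the TSI is `None` when it is missing. It reads each X in `0-2XXXX-` as one digit and only tests that prefix; the function name and the messages are hypothetical.

```python
import re

# Assumption: a WSI is four dash-separated blocks, the first three numeric.
WSI_PATTERN = re.compile(r"^\d+-\d+-\d+-[A-Za-z0-9]+$")
# Assumption: each X in "0-2XXXX-" stands for one digit.
TSI_WSI_PREFIX = re.compile(r"^0-2\d{4}-")


def check_wsi_tsi(wsi, tsi):
    """Return a list of problems found for one WSI/TSI pair.

    wsi: WSI as a string, e.g. "0-20000-0-10393".
    tsi: TSI as a string, or None when it is missing.
    """
    issues = []
    if not WSI_PATTERN.match(wsi):
        issues.append("WSI is not four dash-separated blocks")
    if tsi is not None and not TSI_WSI_PREFIX.match(wsi):
        issues.append("TSI present but WSI does not start with 0-2XXXX-")
    return issues
```

Checking that the message uses the correct template is a separate step and is not covered here.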

kurt-hectic commented 3 years ago

"Correctness" depends on the type of message. We need to define it per type.

kaiwirt commented 2 years ago

Closed. We are taking a different approach: sensor centers prepare statistics. If they cannot read the data, it will be missing from the statistics and will therefore be reported as missing by the monitoring center.