Closed kaiwirt closed 2 years ago
An assessment of quality can be based on the difference between the raw data and NWP data.
Comment from @EgawaTakumu: I think it's important to check the data format, but it's sometimes difficult to keep up with new BUFR master table versions. We are using a Java library, but it seems to have some bugs.
Within the scope of the TAC to BUFR migration, one of the checks we tried to automate was comparing the station metadata available in BUFR messages against the OSCAR/Surface metadata (assuming the catalogue is the more reliable source when the differences are large).
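A minimal sketch of such a comparison, assuming the station metadata has already been extracted from the BUFR message and from OSCAR/Surface into plain dictionaries. The field names, tolerances and sample values are hypothetical and only illustrate the idea; they are not the check actually used in the migration.

```python
from math import isclose

# Hypothetical tolerances (degrees, degrees, metres); real thresholds
# would have to be agreed per network or observation type.
TOLERANCES = {"latitude": 0.01, "longitude": 0.01, "elevation": 10.0}


def compare_station_metadata(bufr_meta, oscar_meta, tolerances=TOLERANCES):
    """Return {field: (bufr_value, oscar_value)} for fields that disagree."""
    differences = {}
    for field, tolerance in tolerances.items():
        bufr_value = bufr_meta.get(field)
        oscar_value = oscar_meta.get(field)
        if bufr_value is None or oscar_value is None:
            # A value missing on only one side counts as a difference.
            if bufr_value != oscar_value:
                differences[field] = (bufr_value, oscar_value)
            continue
        if not isclose(bufr_value, oscar_value, abs_tol=tolerance):
            differences[field] = (bufr_value, oscar_value)
    return differences


# Made-up values for one station.
bufr_meta = {"latitude": 35.69, "longitude": 139.75, "elevation": 25.0}
oscar_meta = {"latitude": 35.692, "longitude": 139.75, "elevation": 6.0}
differences = compare_station_metadata(bufr_meta, oscar_meta)
print(differences)
```

Reporting both values rather than a boolean leaves the decision about which source to trust to the monitoring centre.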
Another set of events we try to monitor relates to changes in the BUFR section 1/2/3 keys (for a given observation type or GTS AHL), e.g. tracking occurrences of the masterTablesVersionNumber, internationalDataSubCategory and unexpandedDescriptors ecCodes keys. These events can be associated with new data and/or preprocessing issues.
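The tracking described above could be sketched roughly as follows. This assumes the header values have already been read (for example with ecCodes) into a dictionary per message; the `HeaderChangeTracker` class, the AHL string and the sample values are hypothetical.

```python
TRACKED_KEYS = (
    "masterTablesVersionNumber",
    "internationalDataSubCategory",
    "unexpandedDescriptors",
)


def _freeze(value):
    """Turn list-like values into tuples so they compare as plain values."""
    if hasattr(value, "__iter__") and not isinstance(value, (str, bytes)):
        return tuple(value)
    return value


class HeaderChangeTracker:
    """Remember the last header values seen per AHL and report changes."""

    def __init__(self, tracked_keys=TRACKED_KEYS):
        self.tracked_keys = tracked_keys
        self.last_seen = {}  # AHL -> {key: value}

    def update(self, ahl, header):
        """Record a header; return [(key, old, new)] for keys that changed."""
        current = {key: _freeze(header.get(key)) for key in self.tracked_keys}
        previous = self.last_seen.get(ahl)
        self.last_seen[ahl] = current
        if previous is None:
            return []  # first message for this AHL, nothing to compare
        return [
            (key, previous[key], current[key])
            for key in self.tracked_keys
            if previous[key] != current[key]
        ]


# Made-up header values for two consecutive messages under one AHL.
tracker = HeaderChangeTracker()
first = tracker.update("IUCN45 DEMS", {
    "masterTablesVersionNumber": 13,
    "internationalDataSubCategory": 4,
    "unexpandedDescriptors": [309052],
})
changes = tracker.update("IUCN45 DEMS", {
    "masterTablesVersionNumber": 35,
    "internationalDataSubCategory": 4,
    "unexpandedDescriptors": [309052],
})
print(changes)
```

In this sketch the first message seen for an AHL produces no event; a monitoring centre may prefer to log that case as well, since it can indicate new data.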
Sometimes, even if we can read the masterTablesVersionNumber or internationalDataSubCategory, we may fail to dump the message. For example, we can use the bufr_ls command to get an overview of the BUFR data in the GISC-Tokyo 24h cache below (which will be deleted soon), but we cannot dump it with bufr_dump. Since the dump failed and the Satellite_identifier could not be obtained, the file is classified in the directory named "Satellite (Empty_or_Invalid)". https://www.wis-jma.go.jp/d/o/DEMS/BUFR/Satellite(Empty_or_Invalid)/Upper_air/20210313/030100/A_IUCN45DEMS130301_C_RJTD_20210313034008_28.bufr In my opinion, the BUFR format is complicated, so it is difficult to determine immediately whether the problem lies with the BUFR decoder or with the file.
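One decoder-independent check that may help separate the two cases is to verify the message envelope directly from the bytes: the "BUFR" start marker, the total length in octets 5-7 and the edition number in octet 8 of section 0, and the closing "7777". The sketch below does only that; the function name is hypothetical, treating editions other than 3 and 4 as suspicious is an assumption, and passing this check says nothing about the descriptors or the data section.

```python
def check_bufr_envelope(data: bytes):
    """Return a list of structural problems in the first BUFR message found."""
    start = data.find(b"BUFR")  # skips any GTS header in front of the message
    if start < 0:
        return ["no 'BUFR' start marker found"]
    if len(data) < start + 8:
        return ["data ends inside section 0"]

    # Section 0: octets 5-7 hold the total message length, octet 8 the edition.
    declared_length = int.from_bytes(data[start + 4:start + 7], "big")
    edition = data[start + 7]

    problems = []
    if edition not in (3, 4):  # assumption: only editions 3 and 4 are expected
        problems.append(f"unexpected edition number {edition}")

    end = start + declared_length
    if declared_length < 12:  # section 0 (8 octets) plus section 5 (4 octets)
        problems.append(f"declared length {declared_length} is too small")
    elif end > len(data):
        problems.append(
            f"declared length {declared_length} exceeds the "
            f"{len(data) - start} bytes available"
        )
    elif data[end - 4:end] != b"7777":
        problems.append("no '7777' end marker at the declared length")
    return problems


# Synthetic envelope only (not a decodable message): length 12, edition 4.
envelope = b"BUFR" + (12).to_bytes(3, "big") + bytes([4]) + b"7777"
print(check_bufr_envelope(envelope))
print(check_bufr_envelope(envelope[:-2]))
```

If the envelope is already inconsistent, the file is the more likely culprit; if it is consistent and the dump still fails, the descriptors, the tables or the decoder itself remain candidates.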
It should also be checked whether WSI data uses the correct template and also includes the TSI (WMO station number or missing value). An additional check: if the TSI is present, then the WSI should be in the range 0-2XXXX-
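A rough sketch of the WSI/TSI consistency rule, assuming the WSI is available as a string of the form series-issuer-issue number-local identifier and the TSI as a WMO block and station number, with `None` standing in for a missing value. The function name and the sample identifiers are hypothetical, and "0-2XXXX-" is read here as series 0 with a five-digit issuer starting with 2.

```python
import re

# series - issuer of identifier - issue number - local identifier
WSI_PATTERN = re.compile(r"^(\d+)-(\d+)-(\d+)-([A-Za-z0-9]{1,16})$")


def check_wsi_tsi(wsi, block_number=None, station_number=None):
    """Return a list of problems for a WSI and an optional TSI."""
    match = WSI_PATTERN.match(wsi or "")
    if match is None:
        return ["WSI is not of the form series-issuer-issue-local"]
    series, issuer, _issue, _local = match.groups()

    problems = []
    if series != "0":
        problems.append(f"unexpected WSI series {series}")

    # None stands in for the BUFR missing value (an assumption of this sketch).
    tsi_present = block_number is not None and station_number is not None
    if tsi_present and not re.fullmatch(r"2\d{4}", issuer):
        problems.append(
            f"TSI is present but WSI issuer {issuer} is not in the 2XXXX range"
        )
    return problems


# Made-up identifiers.
print(check_wsi_tsi("0-20000-0-47662", block_number=47, station_number=662))
print(check_wsi_tsi("0-392-0-ABC123", block_number=47, station_number=662))
```

Whether the template itself is the correct one for the message type is a separate check that this sketch does not cover.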
"Correctness" depends on the type of message, so it needs to be defined per type.
Closed. A different approach is that the sensor centers prepare statistics. If they cannot read the data, it will be missing from the statistics and will thus be reported as missing by the monitoring center.
The monitoring centers need to check the correctness of messages in terms of file format, coding, quality.