JCSDA-internal / ioda-converters

Various converters for getting obs data in and out of IODA
9 stars 2 forks source link

generic gnssro ncepbufr2ioda converter #1408

Closed hailingz closed 9 months ago

hailingz commented 10 months ago

Description

This is a revised version of the NCEP GNSSRO Bufr2ioda converter after discussion with @neill-b

  1. To make the Bufr2ioda converter more generic, this PR adds two variables, "qualityFlags" and "timeOffset", to metadata.

    • qualityFlags is an integer adding up the 16-digit QC flag reported in the Bufr data. details are referred to the ROM SAF's documentation. It contains a new of flags, such as bending angle non-nominal, refractivity non-monial, descending flag, L2C GPS signal, etc, for sanity checks.
    • timeOffset is the time increment relative to the start time of the occultation in second. That can be used when time interpolation is involved.
  2. Meantime, this PR keeps some basic sanity checks (such as lat/lon range) and removes all others, and removes the obs error specification as well. Both the removed sanity checks and obs error assignment can be done via UFO filters.

  3. "second" of observation time was set to zero before and now is read from file. "dateTime" is consistent with Met Office's ODB data now.

  4. README file in src/gnssro was updated. Simple descriptions were provided for the FIVE GNSSRO converters in this folder. Please edit for the python one if you want @BenjaminRuston

To test, one has to build the ioda-bundle with this branch: feature/gnssro_metoffice_generic

The ctest is: ctest -R gnssro_bufr_generic_conv

Rather you can look at (ncdump) the output file in test/testoutput/gnssro_obs_3prof_generic_2022090100.nc4

NOTE: Neill and I compared the obs accounts for each QC categories by missions. The attached tables compare those numbers among the ODB data (LEFT column), the NC file (MIDDLE) created by running gnssro_bufr2ioda_generic.x and the file (RIGHT) created by running tgnssro_bufr2ioda.x The new generic converters generated identical observations with the ODB file!

On the other hand, "gnssro_bufr2ioda.x", which applies more sanity checks and does not use second report in consistence with GSI, can be probably kept in this repository.

Screenshot 2023-10-30 at 3 41 13 PM Screenshot 2023-10-30 at 3 41 03 PM

Issue(s) addressed

Resolves #1409

Dependencies

N/A

Impact

N/A

Checklist

hailingz commented 10 months ago

"process_center (dataProviderOrigin)" and "occulting_sat_is (instrumentIdentifier)" are not listed in ioda/share//ioda/yaml/validation/ObsSpace.yaml That is why they could not be upgraded with "ioda-upgrade-v2-to-v3.x"

I do not think the old version of GNSSRO Fortran converter is used much so probably no changes need to be done in ObsSpace.yaml. Though a few of ctests still rely on the IODA upgrader but definitely neither "dataProviderOrigin" nor "instrumentIdentifier" is used in any of those tests.

I just make notes here @srherbener

neill-b commented 10 months ago

"process_center (dataProviderOrigin)" and "occulting_sat_is (instrumentIdentifier)" are not listed in ioda/share//ioda/yaml/validation/ObsSpace.yaml That is why they could not be upgraded with "ioda-upgrade-v2-to-v3.x"

I think the naming-conventions sprint looked hard for old variable names in ufo test reference files. So, perhaps process_center and occulting_sat_is were missed because they were in a repo which wasn't looked at. This might mean that there are other files present with unusual variable names.

neill-b commented 9 months ago

Hi @huishao-r - removing the sanity checks was something that I requested. In the ODB reader (as far as possible) we avoid checks, and try and put all the QC into the ufo filters. So, for comparison we'd like the converter to behave in a similar way. The easiest way to achieve this is to create a separate file and remove the additional QC checks from that code. The additional checks could be put as optional with some extra work from @hailingz, but it would make our life easier to avoid that work.

srherbener commented 9 months ago

"process_center (dataProviderOrigin)" and "occulting_sat_is (instrumentIdentifier)" are not listed in ioda/share//ioda/yaml/validation/ObsSpace.yaml That is why they could not be upgraded with "ioda-upgrade-v2-to-v3.x"

I do not think the old version of GNSSRO Fortran converter is used much so probably no changes need to be done in ObsSpace.yaml. Though a few of ctests still rely on the IODA upgrader but definitely neither "dataProviderOrigin" nor "instrumentIdentifier" is used in any of those tests.

I just make notes here @srherbener

@hailingz thank you for the notes. I think in the long run the ObsSpace.yaml file should be updated so that the files can be properly validated. I think can be done in a subsequent PR however so it would be good to create a ZenHub issue for this. Thanks!

huishao-r commented 9 months ago

@neill-b @srherbener Thanks for the comments. This comes into an ongoing discussion on how to reduce the use of obsSpace and memory. Doing it in the converter is not optimal. But I don't know where to put them yet, to be honest.

hailingz commented 9 months ago

oceed with the two converters until the sanity checks, etc. have been ported to UFO filters, then remove the old converter? Would t

Hi @srherbener Thank you for your comments. I agree with you that we can go with two converters now and look for merging them to one once we have more ideas. The main issue is that different centers read/write data differently in terms of the checks. Another thing in this PR is that it reads "SECOND" from the bufr report while the old one following GSI ignores the SECOND by setting it to zero in converter. It might not be scientifically important but can be center dependent.

@huishao-r We may want to talk it?

BenjaminRuston commented 9 months ago

thanks @hailingz and @neill-b we should choose which is the default (NOAA?) and then add a switch to do the changes in your steps 1. and 2. above, the fix to seconds to the dateTime should be kept regardless

but have one routine that can output either style ODB or NOAA

also good time to add a usage clause, so call without arguments and it spits out usage clause rather than having to read the code.. I can help with this.

hailingz commented 9 months ago

a usage clause

Hello @BenjaminRuston Thank you so much for your very useful comments. I have just pushed a couple commits which added (1) "a usage clause", and (2) optional argument so one can choose apply more sanity checks or not. Also descriptions are updated in the top of the codes.
Could you please take a look?

PatNichols commented 9 months ago

@hailingz is it possible to combine the generic and other into one file? In some sense it looks like we are repeating ourselves. thanks

hailingz commented 9 months ago

@hailingz is it possible to combine the generic and other into one file? In some sense it looks like we are repeating ourselves. thanks

Yes @PatNichols The old one was removed and the generic one is renamed to gnssro_bufr2ioda.f90 README is updated.

hailingz commented 9 months ago

Thank you @BenjaminRuston for the "ready to merge" label. Please let me know if anything is still needed so we can get this merged soon. @PatNichols

BenjaminRuston commented 9 months ago

@hailingz I tried this converter the resulting file is much smaller than before these changes

is this expected looks like the gsi refractivity and bending angle errors were removed

BenjaminRuston commented 9 months ago

@hailingz here results through skylab for the new file versus what we have archived using the previous version of the decoder

image

if this is expected or you're happy let's merge this in

hailingz commented 9 months ago

Thank you @BenjaminRuston for the plots. The new plot should be identical to the old one. I have added another check in the "addChecks". Could you please check the results again? Thank you so much!