Closed PraveenKumar-NOAA closed 4 months ago
@PraveenKumar-NOAA The Netcdf output file size is a little bit large (2Mb), is there a way you could subsample the input BUFR file to reduce the size of the output? Thanks.
@fcvdb @PatNichols I fixed the file sizes and also ran the unit test, thanks.
$ ctest -R test_iodaconv_bufr_ncep_rtma_prepbufr_adpupa Test project /scratch1/NCEPDEV/da/Praveen.Singh/JEDI/Development-Tests/ioda-bundle_prepbufr/build Start 654: test_iodaconv_bufr_ncep_rtma_prepbufr_adpupa 1/1 Test #654: test_iodaconv_bufr_ncep_rtma_prepbufr_adpupa ... Passed 0.99 sec
100% tests passed, 0 tests failed out of 1
Label Time Summary: iodaconv = 0.99 secproc (1 test) script = 0.99 secproc (1 test)
Total Test time (real) = 1.13 sec
@CoryMartin-NOAA and @emilyhcliu and @PraveenKumar-NOAA this appears to have pressure
in the MetaData
as expected
this is what I would call normal
I tried decoding a prepBUFR directly and this segfaults, but it runs fine when I use the ADPUPA from a split_by_subset
from a prepBUFR. Assume this is expected behavior?
>>> f['MetaData'].keys()
<KeysViewHDF5 ['dateTime', 'height', 'latitude', 'longitude', 'pressure', 'stationElevation', 'stationIdentification']>
and quick dump of some information from a file from a split_by_subset
from a file like prepbufr.gdas.20220216.t00z.nr.48h
this are available via the NCAR RDA.
file: 20220215T210000Z_PT6H_raob_nesdis_ADPUPA_prepbufr.nc4
range(/MetaData/dateTime): [2022-02-15 20:31:48, 2022-02-16 02:54:21]
range(/MetaData/height): [-260.0, 36534.0]
range(/MetaData/latitude): [-90.0, 82.89002227783203]
range(/MetaData/longitude): [0.0, 359.99365234375]
range(/MetaData/pressure): [310.0, 104200.0]
range(/MetaData/stationElevation): [-22.0, 4508.0]
range(/MetaData/stationIdentification): [02527, ZVQEQCM]
range(/ObsError/airTemperature): [0.800000011920929, 1.5]
range(/ObsError/pressure): [110.0, 120.00000762939453]
range(/ObsError/relativeHumidity): [2.0, 2.0]
range(/ObsError/windEastward): [1.399999976158142, 3.200000047683716]
range(/ObsValue/airTemperature): [185.84999084472656, 309.3500061035156]
range(/ObsValue/dewPointTemperature): [163.25, 300.1499938964844]
range(/ObsValue/specificHumidity): [0.0, 0.022273000329732895]
range(/ObsValue/virtualTemperature): [185.84999084472656, 309.3500061035156]
range(/ObsValue/windEastward): [-77.19999694824219, 100.30000305175781]
range(/ObsValue/windNorthward): [-119.5, 133.6999969482422]
range(/QualityMarker/airTemperature): [1, 15]
range(/QualityMarker/height): [1, 15]
range(/QualityMarker/pressure): [1, 15]
range(/QualityMarker/specificHumidity): [2, 15]
range(/QualityMarker/windEastward): [2, 15]
noted that the reference time is also a requirement to change in the yaml
was able to plot the resulting airTemperature
from the IODA observation file:
@BenjaminRuston @PatNichols is there anything I need to take care for this PR?
@PraveenKumar-NOAA It's failing the relevant ctest after merging you last one. Looking into it.
@PraveenKumar-NOAA , @CoryMartin-NOAA and @rmclaren not sure what happened with this PR after:
https://github.com/JCSDA-internal/ioda-converters/pull/1454
was merged. Please let us know if you see anything
@PraveenKumar-NOAA I did have to resolve some conflicts in the branch after the merge or PR1454 very well may have not done that correctly
@PraveenKumar-NOAA @BenjaminRuston it's failing test_iodaconv_bufr_ncep_rtma_prepbufr_adpupa Lots of error that have the form (zero or near zero ) != fill value for float
can someone also explain to me what 3D RTMA is besides regional?
why would a separate decoder be needed versus a standard ADPUPA from split_by_subset
from a prepBUFR file?
The RTMA is our analysis of record. It is used for verification, not for forecasting. However, I don't know why they have ADPUPA specific things compared to RRFS.
658 | 128/334 Test #128: test_iodaconv_bufr_ncep_rtma_prepbufr_adpupa ....................................***Failed 0.36 sec
@PatNichols I don't know why it is failing. Let us talk tomorrow, if you are available.
@PraveenKumar-NOAA These are the errors:PRE-MAIN-INFO BufrParser: Parsing file testinput/rtma_ADPUPA.prepbufr PRE-MAIN-INFO Executing Queries PRE-MAIN-INFO Building Bufr Data PRE-MAIN-INFO Exporting Data PRE-MAIN-INFO Finished [0.039s] DIFFER : VARIABLE : airTemperature : POSITION : [139] : VALUES : 0 <> 3.40282e+38 DIFFER : VARIABLE : airTemperature : POSITION : [140] : VALUES : 0 <> 3.40282e+38 DIFFER : VARIABLE : airTemperature : POSITION : [141] : VALUES : 0 <> 3.40282e+38 DIFFER : VARIABLE : airTemperature : POSITION : [142] : VALUES : 1.80768e-43 <> 3.40282e+38 DIFFER : VARIABLE : airTemperature : POSITION : [143] : VALUES : 0 <> 3.40282e+38 DIFFER : VARIABLE : airTemperature : POSITION : [144] : VALUES : 1.68156e-44 <> 3.40282e+38
This may be bug somewhere else since it looks like missing values are stored as zero instead of the missing value defined.
@PatNichols I might need direct access to your CI servers as I have not been able to replicate this issue anywhere else, Tried HERA, ORION, local mac, etc....
there's still something odd here. There are 3 moisture variables in the file, and then a 4th moisture variable has an error specified without any values:
Variable ObsValue/dewPointTemperature
Variable ObsValue/virtualTemperature
Variable ObsValue/specificHumidity
...
Variable ObsError/relativeHumidity
@PatNichols I might need direct access to your CI servers as I have not been able to replicate this issue anywhere else, Tried HERA, ORION, local mac, etc....
@rmclaren Do you have access to AWS instances?
@PatNichols Unfortunately no.
@PraveenKumar-NOAA I have run the ctests for your PR on a ubuntu AWS instance and I see the same failure so it looks like it is real failure. When I get some breathing room, I will try it on my mac to check further.
@PatNichols thanks, I did a few more checks which include tests using the global prepbufr yaml, original (large version) and trimmed BUFR file sizes. For all of these cases, test_iodaconv_bufr_ncep_rtma_prepbufr_adpupa passed on Hera. It will be better if you also check using your MAC.
@PraveenKumar-NOAA Checked on my mac and it failed. This is with spack-stack-1.6.0 and an intel chip.
@PatNichols thanks, let us talk tomorrow if you are available.
@PraveenKumar-NOAA Are you using the gnu compiler on Hera?
@PraveenKumar-NOAA Are you using the gnu compiler on Hera?
@PatNichols intel compiler.
@PraveenKumar-NOAA Are you using the gnu compiler on Hera?
@PatNichols intel compiler.
It's the gnu and clang compilers that are failing.
Wondering if the fix in PR# 1467 might help with this problem ?? Or is it unrelated...
Wondering if the fix in PR# 1467 might help with this problem ?? Or is it unrelated...
@PraveenKumar-NOAA @rmclaren Look likes that PR has fixed this PR too!
@PraveenKumar-NOAA and @CoryMartin-NOAA and @ADCollard
there's something odd here and didn't initially notice... there are 3 moisture obsValues
and then a 4th moisture value in the obsError
this seems very strange:
Variable ObsValue/dewPointTemperature
Variable ObsValue/virtualTemperature
Variable ObsValue/specificHumidity
...
Variable ObsError/relativeHumidity
@PatNichols @PatNichols all checks have passed now. I will use the smaller file
ObsError/relativeHumidity
@BenjaminRuston thanks. I removed ObsError/relativeHumidity from YAML. But I think it is OK to keep rest of the three moisture ObsValue variables, provide more flexibility to downstream users.
OK,,, yes that makes it a bit clearer as the use will select their moisture variable and then will be responsible for defining the error.
is their any reason this would be different from the ADPUPA file coming from the split_by_subset
of a prepBUFR file... we likely don't want too many redundant pathways to decode this file... and there is the python version of a routine to do this:
it makes sense to me somewhat to have a python API version and a direct yaml version but seems like they should be very close and produce very similar IODA output files.
And to the point. @CoryMartin-NOAA , @ADCollard and @PraveenKumar-NOAA are you opposed to us making any changes to this to allow this converter pathway to produce a file usable by a GDAS proxy app for radiosonde
@BenjaminRuston what sort of changes would you have in mind (I'll be honest I'm only tangentially involved in this converter work)
would like to have a consistent set of ObsValues
for the ADPUPA
file from prepBUFR, or even from the [3D]-RTMA APUPA.
We also would like to have a first set of ObsErrors
currently when we create a radiosonde
file we typically have these values:
range(/MetaData/dateTime): [2024-01-10 02:45:00, 2024-01-10 08:57:24]
range(/MetaData/height): [6.0, 32400.0]
range(/MetaData/latitude): [-9.671710014343262, 71.40470886230469]
range(/MetaData/longitude): [2.13782000541687, 358.20001220703125]
range(/MetaData/pressure): [490.0, 103000.0]
range(/MetaData/releaseTime): [2024-01-10 03:00:00, 2024-01-10 08:00:00]
range(/MetaData/stationElevation): [1.0, 1494.0]
range(/MetaData/stationIdentification): [01001, 97980]
range(/MetaData/variable_names): [air_temperature, virtual_temperature]
range(/ObsError/airTemperature): [0.800000011920929, 1.5]
range(/ObsError/specificHumidity): [1.6827051467771525e-06, 0.0034307106398046017]
range(/ObsError/virtualTemperature): [--, --]
range(/ObsError/windEastward): [1.399999976158142, 3.200000047683716]
range(/ObsError/windNorthward): [1.399999976158142, 3.200000047683716]
range(/ObsType/airTemperature): [120, 221]
range(/ObsType/specificHumidity): [120, 221]
range(/ObsType/virtualTemperature): [120, 221]
range(/ObsType/windEastward): [120, 221]
range(/ObsType/windNorthward): [120, 221]
range(/ObsValue/airTemperature): [186.4499969482422, 289.16314697265625]
range(/ObsValue/specificHumidity): [9.999999974752427e-07, 0.004261000081896782]
range(/ObsValue/windEastward): [-26.200000762939453, 77.0]
range(/ObsValue/windNorthward): [-45.900001525878906, 35.900001525878906]
range(/PreQC/airTemperature): [2, 3]
range(/PreQC/specificHumidity): [2, 115]
range(/PreQC/windEastward): [2, 114]
range(/PreQC/windNorthward): [2, 114]
a notable MetaData
record that is not in this PR is:
/MetaData/releaseTime
also for ObsError
there is a value for windEastward
but not a corresponding value in this PR for:
/ObsError/windNorthward
also we typically use PreQC
as a group while this PR seems to be using QualityMarker
:
/QualityMarker/airTemperature
/QualityMarker/height
/QualityMarker/pressure
/QualityMarker/specificHumidity
/QualityMarker/windEastward
@PraveenKumar-NOAA who is currently using this output file?
@CoryMartin-NOAA that is what I'm thinking... should we either fix these now, or in a subsequent PR with someone on the NOAA side confirming the output file is still something that they find useful and carries what they want
also would prefer fewer moisture variables, specificHumidity
being the top choice which is absent from this PR
@PraveenKumar-NOAA ,,,, OK,,, one last item talked this over with group and there are still some error, or questions rather about the groups and keys in the file.
If you go to the CDASH output for this PR, and look at the iodaconv_validate_testoutput_gdas.t00z.rtma_adpupa.prepbufr.nc
which is here:
https://cdash.jcsda.org/test/6248795
You will find that the output file has these entries that do not follow the current conventions:
ERROR: Variable 'MetaData/obsTimeMinusCycleTime' is not listed in the YAML conventions file.
ERROR: Variable 'MetaData/prepbufrReportType' is not listed in the YAML conventions file.
ERROR: Variable 'MetaData/dumpReportType' is not listed in the YAML conventions file.
ERROR: Variable 'MetaData/prepbufrDataLevelCategory' is not listed in the YAML conventions file.
ERROR: Variable 'QualityMarker/wind' is not listed in the YAML conventions file.
ERROR: Variable 'ObsError/wind' is not listed in the YAML conventions file.
so now understand this may be a specific file used for validation and verification and not for assimilation. Further, you will likely be creating a decoder yaml for the RDAS app separately. As @CoryMartin-NOAA mentioned yesterday we would like what is in here to be generic.
In short, if we want to keep this would make these recommendation:
MetaData/obsTimeMinusCycleTime
could be converted to /MetaData/releaseTime
QualityMarker/wind
and ObsError/wind
could be replicated into ObsError
and QualityMarker
for both windEastward
and windNorthward
so those 2 entries into 4 entriesReportType
s and the DataLevelCategory
again for this PR into ioda-converterif this does move this too far away from usefulness we may want to just close this. This is as this year we do want to enforce the validation standard which will involve cleaning up all the violations currently existing. And we would like to not add additional ctests which add to the technical debt (increase the number of validation errors).
Does it sound OK to clean these for now, and again we can augment as needed, potentially using existing keys such as MetaData/instrumentIdentifier
and others for items if needed.
@PraveenKumar-NOAA ,,,, OK,,, one last item talked this over with group and there are still some error, or questions rather about the groups and keys in the file.
If you go to the CDASH output for this PR, and look at the
iodaconv_validate_testoutput_gdas.t00z.rtma_adpupa.prepbufr.nc
which is here:https://cdash.jcsda.org/test/6248795
You will find that the output file has these entries that do not follow the current conventions:
ERROR: Variable 'MetaData/obsTimeMinusCycleTime' is not listed in the YAML conventions file. ERROR: Variable 'MetaData/prepbufrReportType' is not listed in the YAML conventions file. ERROR: Variable 'MetaData/dumpReportType' is not listed in the YAML conventions file. ERROR: Variable 'MetaData/prepbufrDataLevelCategory' is not listed in the YAML conventions file. ERROR: Variable 'QualityMarker/wind' is not listed in the YAML conventions file. ERROR: Variable 'ObsError/wind' is not listed in the YAML conventions file.
so now understand this may be a specific file used for validation and verification and not for assimilation. Further, you will likely be creating a decoder yaml for the RDAS app separately. As @CoryMartin-NOAA mentioned yesterday we would like what is in here to be generic.
In short, if we want to keep this would make these recommendation:
- believe the
MetaData/obsTimeMinusCycleTime
could be converted to/MetaData/releaseTime
- the
QualityMarker/wind
andObsError/wind
could be replicated intoObsError
andQualityMarker
for bothwindEastward
andwindNorthward
so those 2 entries into 4 entries- for now remove the 2
ReportType
s and theDataLevelCategory
again for this PR into ioda-converterif this does move this too far away from usefulness we may want to just close this. This is as this year we do want to enforce the validation standard which will involve cleaning up all the violations currently existing. And we would like to not add additional ctests which add to the technical debt (increase the number of validation errors).
Does it sound OK to clean these for now, and again we can augment as needed, potentially using existing keys such as
MetaData/instrumentIdentifier
and others for items if needed.
@BenjaminRuston thanks for your comments. I consider all of your suggestions and applied all of them, except MetaData/obsTimeMinusCycleTime has been removed from the YAML. The validation tests results are:
Final results: errors: 0 warnings: 0
awesome thanks Praveen, see that all the validate tests ran 2x am going to retrigger CI but believe this is in good shape many thanks
Description
This PR adds an ioda-converters YAML for the [3D]-RTMA ADPUPA prepbufr data.
The following files are new/ modified:
The ioda-validation test is performed and following results were obtained:
Final results: errors: 0 warnings: 0
The validation of the output ioda obs is also performed in: https://github.com/JCSDA-internal/ioda-converters/issues/1451
Issue(s) addressed
Resolves #<1450> partially.
Dependencies
None
Impact
None
Checklist