Open JingCheng-NOAA opened 5 months ago
The file contains observations: airTemperature, seaSurfaceTemperature, specificHumidity, stationElevation, stationPressure, virtualTemperature, windEastward, windNorthward.
When cropped the data to fit within the model domain, the datatype of 'stationIdentification' @ MetaData group tend out to be 'O'. Not sure if this the correct, because when use 'ncdump' to check the IODA file, it listed as type of 'string'. So manually convert this variable from 'object' into 'string' and write out as NC file. Though the available observation number reduced about 1/3, but the file size increased 3 times due to this variable type change.
First attempt run with the cropped data encountered an error that looking for 'MetaData/stationElevation'. This should be something wrong with JEDI code since the observation NC file does not contain 'MetaData/stationElevation'. 'stationElevation' exists in group 'obsType' and 'obsValue'.
Encountered error in analysis by complaining missing variable "MetaData/stationElevation". The current bufr2ioday.py creates "MetaData/heightOfStation". The python script is modified to change this variable name to "MetaData/stationElevation" so JEDI can recognize the variable. The analysis manage to run through after the change.
Three variables are simulated [stationPressure, airTemperature, specificHumidity]
Though hofx0 and OMBG looks reasonable, the increment is all 0 and it seems like all the observations are marked as missing Value according to the log file. However, the latest test removing all the observation filter still have the same result. None of the observations are used in the analysis.
The 0 increment is due to lack of observation error information. Use gdas issue#82 as a reference to modify the YAML file by adding in and inflating the observation errors.
Start a single observation experiment to test on the observation error for the temperature (obstype: 180)
- filter: Perform Action
filter variables:
- name: airTemperature
where:
- variable: ObsType/airTemperature
is_in: 180
action:
name: assign error
error parameter: 2.5
defer to post: true
ObsValue: 300.4 hofx: 296.3624 omb: 4.037627 oman: 3.947423
The bufr2ioda.py will generate a file with QualityMarker
group of variables. However, it's been tested to prove that to use ObsFunction/ObsErrorFactorConventional
for error inflation, PreQC
is the only variable name that being accepted by ObsErrorFactorConventional function. Thus, the bufr2ioda.py or yaml need to be modified to make sure change the group QualityMarker
into PreQC
Praveen suggest to add following in the YAML file for this QualityMarker
and PreQC
issue temparorly.
obs prior filters:
# Create PreQC group variable
- filter: Variable Assignment
assignments:
- name: PreQC/stationPressure
type: int
source variable: QualityMarker/stationPressure
After several test, the version of obs prior filters
works for the SFCSHP data is following block (and yes, JEDI is looking for MetaData/height
if this prior filter is turned on ):
obs prior filters:
- filter: Variable Assignment
assignments:
- name: PreQC/airTemperature
type: int
source variable: QualityMarker/airTemperature
- name: MetaData/height
type: float
source variable: MetaData/stationElevation
A comparison with GSI showed difference in hofx is ~0.29, 0.1%
Observation error is also off for jedi
Hi @JingCheng-NOAA, I think you are almost there. What obs operator in JEDI is used for this data type? And how is the interpolation done in GSI (e.g., 3dintrp)?
Hi @JingCheng-NOAA, I think you are almost there. What obs operator in JEDI is used for this data type? And how is the interpolation done in GSI (e.g., 3dintrp)?
Hi @delippi , Thanks for checking up.
In JEDI, I use the vertInterp
just as you did for the Mesonet obs.
vertical coordinate: air_pressure
observation vertical coordinate: pressure
observation vertical coordinate group: MetaData
interpolation method: log-linear
in GSI as I checked in the setupt.f90, I believe it is using the tintrp2al for this type (type:180)
Hi @JingCheng-NOAA, sorry for the slow response here. If you are using tintrp2, then I think you should be using the Identity operator in JEDI. The Identity operator itself is creating hofx's which are exactly the same thing as geovals (geovals are the 2D interpolated model values to ob locations). Recall that I only modified the vertInterp to be able to use the GSI-geovals and I wasn't sure how to implement that in the Identity operator. What you could try doing just to be sure is just hard code some of the GSI-geovals into the JEDI-Identity operator.
Make sure you're using the Identity, but I think you will still expect similar differences like you have shown above unless you hard code the GSI-geovals into the Identity operator for this single ob test.
@delippi , I noticed in your demonstration document, you have following section in your vertInterp operator:
filename: "filtered_sfc_tsen_geoval_2022052619.nc4"
levels_are_top_down: False
Is this the part you mentioned that can use the GSI-geovals in JEDI-vertInterp? Sounds like this part requires hard code in JEDI, correct? Can I see how you did it?
@JingCheng-NOAA, I tried putting the changes into a branch on ufo, but it seems I'm not able to do that. You could look under my demo directory that I had made for Hera:
/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2_demo/RDASApp.20240620/sorc/ufo
to see the modifications that I made.
I followed what Dan did here for gnssro: https://github.com/JCSDA-internal/ufo/compare/feature/feed_gsi_geovals_to_gnssro?expand=1
It wasn't clear to me how this could be done for the Identity operator though. I'm not sure if you will be able to follow what has been done for these two operators and apply it to the Identity operator. My suggestion is to, in a single temperature ob test, put some print statments in the JEDI code RDASApp/sorc/ufo/src/ufo/operators/identity/
and maybe elsewhere to see if you can print out the temperature geovals coming from JEDI. Then you could just hardcode the GSI-geoval for that temperature ob and it should give one-to-one match. If you can show that then you can be certain that you have everything related to the operator correct and the differences are due to differences in GSI vs JEDI geovals (i.e., 4 vs 3 grid point interpolation for geovals).
@JingCheng-NOAA, I tried putting the changes into a branch on ufo, but it seems I'm not able to do that. You could look under my demo directory that I had made for Hera:
/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2_demo/RDASApp.20240620/sorc/ufo
to see the modifications that I made.I followed what Dan did here for gnssro: https://github.com/JCSDA-internal/ufo/compare/feature/feed_gsi_geovals_to_gnssro?expand=1
It wasn't clear to me how this could be done for the Identity operator though. I'm not sure if you will be able to follow what has been done for these two operators and apply it to the Identity operator. My suggestion is to, in a single temperature ob test, put some print statments in the JEDI code
RDASApp/sorc/ufo/src/ufo/operators/identity/
and maybe elsewhere to see if you can print out the temperature geovals coming from JEDI. Then you could just hardcode the GSI-geoval for that temperature ob and it should give one-to-one match. If you can show that then you can be certain that you have everything related to the operator correct and the differences are due to differences in GSI vs JEDI geovals (i.e., 4 vs 3 grid point interpolation for geovals).
Thank you! I will give it a try.
@JingCheng-NOAA : another alternative and easier approach might be to change the location of the OBS to somewhere small-scale feature may be smaller, e.g., the interior of Atlantic or the center of Gulf of Mexican. At these locations the impact due to the different JEDI/GSI horizontal interpolations would be smaller, so the hofx from JEDI and GSI could be closer.
Switched the observation operator from VertInterp
to Identity
as suggested by @delippi. Before modifying JEDI code to add in the GeoVal, I did launched a test to see if there is any change in the hofx with Identity
obs operator.
It seems like both obs operators are calculating the same hofx.
Another test is made with @HuiLiu-NOAA 's suggestion to move the single observation from near land to the center of Gulf of Mexican. In this test, observation operator is VertInterp
The result seems better, the hofx created by JEDI are similar to what GSI created.
More work need to be done on the observation error, though
@JingCheng-NOAA : nice test. I guess next step is to figure out why the final OBS error is changed. May be helpful to print out the initial OBS error too.
@JingCheng-NOAA : nice test. I guess next step is to figure out why the final OBS error is changed. May be helpful to print out the initial OBS error too.
Thank you! I will start from checking the initial OBS error first.
The initial error assigned in JEDI yaml file was 2.5. Changed it into the same number 0.9729 that GSI was using (read from errtable
). The latest results show a better agreement on final observation error.
@JingCheng-NOAA : excellent result!
Hi @JingCheng-NOAA, thanks for doing these tests. The plot you showed where you switched to Identity
operator over Louisiana I think shows that is the correct operator to use in JEDI and that also aligns with tintrp2*
used in GSI for this data type. I would expect this 0.1% hofx diff due to differences in how JEDI and GSI compute geovals (i.e., there is a slight difference in tintrp2*
and Identity
in the number of grid points used for interpolation).
You then show a figure where you use the vertInterp
again but do it over the middle of the Gulf of Mexico; can you do that same test but with Identity
operator? And did you pass GSI-geovals into the test with vertInterp
? If you did pass GSI-geovals into vertInterp
then the hofxs should match exactly if that is the correct operator which I don't think it is. Don't worry that the Identity
doesn't match exactly (unless you found a way to pass GSI-geovals into it). My worry is that this could be "too simple" of a case and vertInterp
works but won't work in say your Louisiana case. I think generally it is good to do single ob tests in the more complex location to not get false positives for certain settings.
As for the oberror, you should also be able to get this to match exactly. You can hard code some GSI-geovals for model_pressure_sfc[iloc]
and prsl[0][iloc]
and maybe one more level prsl[1][iloc]
(remember that JEDI and GSI have data in reverse order). You can make that change in the following code:
ufo/filters/obsfunctions/ObsErrorFactorPressureCheck.cc
Hi @JingCheng-NOAA, thanks for doing these tests. The plot you showed where you switched to
Identity
operator over Louisiana I think shows that is the correct operator to use in JEDI and that also aligns withtintrp2*
used in GSI for this data type. I would expect this 0.1% hofx diff due to differences in how JEDI and GSI compute geovals (i.e., there is a slight difference intintrp2*
andIdentity
in the number of grid points used for interpolation).You then show a figure where you use the
vertInterp
again but do it over the middle of the Gulf of Mexico; can you do that same test but withIdentity
operator? And did you pass GSI-geovals into the test withvertInterp
? If you did pass GSI-geovals intovertInterp
then the hofxs should match exactly if that is the correct operator which I don't think it is. Don't worry that theIdentity
doesn't match exactly (unless you found a way to pass GSI-geovals into it). My worry is that this could be "too simple" of a case andvertInterp
works but won't work in say your Louisiana case. I think generally it is good to do single ob tests in the more complex location to not get false positives for certain settings.As for the oberror, you should also be able to get this to match exactly. You can hard code some GSI-geovals for
model_pressure_sfc[iloc]
andprsl[0][iloc]
and maybe one more levelprsl[1][iloc]
(remember that JEDI and GSI have data in reverse order). You can make that change in the following code:ufo/filters/obsfunctions/ObsErrorFactorPressureCheck.cc
Thank you Donni for the comments and suggestions. I will do another test with Identity
for obs in middle of Gulf. And no I didn't pass the GSI-geovals into the test with VertInerp
with the obs in new location.
Use operator Identity
for single observation over the middle of the Gulf of Mexico. Result is identical to the one using VertInterp
.
@JingCheng-NOAA : This is interesting. It may suggest that the vertical interpolation operator is just using the lowest level of the model, without any interpolation for the surface observations.
Yeah, and I just realized you DID do the vertInterp
vs Identity
test in over Louisiana. Do you mind pointing me to your run directory?
/scratch1/NCEPDEV/hwrf/scrub/Jing.Cheng/jediwork/test/hafs-data_fv3jedi_2020082512/
The yaml file I used is SFCSHP_singleob_airTemperature_fv3jedi.yaml And the observation I used is in obsout/sfcship_tsen_obs_2020082512.nc4
Thanks for help checking!
Hi @JingCheng-NOAA, I double checked the GSI for which interpolation method is used for obtype 180. Since SFCMODEL=F
, I thought GSI was probably using "SCENARIO 3b" in setupt. It appears to also use 3a when you have a virtual temperature ob (it seems there might be mixed sensible and virtual temperature obs for surface ship?). I confirmed this with print statements in GSI. Since GSI is using 3a/b, it is actually using tintrp31 (not tintrp2*) so vertInterp
is actually the correct observation operator. Then, I was able to feed the GSI-geovals into vertInterp
and got matching results compared to your GSI plots (for the case over the Gulf).
I would say you could try rerunning the case over LA and try feeding the GSI-geovals. Make sure you are feeding geovals like:
gsi geovals:
filename: "obsout/sfcship_tsen_geoval_2020082512.nc4"
levels_are_top_down: False
In your run_fv3jedi.sh
script, you can set the following to just use my version that can feed GSI-geovals into vertInterp
:
jedibin="/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2_demo/RDASApp.20240620/build/bin"
If you get an error about not having saturation_specific_humidity or surface_geometric_height, you can use this tool:
/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2/tools/update_geovals.py
after you recreate your GSI-geovals for the LA case.
Hi @JingCheng-NOAA, I double checked the GSI for which interpolation method is used for obtype 180. Since
SFCMODEL=F
, I thought GSI was probably using "SCENARIO 3b" in setupt. It appears to also use 3a when you have a virtual temperature ob (it seems there might be mixed sensible and virtual temperature obs for surface ship?). I confirmed this with print statements in GSI. Since GSI is using 3a/b, it is actually using tintrp31 (not tintrp2*) sovertInterp
is actually the correct observation operator. Then, I was able to feed the GSI-geovals intovertInterp
and got matching results compared to your GSI plots (for the case over the Gulf).I would say you could try rerunning the case over LA and try feeding the GSI-geovals. Make sure you are feeding geovals like:
gsi geovals: filename: "obsout/sfcship_tsen_geoval_2020082512.nc4" levels_are_top_down: False
In your
run_fv3jedi.sh
script, you can set the following to just use my version that can feed GSI-geovals intovertInterp
:jedibin="/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2_demo/RDASApp.20240620/build/bin"
If you get an error about not having saturation_specific_humidity or surface_geometric_height, you can use this tool:
/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2/tools/update_geovals.py
after you recreate your GSI-geovals for the LA case.
Thank you so much for checking it up. I will rerun this with your JEDI and see how it goes.
Use GSI GeoVaLs in VertInterp
using the JEDI build by Donnie show the exact same value hofx
Great! I think this is good enough for hofx and oberror validation purposes. The ob errors are close enough and you could show that they would match exactly if you also passed the GSI-geovals into the obserror function too. No need to go down that route.
I'm a little confused by the final increment plot. With everything else identical, JEDI has an order of magnitude larger increment AND has (barely) higher final ob error. The increments should be comparable. This was also something notable in your previous plots for the LA case as well. I wonder what is going on there.
Great! I think this is good enough for hofx and oberror validation purposes. The ob errors are close enough and you could show that they would match exactly if you also passed the GSI-geovals into the obserror function too. No need to go down that route.
I'm a little confused by the final increment plot. With everything else identical, JEDI has an order of magnitude larger increment AND has (barely) higher final ob error. The increments should be comparable. This was also something notable in your previous plots for the LA case as well. I wonder what is going on there.
Thanks for all the valuable advices for the hofx valication. About the increment plot, I was curious too and compared it with my early experiment. I kind of feel that the background covariance we current using cause a large impact on the increment. In this test case I am using 10 GDAS ensemble members (interpolated onto HAFS grids) to provide background covariance. The number of the ensemble member has a large impact on the increments. Below shows the increments difference when use 3, 5, 6 ensemble members from GDAS. You can see the magnitude varies huge. Meanwhile, the bump NICAS files that provided the localization may need more tune as well.
*3 GDAS member
*5 GDAS member
*6 GDAS member
Change test case to 2024 Beryl02L. The observation is placed near the hurricane. Adjust the localization scale in BUMP NICAS to check the impact on increment, and compared with GSI analysis.
Based on the horz: 200km vert:1e2 resolution8, plot lvl from 81 to 79. Comparing with GSI results show the largest increment in GSI is on the second level, not the ground one. @ShunLiu-NOAA mentioned in HAFS DA meeting today about GSI has QC functions to do height correction for surface observations, while JEDI currently doesn't have them. Since this single observation (temperature, 187) I added is on 1000hPa, this could be the reason we saw here as JEDI has larger increment on the 1st level of model while GSI barely has any increment. I will leave this work as it is for now. As more surface level correction functions are included in JEDI, I will revisit this type of observation then.
To record the process of assimilating SFCSHP using JEDI 3DEnVar using HAFS background