delippi opened this issue 3 months ago
@delippi Thanks for doing this!
FYI, @spanNOAA submitted an ioda task to rrfs-workflow earlier this year https://github.com/NOAA-EMC/rrfs-workflow/pull/199. Those yaml files may be re-used here: https://github.com/NOAA-EMC/rrfs-workflow/tree/dev-sci/parm/iodaconv
Hofx and oberrors for 130, 131, 134, and 135 (from AIRCFT) are validated. I followed this documentation for what is assimilated in RAP: https://www.emc.ncep.noaa.gov/mmb/data_processing/prepbufr.doc/table_5.htm
The following tests were completed by feeding the GSI-geovals into vertInterp (a comparison sketch follows the list below). There were a few places where the JEDI ob errors don't match the GSI ob errors, but that doesn't seem to be a concern. I didn't look into it beyond what I have shown here, but I would guess it is due to those obs being closer to the surface relative to most of the other obs. I have previously shown that ob errors can differ when an ob is below the top 20 or so model levels; this is related to the hybrid vertical coordinate and pressure-level geoval calculations.
ObType=130 T
ObType=131 T
ObType=134 T
ObType=135 T
ObType=134 q
ObType=230 u
ObType=231 u
ObType=234 u
ObType=235 u
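For anyone repeating this check, here is a minimal sketch of the comparison, assuming the GSI reference errors and the JEDI effective errors live in one IODA-style netCDF file; the file name and the GsiFinalObsError/EffectiveError group names are assumptions, so substitute whatever your output actually contains.

```python
# Hedged sketch: compare GSI vs JEDI final ob errors for one variable.
# File and group/variable names below are assumptions, not confirmed paths.
import netCDF4 as nc
import numpy as np

with nc.Dataset("aircraft_airTemperature_2024052700.nc4") as ds:  # hypothetical
    gsi = ds["GsiFinalObsError"]["airTemperature"][:]   # assumed group names
    jedi = ds["EffectiveError"]["airTemperature"][:]

diff = np.abs(gsi - jedi)
print(f"max |GSI - JEDI| ob error: {np.nanmax(diff):.4f}")
print(f"locations differing by > 0.01: {int(np.sum(diff > 0.01))}")
```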
I'm trying to complete hofx and oberror validation for 133/233 (from AIRCAR), but JEDI keeps hanging. I've found that for a few obs, an assertion does not pass in ObsErrorFactorPressureCheck.cc.
36: Assertion failed: logprsl[0] > logprsl[nlevs-1] in compute, line 299 of /scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2/jedi-assim-aircft/RDASApp.20240620/bundle/ufo/src/ufo/filters/obsfunctions/ObsErrorFactorPressureCheck.cc
I added some print statements before this assertion to print when it is expected to fail and this is the result:
36: iloc =0
36: logprsl[0] =-nan
36: logprsl[nlevs-1] =-nan
36: prsl[0][iloc] =-3.3688e+38
36: prsl[nlevs-1][iloc]=-3.3688e+38
It is also interesting to note that it fails only when iloc=0, but not every iloc=0 fails. There are at least 7 instances of this in the log file.
Hi @guoqing-noaa & @spanNOAA, thanks for sharing this info. I'm a little confused, though. I see only XDR, YDR, and HRDR for the lat, lon, and time metadata in the yaml provided, but in my bufr file (using readmp, which is a bufr tool), I only see XOB, YOB, and DHR for that information. And when I try using the provided yaml, all the space/time metadata is missing, which results in "no obs assimilated". The *DR mnemonics are used for drifting platforms like radiosondes, and I see something mentioned for aircraft profiles. But the prepbufr obs that I have don't seem like aircraft profiles, since they don't have the DR values. Do either of you have any idea about any of this? Thank you!
For reference, here is some output from readmp on AIRCFT (prepbufr).
[hfe01 mpas_2024052700] readmp AIRCFT
message date= 2024052700
MESSAGE TYPE AIRCFT
001194 SID FPY9B ( 8)CCITT IA5 STATION IDENTIFICATION
006240 XOB 348.30000 DEG E LONGITUDE
005002 YOB 55.13000 DEG N LATITUDE
004215 DHR -1.00000 HOURS OBSERVATION TIME MINUS CYCLE TIME
010199 ELV 10973.0 METER STATION ELEVATION
055007 TYP 130.0 CODE TABLE PREPBUFR REPORT TYPE
055008 T29 41.0 CODE TABLE DATA DUMP REPORT TYPE
055009 TSB 0.0 CODE TABLE REPORT SUBTYPE (HAS VARIOUS MEANINGS DEPENDING O
002195 ITP MISSING CODE TABLE INSTRUMENT TYPE
050001 SQN 1164.0 NUMERIC REPORT SEQUENCE NUMBER
050003 PROCN 0.0 NUMERIC PROCESS NUMBER FOR THIS MPI RUN (OBTAINED FROM S
004214 RPT 23.00000 HOURS REPORTED OBSERVATION TIME
004216 TCOR 0.0 CODE TABLE INDICATOR WHETHER OBS. TIME IN "DHR" WAS CORRECT
<RSRD_SEQ> 0 REPLICATIONS
<ACID_SEQ> 0 REPLICATIONS
{PRSLEVLA} 1 REPLICATIONS
004217 RCT 23.03 HOURS RECEIPT TIME
002199 ROLF MISSING CODE TABLE AIRCRAFT ROLL ANGLE FLAG
033026 MSTQ MISSING CODE TABLE MOISTURE QUALITY
010082 IALR MISSING M/S INSTANTANEOUS ALTITUDE RATE
008193 CAT 6.0 CODE TABLE PREPBUFR DATA LEVEL CATEGORY
<P___INFO> 1 REPLICATIONS
[P__EVENT] 1 REPLICATIONS
007245 POB 227.3 MB PRESSURE OBSERVATION
007246 PQM 2.0 CODE TABLE PRESSURE (QUALITY) MARKER
007247 PPC 1.0 CODE TABLE PRESSURE EVENT PROGRAM CODE
007248 PRC MISSING CODE TABLE PRESSURE EVENT REASON CODE
<P__BACKG> 0 REPLICATIONS
<P__POSTP> 0 REPLICATIONS
<Q___INFO> 1 REPLICATIONS
[Q__EVENT] 0 REPLICATIONS
012244 TDO MISSING DEG C DEWPOINT TEMPERATURE OBSERVATION (NOT ASSIMILATE
<Q__BACKG> 1 REPLICATIONS
013250 QOE MISSING PERCENT DIVIDED BY 10 RELATIVE HUMIDITY OBSERVATION ERROR
013249 QFC 15.0 MG/KG FORECAST (BACKGROUND) SPECIFIC HUMIDITY VALUE
<QFC__MSQ> 0 REPLICATIONS
<Q__POSTP> 0 REPLICATIONS
<T___INFO> 1 REPLICATIONS
[T__EVENT] 2 REPLICATIONS
++++++ T__EVENT REPLICATION # 1 ++++++
012245 TOB -45.0 DEG C TEMPERATURE OBSERVATION
012246 TQM 1.0 CODE TABLE TEMPERATURE (QUALITY) MARKER
012247 TPC 15.0 CODE TABLE TEMPERATURE EVENT PROGRAM CODE
012248 TRC 531.0 CODE TABLE TEMPERATURE EVENT REASON CODE
++++++ T__EVENT REPLICATION # 2 ++++++
012245 TOB -45.0 DEG C TEMPERATURE OBSERVATION
012246 TQM 2.0 CODE TABLE TEMPERATURE (QUALITY) MARKER
012247 TPC 1.0 CODE TABLE TEMPERATURE EVENT PROGRAM CODE
012248 TRC MISSING CODE TABLE TEMPERATURE EVENT REASON CODE
012243 TVO MISSING DEG C NON-Q. CONTROLLED VIRTUAL TEMP OBS (NOT ASSIMILA
<T__BACKG> 1 REPLICATIONS
012250 TOE MISSING DEG C TEMPERATURE OBSERVATION ERROR
012249 TFC -46.0 DEG C FORECAST (BACKGROUND) TEMPERATURE VALUE
<TFC__MSQ> 0 REPLICATIONS
<T__POSTP> 0 REPLICATIONS
<Z___INFO> 1 REPLICATIONS
[Z__EVENT] 1 REPLICATIONS
010007 ZOB 10973.0 METER HEIGHT OBSERVATION
010246 ZQM 2.0 CODE TABLE HEIGHT (QUALITY) MARKER
010247 ZPC 1.0 CODE TABLE HEIGHT EVENT PROGRAM CODE
010248 ZRC MISSING CODE TABLE HEIGHT EVENT REASON CODE
<Z__BACKG> 1 REPLICATIONS
010249 ZFC 10902.0 METER FORECAST (BACKGROUND) HEIGHT VALUE
<ZFC__MSQ> 0 REPLICATIONS
<Z__POSTP> 0 REPLICATIONS
<W___INFO> 0 REPLICATIONS
<DRFTINFO> 0 REPLICATIONS
[W1_EVENT] 0 REPLICATIONS
<ACFT_SEQ> 0 REPLICATIONS
<TURB1SEQ> 0 REPLICATIONS
<TURB2SEQ> 0 REPLICATIONS
{TURB3SEQ} 0 REPLICATIONS
{PREWXSEQ} 0 REPLICATIONS
{CLOUDSEQ} 0 REPLICATIONS
{AFIC_SEQ} 0 REPLICATIONS
033249 NRLQMS ......MMM.L (11)CCITT IA5 NRL AIRCRAFT QUALITY CNTRL MARK (ADDED BY PGM PR
<LATCORSQ> 0 REPLICATIONS
<LONCORSQ> 0 REPLICATIONS
>>> END OF SUBSET <<<
@delippi You are right, aircraft observations don't seem like sounding observations. We should probably use XOB and YOB like you said. I'll look into it further and fix it soon.
Hi @delippi, you can modify the yaml file according to the following figure.
By making this adjustment, you can obtain a converted observation similar to the following.
Thanks to @delippi for testing and @spanNOAA for fixing it. @spanNOAA made a PR https://github.com/rrfs2/rrfs-workflow/pull/79 to correct it in the workflow.
I turned off the ObsErrorFactorPressureCheck, which allows JEDI to run through without failing. Then I checked the geovals computed by JEDI for processor 36 and iloc=0,1 (figure below). Again, this issue only appears when iloc=0, and only for a few instances of iloc=0. The problem exists for all geoval variables on this processor at iloc=0; in this case, that is both air_temperature and air_pressure. I don't see anything wrong with the IODA observation file. Could this be an indexing issue when assigning obs to processors?
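For reference, this is roughly how one can inspect a geovals dump for fill values at suspect locations. It is only a sketch: the per-processor file name and the (nlocs, nlevs) layout are assumptions about how the geovals were written out.

```python
# Hedged sketch: flag netCDF fill values (~-3.4e38) in a geovals dump.
import netCDF4 as nc
import numpy as np

FILL_THRESHOLD = -3.0e38  # values at/below this look like fill values

with nc.Dataset("geovals_pe0036.nc4") as ds:   # hypothetical per-PE dump
    for var in ("air_temperature", "air_pressure"):
        prof = np.asarray(ds[var][:])          # assumed shape (nlocs, nlevs)
        for iloc in (0, 1):
            nbad = int(np.sum(prof[iloc, :] < FILL_THRESHOLD))
            print(f"{var} iloc={iloc}: {nbad}/{prof.shape[1]} levels are fill values")
```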
I've made some progress on the missing value issue. First, in https://github.com/JCSDA-internal/ufo/issues/3422 I showed that, after changing the observation distribution, the iloc=0 pattern was just a coincidence. The problem was only occurring within the first 80 obs (out of some 23,000): ob 1 went to processor 1 as its first ob, ob 2 to processor 2 as its first ob, and so on, and with only 80 processors that is why iloc=0 seemed to be the problem.
The problem seems to be with the observation location. I used a combination of python and @CoryMartin-NOAA's neat "ncdump to text, edit, and ncgen" method. Here is what I did: I wrote some python to select an arbitrary list of ob indices from the GSI-IODA (and GSI-geovals) and save them to a new file. I selected ob numbers 6 and 7 (I chose these numbers because iloc=6 was known to produce the missing values and iloc=7 was known not to have this problem; I reconfirmed this when running only these two obs). Then I started modifying the ncdump text file value by value and narrowed it down to the observation longitude.
Here are the original lon values for obs 6 and 7, respectively:
longitude = 235.157, 235.468 ;
Simply changing the first to match the second works now!
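The same decisive edit can be done with netCDF4 in python instead of the ncdump/edit/ncgen round trip; a sketch, with a hypothetical file name for the two-ob subset:

```python
# Hedged sketch: overwrite ob 6's longitude with ob 7's and rerun JEDI.
import netCDF4 as nc

with nc.Dataset("two_obs.nc4", "r+") as ds:    # hypothetical two-ob subset file
    lon = ds["MetaData"]["longitude"]
    print("before:", lon[:])                   # e.g. [235.157, 235.468]
    lon[0] = lon[1]                            # first lon now matches the second
```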
These obs all come from the GSI-IODA so they shouldn't be outside the domain. Is there a domain check that I should try to use? Any ideas would be appreciated! @ShunLiu-NOAA, @SamuelDegelia-NOAA, @guoqing-noaa, @HuiLiu-NOAA, @xyzemc, @TingLei-NOAA
Perhaps they are outside of the domain according to JEDI? Are they close to the edge?
Thanks @CoryMartin-NOAA, yep! It looks like it is outside the domain for JEDI. The red dot is the original lat/lon; the gray shading is the domain.
Nice find @delippi! Does the background check filter work for removing these obs near the boundary? Or does the "Assertion failed" error occur before that filter can go through?
@SamuelDegelia-NOAA would it be possible that you extend your offline LETKF domain check tool to be more general so that we can use it to remove all observations outside our DA domain? JEDI has been working on adding a model domain check filter but we don't know when that will be ready to use.
@SamuelDegelia-NOAA, the background check doesn't seem to help. Can you please share your domain check tool? That is probably the best way to go, for now.
@delippi and @guoqing-noaa I do not think the simple domain check I have been using for LETKF would work here. It only takes the max/min coordinates of the domain and throws out anything outside that, so the check is mainly used to throw out obs very far away that would not be included in any halo regions. Since the minimum domain longitude is smaller than this plotted ob's (the domain extends further west), it would not end up throwing this ob away.
But this does suggest the need for a smarter offline domain check. I will see if there is anything I can come up with using python.
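To make the limitation concrete, here is a minimal sketch of the bounding-box check described above, using approximate domain extremes; the ob latitude is illustrative, since only the longitude was reported.

```python
# Hedged sketch of the simple min/max (bounding-box) domain check and why it
# passes the bad ob: the domain extends further west than the ob itself.
import numpy as np

def bbox_keep(ob_lat, ob_lon, dom_lats, dom_lons):
    """Keep an ob only if it lies inside the domain's min/max box."""
    return (dom_lats.min() <= ob_lat <= dom_lats.max() and
            dom_lons.min() <= ob_lon <= dom_lons.max())

dom_lats = np.array([19.3, 61.2])      # approximate domain extremes
dom_lons = np.array([-146.1, -48.8])
# lon 235.157E == -124.843; the latitude here is illustrative only
print(bbox_keep(45.0, -124.843, dom_lats, dom_lons))  # True -> wrongly kept
```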
If the H(x) is a missing value, is it not assimilated? If that is the case, then what is the harm of having these here? What we need (if we don't have it already) is a reducedObsSpace that is the result of only non-missing H(x) values.
@CoryMartin-NOAA, right, if the H(x) is missing it is not assimilated, but the problem is that when I have ObsErrorFactorPressureCheck (for oberror inflation) turned on, the geovals that function uses are missing, which gets caught in Assertion failed: logprsl[0] > logprsl[nlevs-1]. I agree, it seems like a reducedObsSpace would be what we want. What is the status on that?
Reduced obs space works, but I'm not sure if it works in this case. Regardless, it sounds more like ObsErrorFactorPressureCheck should check for missing values and return if that location is missing a value.
@CoryMartin-NOAA Something I have been wondering about for a little bit: do you know how, or whether, the filter order matters? As in, would having the background check filter first (with the ReduceObsSpace filter action) have any impact?
The only order that matters is pre, prior, post. Other than that, it's the order they are read in (I think). https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/inside/jedi-components/ufo/qcfilters/introduction.html#order-of-filter-application
Okay, thank you very much for sharing. Will bookmark this page.
@CoryMartin-NOAA, do you have any examples of the reducedObsSpace function? I want to try that first. Then I will see about adding a check in ObsErrorFactorPressureCheck.
I'll have to defer to @emilyhcliu or @TravisElless-NOAA on this one
Here is an example of using reduce obs space with the gaussian thinning filter. I guess the idea would be to try using this filter action with the background check?
@SamuelDegelia-NOAA I don't think that will currently work, but it's worth a shot. I think it only will do a reduced obs space before GeoVaLs (at the moment)
Ah you're right @CoryMartin-NOAA, it can only be used for pre-filters
Perhaps the LAMDomainCheck could be used for your FV3-LAM tests with the reduce obs space filter action. An example of that is shown for the Bounds Check filter in RDASApp/sorc/fv3-jedi/test/testinput/3dvar_lam_cmaq.yaml.
I wrote my own offline domain check to test the data. It seems to have been a success. The domain check also seems to toss a little more than I was expecting, but that is not necessarily wrong; the original ob counts were closer to 23,000! Nonetheless, after using this domain check the results are as expected. I assume the slight oberror differences are simply due to GSI vs JEDI geovals and are not of concern to me.
The reduce obs space capability is already in UFO develop. It can only be used with pre filters. Add reduce obs space under action for a pre filter. Example:
I use the following filter to check whether the obs are outside of the MPAS domain:
Regional Domain Check
```yaml
- filter: Bounds Check
  filter variables:
  - name: brightnessTemperature
  test variables:
  - name: LAMDomainCheck@ObsFunction
    options:
      map_projection: circle
      save: true
      cenlat: 38.5
      cenlon: -97.5
      radius: 2500  # km
  minvalue: 1.0
```
But this is a temporary solution until Hui's code is ready.
Finished validation of obtype *33. The problem was observations outside of the domain (even when using GSI-IODA obs). For now, I'm just using an offline domain check filter since I haven't been able to get any of the suggestions above to work yet.
The domain check creates a convex hull (red box) approximating the domain (blue) and checks for points within it. It only takes about 15 seconds for 23k obs. More complex shapes could be achieved, but would require python tools that are not currently installed (a newer version of shapely, for example).
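A minimal sketch of such a convex-hull check, assuming only the numpy/scipy/matplotlib stack (no recent shapely); the function and variable names are mine, not from the actual filter_domain_obs.py:

```python
# Hedged sketch: convex hull of the model grid, shrunk ~4% toward its
# centroid so concave domain edges are not overshot, then point-in-polygon.
import numpy as np
from scipy.spatial import ConvexHull
from matplotlib.path import Path

def in_domain(grid_lats, grid_lons, ob_lats, ob_lons, shrink=0.04):
    pts = np.column_stack([grid_lons.ravel(), grid_lats.ravel()])
    hull = pts[ConvexHull(pts).vertices]                  # hull vertices, in order
    centroid = hull.mean(axis=0)
    hull = centroid + (1.0 - shrink) * (hull - centroid)  # tunable shrink factor
    obs = np.column_stack([ob_lons, ob_lats])
    return Path(hull, closed=True).contains_points(obs)   # boolean keep mask
```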
Here are the remaining *33 validation plots. The final ob errors do not seem to match as well as for other aircraft types; however, there are many more obs, and so also many more obs at lower levels, which is known to produce differences due to how geovals are created in GSI versus JEDI. Only GSI-geovals are fed into vertInterp in these tests; the ob error inflation function still uses JEDI-geovals.
@guoqing-noaa, yes, it was very straightforward to adapt my domain check to the MPAS grid. The blue box is the actual grid lat/lons, while the red convex hull is an approximation of the blue box, meant to sit slightly within it (and this can be tuned). @SamuelDegelia-NOAA also has an offline domain check that we should compare with. It also seems there are others working on an offline domain check? This is just a simple solution and should at least get the job done for testing purposes.
@delippi Super!
Yes, it will be good if we can combine those offline tools.
Just curious about the difference between your plotted domain and the one plotted by @spanNOAA (is it because of the map projection?):
@guoqing-noaa, thanks for showing @spanNOAA's plot. It could be the map projection; it's hard to tell. With cartopy, I used projection=ccrs.PlateCarree(central_longitude=0). Do you happen to know what was used for this plot? I can try to reproduce it with that projection to confirm.
@guoqing-noaa, I think I got it. It seems like projection=ccrs.LambertConformal() was used. Here is the result with that. The differences are just the map projection, although some of the boundary extents look a little different. The way I have some things plotted looks strange with this projection.
Edit: I double checked the min/max lat/lons and added grid lines to my plot.
Max/Min Lat: 61.212459564208984, 19.301267623901367
Max/Min Lon: -48.77972412109375, -146.12791442871094
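For reproducing the two plots, here is a cartopy sketch of the projection comparison; the boundary arrays are hypothetical, but both projection calls are the ones quoted above.

```python
# Hedged sketch: plot the same domain boundary under both projections.
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

def compare_projections(boundary_lats, boundary_lons):
    fig = plt.figure(figsize=(11, 4))
    projections = [ccrs.PlateCarree(central_longitude=0), ccrs.LambertConformal()]
    for i, proj in enumerate(projections):
        ax = fig.add_subplot(1, 2, i + 1, projection=proj)
        ax.coastlines()
        ax.gridlines(draw_labels=True)
        # data are plain lat/lon, so always transform from PlateCarree
        ax.plot(boundary_lons, boundary_lats, "b.", transform=ccrs.PlateCarree())
    plt.show()
```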
This seems unusual to me. I'm also using the Lambert projection and am not seeing any blank areas within the domain. Which plotting function did you use?
@spanNOAA, don't worry about the blank areas. That was because I was plotting the domain as a bunch of lines instead of filling it. I've since fixed it, but I still have a strange angle along the southern boundary. Possibly an artifact of how I'm plotting, but I'm not sure.
I'm more concerned about the extent of the domain edges. What you have plotted looks more like the FV3-JEDI case. Here are MPAS and FV3 examples.
@delippi For the MPAS domain, the north line is much smoother than the south line. Is it possible to make the south line as smooth as the north line? I am also curious why we don't let the domain boundary overlap with the concave boundary?
The plot I created shows an MPAS domain generated by Chunhua. The unusual angle along the southern boundary might be related to the plotting method you used? I used a dot plot to visualize the domain.
> For the MPAS domain, the north line is much smoother than the south line. Is it possible to make the south line as smooth as the north line?

I'm not sure at the moment why it is doing that. I will look into it more. Probably this:

> The unusual angle along the southern boundary might be related to the plotting method you used?

> I am also curious why we don't let the domain boundary overlap with the concave boundary?
If I understand the question correctly, the best way (I think) to check whether an ob is within a domain is to use a hull (the red box), which I've seen explained as: "A Convex Hull is the smallest convex shape (or polygon) that can enclose a set of points in a plane (or in higher dimensions). Imagine stretching a rubber band around the outermost points in a set; the shape that the rubber band forms is the convex hull." So, if there are any concave points between vertices, then there would be whitespace between the red and blue box. I've shrunk the convex hull such that there wouldn't be such whitespace, which, of course, in turn means that it is not going to be an exact match of the domain grid. I believe there are more sophisticated ways to get a more accurate concave hull than this, but I think this was the best we could do with the python modules currently available. I think @SamuelDegelia-NOAA's tool uses concave_hull, but he had to load his own conda environment to get it to work.
@HuiLiu-NOAA Does your PR https://github.com/JCSDA-internal/mpas-jedi/pull/976 eliminate the need to use an offline regional domain check?
I think the mpas-jedi team is in a better position to answer this question.
Great work on the domain check @delippi. If your method can use the standard python modules loaded in RDASApp, then it is likely a better route than mine. Is it easy to adjust how close the red hull is to the model domain edges? If so, it might be worth trying to see how close we can get it without letting any bad obs through.
While the domain check is/should be in JEDI already, I still think it's good to have this offline tool available.
@SamuelDegelia-NOAA, thanks. Yes, my version uses only the python modules available in RDASApp. The actual checking for obs in/out of the domain is really quick, like 1 s for the 500k mesonet obs, but it still takes about 7.5 min to copy the sliced data. I believe yours might have been quicker? We should compare and maybe use your method for that part. It is very easy to shrink the red hull edges and get as close as possible to the domain edges without letting any bad obs in. Right now I just have it scaled down to shrink by 4%, which was me eyeballing it. When I switched the plot to the LambertConformal projection, the hull appears to be contained further within the blue domain edges. I think the LC projection is a little misleading, as there is clearly a green dot outside of the red boundary (look at the west edge of the last image shared). I think this is a more accurate image; it uses the 4% shrink, so the east and west concave blue edges touch the red edges.
@delippi, that is great info. I agree that this tool will be useful to have.
My domain check for the ~500k mesonet obs takes 1) 48 sec to check if each ob is in the domain, and 2) 3 sec to write out the new file. The writing is done by creating a new netcdf file and then copying each group/variable into the new file, but only selecting the indices for obs that are inside the domain. If we want to incorporate my part from (2) into your domain check, then you could look at /scratch1/NCEPDEV/da/Samuel.Degelia/jedi_tools/offline_domaincheck_flex.py starting at line 60.
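For reference, here is a sketch of that write-out step as described (not Sam's actual code): copy each group/variable into a new file, keeping only the in-domain locations. It assumes the IODA convention that obs variables are dimensioned by Location only, and it skips any root-level variables and attributes.

```python
# Hedged sketch of step (2): subset an IODA obs file with a boolean mask.
import netCDF4 as nc
import numpy as np

def write_subset(infile, outfile, keep):       # keep: boolean mask over obs
    with nc.Dataset(infile) as src, nc.Dataset(outfile, "w") as dst:
        dst.createDimension("Location", int(np.sum(keep)))
        for gname, grp in src.groups.items():
            out = dst.createGroup(gname)
            for vname, var in grp.variables.items():
                fill = (var.getncattr("_FillValue")
                        if "_FillValue" in var.ncattrs() else None)
                new = out.createVariable(vname, var.dtype, var.dimensions,
                                         fill_value=fill)
                new[:] = var[:][keep]          # copy only in-domain obs
```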
@SamuelDegelia-NOAA, I got them combined! For the 500k mesonet obfile it took: 3 s to find obs within the domain, 5 s to write out the file, and 33 s to create the plot. Run times vary a little bit each time it's run. Pretty nice! Thanks for sharing your method for creating the new obfile! It still needs a little cleanup, but it works great!
/scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2/tools/filter_domain_obs.py
Run like:
```
module use /scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2_demo/RDASApp.20240620/modulefiles
module load EVA/hera
python filter_domain_obs.py <obs_filename>
```
Note that I have a custom time function that you also need in the run directory or PYTHONPATH, or you can just remove those lines.
This is great, glad things are working so quickly. This will be a great tool to have!
Hi @delippi: Would this offline python tool be generic enough to work for the HAFS domain as well? If so, it might help with the AMV test case from @XuLu-NOAA and @JingCheng-NOAA for the moment.
Phase 2 Validation (no reliance on any GSI-derived IODA or GeoVaLs): initial results.
Methodology:
Run bufr2ioda:
Run GSI:
Run FV3-JEDI:
Summary: For the most part, the increments and observation counts look reasonable! I do need to figure out why the uv_233 plot has such large increments; more to look into there. Overall, not bad for initial testing, with almost no modifications from what is currently in RDASApp!
The UFO in JEDI is the component that not only computes model-simulated observations, but also houses the filters and methods for observation QC, ob errors, and bias correction; the GSI observer is its equivalent. Many forward operators for various observations have been developed for the UFO, and these operators can be utilized in RDAS. However, these operators still must be tested for RDAS. The steps for transition, by observation platform, are as follows:
Aircraft data are bufr obtypes x30, x31, and x33-x35 (the 130-series for mass variables and the 230-series for winds).