Closed xyzemc closed 4 weeks ago
@xyzemc I added a new wiki page detailing additional changes to be made when updating the ctests for a new observation type. Please see here: https://github.com/NOAA-EMC/RDASApp/wiki/Adding-new-observation-types-for-the-RDASApp-ctests
As part of your PR, could you also add the replaceable strings to the yaml, update/run gen_yaml_ctest.sh
, and regenerate the test reference data? Please let me know if you have any issues with this - I would like adding new observation types to the ctests to be easy but this might be an iterative process to make it so.
Another idea would be to let users add their individual yamls files and then we could do a later PR to update the ctests for those new yamls. That could be easier such that only a code manager would need to run/update the gen_yaml_ctest.sh
script. We could also wait until we have a few new yaml types added to update the ctests with. Any thoughts on this @guoqing-noaa or @delippi?
Another idea would be to let users add their individual yamls files and then we could do a later PR to update the ctests for those new yamls. That could be easier such that only a code manager would need to run/update the
gen_yaml_ctest.sh
script. We could also wait until we have a few new yaml types added to update the ctests with. Any thoughts on this @guoqing-noaa or @delippi?
@SamuelDegelia-NOAA, it seems okay to me to do it this way.
@xyzemc You can hold off on what I discussed in my last message - we can just plan to update the ctests at a later time. So it should be okay to just add your yaml as is once @delippi's questions are addressed.
@xyzemc You can hold off on what I discussed in my last message - we can just plan to update the ctests at a later time. So it should be okay to just add your yaml as is once @delippi's questions are addressed.
@SamuelDegelia-NOAA Sounds good to me. Could you please put the data on role account first. The data location on my account is: /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/expr/mpas_2024052700/data/obs/amsua_n19_obs
@SamuelDegelia-NOAA Sounds good to me. Could you please put the data on role account first. The data location on my account is: /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/expr/mpas_2024052700/data/obs/amsua_n19_obs
Done!
@xyzemc, I just have a few questions:
- It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
- Just a minor detail. The file name of this yaml is
amsua_n19_qc_bc.yaml
. Would it make sense to remove the_qc_bc
part and match the name of the obs space in the yaml?- Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like
ioda_msonet.nc
--so use something likeioda_amsua_n19.nc
. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.- Is the hofx file supposed to have
rass_
prefixed to it?
@delippi Answers for each question are:
@xyzemc You can hold off on what I discussed in my last message - we can just plan to update the ctests at a later time. So it should be okay to just add your yaml as is once @delippi's questions are addressed.
I have done everything that guided in the wiki and addressed @delippi 's question with modification. The new ctest results with the super yaml file is:
`ctest -R rrfs_mpasjedi_2024052700_Ens3Dvar Test project /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_09272024/build/rrfs-test Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar 1/1 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar ... Passed 337.52 sec
100% tests passed, 0 tests failed out of 1
Label Time Summary: mpi = 337.52 secproc (1 test) rdas-bundle = 337.52 secproc (1 test) script = 337.52 sec*proc (1 test)
Total Test time (real) = 338.51 sec `
@xyzemc, I just have a few questions:
- It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
- Just a minor detail. The file name of this yaml is
amsua_n19_qc_bc.yaml
. Would it make sense to remove the_qc_bc
part and match the name of the obs space in the yaml?- Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like
ioda_msonet.nc
--so use something likeioda_amsua_n19.nc
. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.- Is the hofx file supposed to have
rass_
prefixed to it?@delippi Answers for each question are:
- I haven't compared FV3-JEDI with GSI yet for amsua and atms. But will do that in the near future.
- I have change the yaml file name as you suggested.
- obs name changed to ioda_amsua_n19.nc
- I think 'rass_' prefix is what I copied from GDASApp yaml file. But I can definitely remove the prefix.
@xyzemc Thanks for taking those minor comments into consideration! It looks like you forgot to remove the amsua_n19_qc_bc.yaml
in this PR. There are two yamls in the "Files Changed" tab.
Sorry that I also just noticed the same minor consistency issues with the atms. I should've caught those before. You don't necessarily have to make those changes in this PR. I assume you will have to make some modifications when you do FV3-JEDI vs GSI testing. You can make them there. I just wanted to let you know so you can keep that in mind.
One last thing. I noticed you have a few paths that start with data/obs_ctest
. I just wonder if those should be something else? I'm just thinking about when we get to a cycling system, do we want that data stored in that particular directory name? It looks like you had it named differently recently too. I think how you had it before seems reasonable:
obs bias:
input file: data/obs/amsua_n19.satbias.nc4
output file: data/bc/out_amsua_n19.satbias.nc
@xyzemc, I just have a few questions:
- It seems like you only have MPAS-JEDI results. Have you or do you plan to compare results in FV3-JEDI vs GSI? Both hofx validation where you provide FV3-JEDI with GSI-IODA (this is what I refer to as Phase 1) and making sure all the qc parts work as expected without using GSI-IODA (this is what I refer to as Phase 2). Either way, this is great work and we can definitely add this to the "validated_yamls" even without those tests done yet--since almost all of those yamls are still a work in progress.
- Just a minor detail. The file name of this yaml is
amsua_n19_qc_bc.yaml
. Would it make sense to remove the_qc_bc
part and match the name of the obs space in the yaml?- Another minor consistency detail. For conventional obs, we have been using the naming convention for the ioda data as like
ioda_msonet.nc
--so use something likeioda_amsua_n19.nc
. That shouldn't be too hard to change here, but of course you would have to change the name in the converter step too.- Is the hofx file supposed to have
rass_
prefixed to it?@delippi Answers for each question are:
- I haven't compared FV3-JEDI with GSI yet for amsua and atms. But will do that in the near future.
- I have change the yaml file name as you suggested.
- obs name changed to ioda_amsua_n19.nc
- I think 'rass_' prefix is what I copied from GDASApp yaml file. But I can definitely remove the prefix.
@xyzemc Thanks for taking those minor comments into consideration! It looks like you forgot to remove the
amsua_n19_qc_bc.yaml
in this PR. There are two yamls in the "Files Changed" tab.Sorry that I also just noticed the same minor consistency issues with the atms. I should've caught those before. You don't necessarily have to make those changes in this PR. I assume you will have to make some modifications when you do FV3-JEDI vs GSI testing. You can make them there. I just wanted to let you know so you can keep that in mind.
One last thing. I noticed you have a few paths that start with
data/obs_ctest
. I just wonder if those should be something else? I'm just thinking about when we get to a cycling system, do we want that data stored in that particular directory name? It looks like you had it named differently recently too. I think how you had it before seems reasonable:obs bias: input file: data/obs/amsua_n19.satbias.nc4 ` output file: data/bc/out_amsua_n19.satbias.nc`
@delippi the 'obs_ctest' is copied from atms yaml file that should be setup by @SamuelDegelia-NOAA . But I agree with you
input file: data/obs/amsua_n19.satbias.nc4
is more reasonable.
For the output file: data/bc/out_amsua_n19.satbias.nc
I modified to ' output file: out_amsua_n19.satbias.nc`
removed the data/bc
to keep the output in the same place as hofx
@delippi For the ATMS yaml file, I have noticed the difference from amsua. I plan to change that later because I am working on atms_n20 yaml file right now. When I open PR to merge atms_n20 yaml file, I will modify the atms_npp yaml as well to keep consistency.
When creating the new ctest structure, I split up the observation files in RDAS_DATA to obs/
and obs_ctest/
directories in case some of the obs we use for the ctests were different from those we were using before. Normally this was okay because only the gen_yaml_ctest.sh
script referenced the obs_ctest/
directory to replace the @OBSFILE@ string in the yamls. But because the ATMS/AMSUA data use multiple input observation files (data, bias, lapse rate), some of those paths got hard coded into the yamls. I agree that this isn't an ideal solution given that want these yamls to be flexible and to be used eventually in the workflow.
I think keeping the yaml file general and just referencing an @OBSFILE@ string is ideal. So maybe we should also similar strings for @BIASFILE@ and @LAPSEFILE@, etc. to the yamls in this PR.
When creating the new ctest structure, I split up the observation files in RDAS_DATA to
obs/
andobs_ctest/
directories in case some of the obs we use for the ctests were different from those we were using before. Normally this was okay because only thegen_yaml_ctest.sh
script referenced theobs_ctest/
directory to replace the @OBSFILE@ string in the yamls. But because the ATMS/AMSUA data use multiple input observation files (data, bias, lapse rate), some of those paths got hard coded into the yamls. I agree that this isn't an ideal solution given that want these yamls to be flexible and to be used eventually in the workflow.I think keeping the yaml file general and just referencing an @OBSFILE@ string is ideal. So maybe we should also similar strings for @BIASFILE@ and @LAPSEFILE@, etc. to the yamls in this PR.
Sounds good. I just have one comment. We need to keep those gen_yaml scripts as simple as possible! I think you are, but I just needed to say it! hahaha
I'm trying! But definitely open to any feedback to simplify it more.
Sorry to keep changing things, but could we change all of the data/obs/...
paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.
input file: data/obs/amsua_n19.satbias.nc4
-> input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
-> tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4
-> input file: "@COVFILE@"
Also, could you add an obs localizations
section to the yamls, similar to the file here?
Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.
Sorry to keep changing things, but could we change all of the
data/obs/...
paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.
input file: data/obs/amsua_n19.satbias.nc4
->input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
->tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4
->input file: "@COVFILE@"
Also, could you add an
obs localizations
section to the yamls, similar to the file here?Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.
@SamuelDegelia-NOAA the variable &amsua_n19_tlapse in tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
will be used in the following lines. So I don't think tlapse: "@LAPSEFILE@"
will work.
Sorry to keep changing things, but could we change all of the
data/obs/...
paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.input file: data/obs/amsua_n19.satbias.nc4
->input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
->tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4
->input file: "@COVFILE@"
Also, could you add anobs localizations
section to the yamls, similar to the file here? Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.@SamuelDegelia-NOAA the variable &amsua_n19_tlapse in
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
will be used in the following lines. So I don't thinktlapse: "@LAPSEFILE@"
will work.
Wouldn't it just be
tlapse: &amsua_n19_tlapse "@LAPSEFILE@"
or you could remove the anchor and specify everywhere. Either should work. It just requires running gen_yaml each time. Actually, that is how I do all my tests.
Sorry to keep changing things, but could we change all of the
data/obs/...
paths to a generic string that can be replaced? See below. This will help us be more generic in these validated_yaml files.input file: data/obs/amsua_n19.satbias.nc4
->input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
->tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4
->input file: "@COVFILE@"
Also, could you add anobs localizations
section to the yamls, similar to the file here? Once those are done then I think this PR will be good to go. We can update the ctests at a later time once we have more yamls ready to go.@SamuelDegelia-NOAA the variable &amsua_n19_tlapse in
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
will be used in the following lines. So I don't thinktlapse: "@LAPSEFILE@"
will work.Wouldn't it just be
tlapse: &amsua_n19_tlapse "@LAPSEFILE@"
or you could remove the anchor and specify everywhere. Either should work. It just requires running gen_yaml each time. Actually, that is how I do all my tests.
Ok, let me try tlapse: &amsua_n19_tlapse "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias.nc4
->input file: "@BIASFILE@"
tlapse: &amsua_n19_tlapse data/obs/amsua_n19.tlapse.txt
->tlapse: "@LAPSEFILE@"
input file: data/obs/amsua_n19.satbias_cov.nc4
->input file: "@COVFILE@"
@SamuelDegelia-NOAA Could you elaborate why you would like to have these changes?
I think it is because we now have obs
and obs_ctest
?
If so, let's remove obs
, rename obs_ctest
to obs
and add the single ADPUPA obs to obs
. It is also expected to combine that standalone single ob test into testinput
(and hence testinput_expr
will not be needed in the future).
I don't think we need that extra flexibility to specify biasfile, lapsefile, covfile on the fly. Sorry that I've been busy preparing for tomorrow's travel and did not join the discussion early.
@guoqing-noaa I would like to keep the yamls added to obtype_config/
as generic as possible. That way we can apply these yamls and associated tools for the workflow at a later date. As such, I think we should limit hardcoding file names in there.
The gen_yaml_ctest.sh
can then be the tool we use to take these generic yamls and merge/convert them for use in the ctests. This choice makes it easier IMO since only that script needs to contain case-specific file names and paths.
@guoqing-noaa I would like to keep the yamls added to
obtype_config/
as generic as possible. That way we can apply these yamls and associated tools for the workflow at a later date. As such, I think we should limit hardcoding file names in there.The
gen_yaml_ctest.sh
can then be the tool we use to take these generic yamls and merge/convert them for use in the ctests. This choice makes it easier IMO since only that script needs to contain case-specific file names and paths.
I have an alternate view. For those files, it would be the best to keep the "run_time" substitutions as less as possible.
Regarding the rrfs-workflow, we will use the giant XML file directly (it will be generated from RDASApp offline).
I have an alternate view. For those files, it would be the best to keep the "run_time" substitutions as less as possible.
Regarding the rrfs-workflow, we will use the giant XML file directly (it will be generated from RDASApp offline).
Even during the workflow, we will need at least two runtime substitutions for the yaml: analysis time and observation distribution (for GETKF observer vs. solver). There might be more that I can't think of.
To me, it makes sense to also make the file names generic and replaceable for the time being until we have settled on standardized file names and input directories. But either solution is fine with me and I don't think it should hold up this PR.
Even during the workflow, we will need at least two runtime substitutions for the yaml: analysis time and observation distribution (for GETKF observer vs. solver). There might be more that I can't think of.
Right, we will need to substitute the analysis time, the obs distribution and maybe a limited sets of others as needed in the future. But we would like to keep the substitutions as minimum as possible to keep the workflow as simple as possible.
To me, it makes sense to also make the file names generic and replaceable for the time being until we have settled on standardized file names and input directories. But either solution is fine with me and I don't think it should hold up this PR.
I did not hold up this PR. But I do think it is better to get things sorted out as early as possible.
I did not hold up this PR. But I do think it is better to get things sorted out as early as possible.
I am fine with changing those lines back if that's what you want.
I did not hold up this PR. But I do think it is better to get things sorted out as early as possible.
I am fine with changing those lines back if that's what you want.
That will be helpful. Thanks, @SamuelDegelia-NOAA
I have replaced the three input pathes and names from 'string' to the actual 'data/obs/filename'. Please re-check and approve it. Thanks!
This is a general comment to the group and we can address/discuss this later, don't have to be in this PR. Should we put all radiance bias correction related files, under data/obs/ or put them on a separate path/folder, such as data/satbias/. This is not very important and either way would be fine, but I would prefer to have a separate folder for satbias related files and not put them all under data/obs. Any thoughts/comments? Thanks!
This PR is for commit the work have been done in issue#180
The ctest has been passed with this new yaml file: