NOAA-PSL / land-offline_workflow

Creative Commons Zero v1.0 Universal
1 stars 12 forks source link

restored discarded updates: adding new ecbuild-based cmake capability #19

Closed jkbk2004 closed 1 year ago

jkbk2004 commented 1 year ago

PR Instructions:

  1. Provide details under all headings below.
  2. Assign Clara and one other person as reviewers.
  3. If the PR is not ready for merging, add the "DRAFT/DO NOT MERGE" label.
  4. When a PR is ready to merge, remove the "DRAFT/DO NOT MERGE" and email Clara.
  5. Before requesting that the PR be merged, complete the checklist below.

Notes on preparing PR, using git can be found in README_git

Describe your changes

Summarise all code changes included in PR:

List any associated PRs in the submodules.

Issue ticket number and link

List the git Issue that this PR addresses:

Test output

Is this PR expected to pass the DA_IMS_test (ie., does it change the output)?

Does it pass the DA_IMS_test?

If changes to the test results are expected, what are these changes? Provide a link to the output directory when running the test:

Checklist before requesting a review

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA I thought easy solution is to close #17 and open new pr. Can you check this pr?

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA let me test cmake first. sounds like hashes still screwed up on my side.

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA cmake builds ok! I think all hashes are synced up ok. If you confirm, we can start merging in submodules first.

ClaraDraper-NOAA commented 1 year ago

@jkbk2004 I'd like to confirm that the test passes before I begin merging. I'll have a look later today.

ClaraDraper-NOAA commented 1 year ago

@jkbk2004 The test does not pass, and the differences in the output files look very small ( I should double-check this, I am seeing lots of O(10^-8) differences, but have not checked carefully whether there are more substantial differences). I'm assuming that this is due to differences in the compilation flags? I tried compiling one of my routines with the cmake FFLAGS, but that didn't help. How much work would it be for you to change one of the make systems so that they produce zero-diff results? Doing this means I can continue using both systems for a while (since both can use the same test output), but if it's much work I can just switch to CMake.

@barlage This would mean we'd need to add CMake to your offline driver (and also vector2tile). You could leave your current system in place though.

FYI: On hera, I also changed the module file to use the one I'm currently using, since I couldn't get yours to load properly. I also had to change the scripts to point to the new exec locations. You can see my changes here:

/scratch2/BMC/gsienkf/Clara.Draper/gerrit-hera/CSD_workflow/

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA it's a tuning case, right? keep a cycle with make-produced executable and replace one of executables with cmake-produced one. I can give a quick try. How do you validate the result? which file? which script?

ClaraDraper-NOAA commented 1 year ago

@ClaraDraper-NOAA it's a tuning case, right? keep a cycle with make-produced executable and replace one of executables with cmake-produced one. I can give a quick try. How do you validate the result? which file? which script?

There's a script called submit_test.sh (then check_test.sh checks the output). See README for how to use it - there are a few small changes you'll need to make (accounts, etc). It will only work from hera.

Also, I just sent you the wrong directory. The one above is for the current workflow (passes the test), the one at /scratch2/BMC/gsienkf/Clara.Draper/gerrit-hera/jong2 is with CMake.

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA I copied over and ran 01-01 to 01-03 /scratch2/NCEPDEV/marine/Jong.Kim/merge-landa/CSD_workflow /scratch2/NCEPDEV/marine/Jong.Kim/cycle_land/DA_IMS_test//modl/restarts/vector/ufs_land_restart_anal.2016-01-01_18-00-00.nc /scratch2/BMC/gsienkf/Clara.Draper/DA_test_cases/land-offline_workflow/DA_IMS_test/output/modl/restarts/vector//ufs_land_restart_anal.2016-01-01_18-00-00.nc differ: char 1819372, line 11947

ClaraDraper-NOAA commented 1 year ago

Jong, this is the exact issue I'm talking about. I looked at the IMS observation file that is produced by IMS_proc, since this is the first file that is generated in the DA. It has small differences (O(10^-8). If we confirm that there are no significant differences, I'm happy to merge the code anyway - however if we don't resolve the small differences it means I won't be able to keep both compilation methods side-by-side, as they can't both pass the same test. So my question for you was : how hard would it be to remove these small differences? If it's a lot of work, then I'll just switch everything to CMake and remove my current build.sh scripts.

jkbk2004 commented 1 year ago

Jong, this is the exact issue I'm talking about. I looked at the IMS observation file that is produced by IMS_proc, since this is the first file that is generated in the DA. It has small differences (O(10^-8). If we confirm that there are no significant differences, I'm happy to merge the code anyway - however if we don't resolve the small differences it means I won't be able to keep both compilation methods side-by-side, as they can't both pass the same test. So my question for you was : how hard would it be to remove these small differences? If it's a lot of work, then I'll just switch everything to CMake and remove my current build.sh scripts.

@ClaraDraper-NOAA I was a bit confused about OSD_workflow setup. The executables were built with make (not cmake) in that setup, right? Then why the test is not passing? BTW, compile flag options can make difference on result. On weather model side, we just check mean of difference to see any systematic changes over certain areas or something like that. If the behavior is more like random, then we usually accept to update baseline files. In our experience with butterfly test (a least bit change on certain location) and global model, quite significant changes is possible in some regions. That's why we keep checking with mean and stds.

ClaraDraper-NOAA commented 1 year ago

Jong - as I noted a few messages back, the CSD_workflow was the wrong directory. It was built using make. The reason it fails the test is that I changed the FFLAGS (as I mentioned in a message before that) to try to generate the same output as the CMake version, but it didn't work.

This directory contains the CMake version, with the scripts adjusted so that it will run: /scratch2/BMC/gsienkf/Clara.Draper/gerrit-hera/jong2

This fails the test which, as you suggested, is likely because the compilation flags are different. However, my initial (very quick) effort to match the compilation flags (by changing FFLAG in the CSD_* directory) didn't generate the same answer as the CMake version. So my questions is: do you think it's worth trying to change the compilation flags so that we can get the same output from both make and CMake? If this is something that you think you can do relatively easily, then I think we should do it. However, if it winds up being complicated, then I'll just switch my test to use CMake only, and drop the old make method.

jkbk2004 commented 1 year ago

Jong - as I noted a few messages back, the CSD_workflow was the wrong directory. It was built using make. The reason it fails the test is that I changed the FFLAGS (as I mentioned in a message before that) to try to generate the same output as the CMake version, but it didn't work.

This directory contains the CMake version, with the scripts adjusted so that it will run: /scratch2/BMC/gsienkf/Clara.Draper/gerrit-hera/jong2

This fails the test which, as you suggested, is likely because the compilation flags are different. However, my initial (very quick) effort to match the compilation flags (by changing FFLAG in the CSD_* directory) didn't generate the same answer as the CMake version. So my questions is: do you think it's worth trying to change the compilation flags so that we can get the same output from both make and CMake? If this is something that you think you can do relatively easily, then I think we should do it. However, if it winds up being complicated, then I'll just switch my test to use CMake only, and drop the old make method.

@ClaraDraper-NOAA The cmake compile options are from current version of jedi-soca. So, we should be ok to switch over to cmake. With GDASapp transition, there will be more chances to tune up the compile options. In addition, I freshly ran the experiment set with current make build. So, I can take time to play with make/cmake compile comparison. Please, go ahead to merge in the changes you need to switch to cmake build.

ClaraDraper-NOAA commented 1 year ago

Jong - are you suggesting that I merge this, but retain make as the default compile option for now? Then, later, once spack-stack, etc is settled down, we can remove make, and switch the CMake? I'm OK with that plan if you are.

I don't think I've really understood what your plans are for this PR. In the future, I'd like to make sure that we have an issue detailing what is being done in each PR, so that we can make sure we're all on the same page.

Thanks!

jkbk2004 commented 1 year ago

Jong - are you suggesting that I merge this, but retain make as the default compile option for now? Then, later, once spack-stack, etc is settled down, we can remove make, and switch the CMake? I'm OK with that plan if you are.

I don't think I've really understood what your plans are for this PR. In the future, I'd like to make sure that we have an issue detailing what is being done in each PR, so that we can make sure we're all on the same page.

Thanks!

Yes, I originally thought to merge in each feature per PR. cmake -> script update to remove make. I am ok if you add the script update to force the make option on deactivated status or completely remove the make option. Next good option might be to add a ctest feature as well. Then we can save time for validation of test cases.

ClaraDraper-NOAA commented 1 year ago

Jong - I'd like to chat quickly about this. I'm still not clear what your plans are. I just sent you a meeting invite for Friday - feel free to suggest another time if that doesn't suit you.

jkbk2004 commented 1 year ago

Jong - I'd like to chat quickly about this. I'm still not clear what your plans are. I just sent you a meeting invite for Friday - feel free to suggest another time if that doesn't suit you.

I am available to talk now if you like

ClaraDraper-NOAA commented 1 year ago

OK. I'll join the meeting link for our Friday meeting.

On Wed, Oct 19, 2022 at 4:03 PM JONG KIM @.***> wrote:

Jong - I'd like to chat quickly about this. I'm still not clear what your plans are. I just sent you a meeting invite for Friday - feel free to suggest another time if that doesn't suit you.

I am available to talk now if you like

— Reply to this email directly, view it on GitHub https://github.com/NOAA-PSL/land-offline_workflow/pull/19#issuecomment-1284620389, or unsubscribe https://github.com/notifications/unsubscribe-auth/AH7BYDYBCW2Z4OGYKXO5MKDWEBV4FANCNFSM6AAAAAARHSNPQ4 . You are receiving this because you were mentioned.Message ID: @.***>

--

Clara Draper (she/her) Physical Scientist NOAA Earth System Research Laboratories Physical Sciences Laboratory 325 Broadway, Boulder, CO DSRC, Rm 1D-105 +1 303 497 4467

jkbk2004 commented 1 year ago

so we need to merge in submodules first: vector2tile, ufs_land_driver, ims_proc, da_update. along with hash updates and reverting gitmodules. merging communication is needed to go sequentially.

ClaraDraper-NOAA commented 1 year ago

I've merged IMS_proc, vector2tile, and the ufs-land-driver. Can you please update your land-DAupdate gitmodule and hash?

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA I will follow up after lunch.

jkbk2004 commented 1 year ago

@ClaraDraper-NOAA you can merge in https://github.com/NOAA-PSL/land-DA_update/pull/9

ClaraDraper-NOAA commented 1 year ago

I've merged land-DAupdate. Can you please update your hash and gitmodules in this repo?

jkbk2004 commented 1 year ago

Here we go. Final check and you can merge here.

ClaraDraper-NOAA commented 1 year ago

Thanks Jong!