Closed rsdunlapiv closed 4 years ago
I think that config_machine.xml also needs to be cleaned up. We need to keep only the platforms supported for UFS.
@arunchawla-NOAA @ligiabernardet @GeorgeGayno-NOAA I am trying to run and test the model at different resolutions, and I need the following information:
- The number of processors used for each case. I know C96 uses 150 by default, but what about the others?
- Namelist changes (input.nml, model_configure, and pre- and post-processing, if there are any) for each resolution.
In that case, I could use chgres to create input for the different resolutions.
I do not have this information. I hope others can chime in.
@ligiabernardet Thanks. I found some information for C768 in the following file:
https://github.com/NOAA-EMC/fv3gfs/blob/master/scripts/exglobal_fcst_nemsfv3gfs.sh
I am not sure whether those settings are valid for the current version of FV3. Anyway, I'll try those options, but it would be nice to have information about the namelist changes (number of I/O tasks, dt, other physics options, etc.) to test the model with different configurations.
@arunchawla-NOAA @climbfuji @DusanJovic-NOAA @junwang-noaa The model fails with the options that I found in exglobal_fcst_nemsfv3gfs.sh. It would be great to have a list of namelist options for each combination of resolution and CCPP suite (v15p2 and v16beta). Currently, I cannot test the model at different resolutions.
@KateFriedman-NOAA can you provide the namelist options that we use (input.nml, model_configure) for the different grid resolutions in the global workflow? @GeorgeGayno-NOAA and @WenMeng-NOAA, are there namelist options for chgres and UPP that change with resolution? If so, can you provide examples to @rsdunlapiv, @uturuncoglu, and @jedwards4b so that they can set it up for CIME?
We also need to know how the stochastic options vary with resolution. Thanks.
@ligiabernardet Do we need to keep the stochastic seed options constant when we restart the model? How are they handled by the model? What are the options to enable or disable stochastic physics?
Please direct stochastic physics questions to @pjpegion, but keep me in the loop so I can write the documentation. I am not familiar with the details of restarting with stochastic physics. Phil's draft documentation is at https://stochastic-physics.readthedocs.io/en/ufs_public_release/. My understanding is that it can be disabled with:
do_sppt = .F.
do_shum = .F.
do_skeb = .F.
do_sfcperts = .F.
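As a minimal sketch of the switches listed above, the following Python helper renders them as Fortran namelist assignments. The helper name `render_stochastic_off` is hypothetical, and the namelist group these variables belong to is not stated in this thread, so only the assignments are produced.

```python
# Hypothetical helper: renders the four switches mentioned above as
# Fortran namelist assignments. The enclosing namelist group is not
# named in this thread, so it is deliberately left out.
STOCHASTIC_SWITCHES = ("do_sppt", "do_shum", "do_skeb", "do_sfcperts")


def render_stochastic_off(switches=STOCHASTIC_SWITCHES):
    """Return one 'name = .F.' line per switch, joined by newlines."""
    return "\n".join(f"{name} = .F." for name in switches)


if __name__ == "__main__":
    print(render_stochastic_off())
```

The output would still need to be placed in the appropriate section of input.nml by hand or by the workflow.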
For chgres, the target or FV3 grid is set by these namelist options:
I believe @KateFriedman-NOAA has provided all required files.
@uturuncoglu there are no specific changes needed for stochastic physics when changing resolutions. If you want a bitwise-reproducible restart of a forecast that includes stochastic physics, you need to set FHSTOCH to the time (in forecast hours) at which you want the stochastic physics restart to be written out. (There is an update in master that allows the stochastic physics restart to be written out each time the atmospheric model's restart is written out.) This will generate a file stoch_out.F
@KateFriedman-NOAA we still need stable input.nml and model_configure files for each of the supported resolutions and physics combinations.
Questions to help me prep the namelist files:
1) Current operational namelist settings? Or current dev GFSv16 settings?
2) Where to post them? Here?
I will start assembling the ops namelists and adjust if v16 is needed.
Fanglin provided DTC with namelists for GFSv15p2 and GFSv16beta for the C768L64 configuration (those are the supported suites for this release). This is what we have been documenting and testing so far, and what we handed to the CIME folks. The main question is whether and how they should be changed with resolution.
https://docs.google.com/document/d/1K-n25HickouGz1wya6b4XeYUzJzQV6EMpEfiPF8Er5w/edit
https://docs.google.com/document/d/1qUT2IWmKMa64FRQKV6ut0meAG54HaaTo0nyi1hvMYzg/edit
Gotcha, I can provide the changes for resolution based on those provided namelists from Fanglin, thanks! I can note them in those two docs if you like (would need edit permissions). I'll collect them separately for now.
I made copies of those two docs (links below) and added values for the variables that change with resolution (as seen in the FV3GFS configs and scripts). Any variable that has a different value based on resolution is labeled like this:
A [B] [C] [D]
...where A is the C768 value (in black), B is the C384 value (in pink), C is the C192 value (in purple), and D is the C96 value (in orange). Also, if a value is easily calculable, I include that calculation in grey.
GFS v15.2 - https://docs.google.com/document/d/1EKc2mAld5VsrNjTRgqUcTVG1ZcEIkllA-NrAKUs4DWI/edit?usp=sharing
GFS v16 - https://docs.google.com/document/d/1bLbVdWgEIknDQZgTuOZ6IPVEGv5jUgOrCm4GrR96oBU/edit?usp=sharing
Let me know if additional info is needed.
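The `A [B] [C] [D]` labeling described above can be read mechanically. Here is a minimal sketch in Python, assuming each entry is a single line in exactly that form. The function name `split_by_resolution` is hypothetical, and the values in the usage note are made up for illustration, not taken from the linked documents.

```python
import re

# Order of the values in the "A [B] [C] [D]" notation, as described above.
RESOLUTIONS = ("C768", "C384", "C192", "C96")

# One unbracketed value followed by exactly three bracketed values.
_PATTERN = re.compile(
    r"^\s*([^\[\]]+?)\s*\[([^\]]*)\]\s*\[([^\]]*)\]\s*\[([^\]]*)\]\s*$"
)


def split_by_resolution(entry):
    """Map a string in 'A [B] [C] [D]' form to {resolution: value}."""
    match = _PATTERN.match(entry)
    if match is None:
        raise ValueError(f"not in 'A [B] [C] [D]' form: {entry!r}")
    return {res: val.strip() for res, val in zip(RESOLUTIONS, match.groups())}
```

For example, `split_by_resolution("10 [20] [30] [40]")` maps `C768` to `"10"` and `C96` to `"40"`. Values are kept as strings because some settings, such as layouts, are comma-separated pairs.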
I can access the original documents shared by @ligiabernardet but not the ones from @KateFriedman-NOAA.
@KateFriedman-NOAA Thanks, now I can access them.
Sweet, I have updated the earlier links.
BTW, why does the layout change between CCPP versions for the same resolution?
(There is an update in master that allows for the stochastic physics restart to be written out each time the atmospheric model's restart is written out).
@pjpegion Is this available in the current version of the FV3 ufs_release branch?
Not yet. I can merge it in tomorrow.
@KateFriedman-NOAA I got an email about the global-workflow: updates for GFSv15.2.7. If you don't mind, could you also put the input files for 2020 (i.e. global_co2historicaldata_2020.txt) on the FTP?
@uturuncoglu sure...just the 2020 file? These files were updated in the year-end CO2 file updates:
.../fix_am/co2dat_4a/global_co2historicaldata_2018.txt
.../fix_am/co2dat_4a/global_co2historicaldata_2019.txt_proj_u
.../fix_am/co2dat_4a/global_co2historicaldata_2020.txt_proj
.../fix_am/fix_co2_proj/global_co2historicaldata_2020.txt
.../fix_am/fix_co2_update/global_co2historicaldata_2019.txt
I see that Fanglin put the 2014 to 2019 files directly under the fix_am folder...are you expecting the 2020 file there too? There is already the projected file under fix_am.../fix_co2_proj:
fix_am.v20191213/fix_co2_proj/global_co2historicaldata_2020.txt
Yes. I think that I have the others. I am getting the globalco2historicaldata*.txt files from global/fix/fix_am.v20191213/fix_co2_proj and the rest of them from global/fix/fix_am.v20191213/. If any file has been updated, please put it on the FTP so that we have a consistent set of files. BTW, I am not sure about the versioning of the folders. Do you need to create another folder with a different date that has all the files? If so, I need to change the datestamp on the CIME side.
Ok I should have reread my emails from mid-December earlier...these new CO2 files are already in the set on the ftp server and are up-to-date. We got these files in early December and I copied them into our main FIX_DIR set right before Fanglin made that v20191213 set for the UFS release. I did some quick diffs to double check, they are indeed already up-to-date.
@uturuncoglu could you please post an update as to whether the namelist changes for each resolution provided by @KateFriedman-NOAA are working for you in CIME?
@KateFriedman-NOAA there was a question from @uturuncoglu about whether the atmosphere layout should change between v15.2 and v16 versions of physics. Can you please confirm that this should be the case?
@uturuncoglu @rsdunlapiv I am not familiar with CCPP (haven't worked with it yet) so this is a question for Fanglin Yang and Judy Henderson (I can't tag them in here for some reason).
@rsdunlapiv I am still working on the CIME side to make restart work properly for regular runs and also for tests. I have not found time to test the other resolutions yet, but I have already modified the namelist XML file and found a suitable configuration (layout, write groups, etc.) for Cheyenne, which has 36 cores per node.
@uturuncoglu I just merged the stochastic_physics master into ufs_public_release. The stochastic physics random patterns needed for restarting the model should now be written out each restart time. What you need to do at the namelist level is set FHSTOCH to the restart interval.
I will update the submodule pointer to stochastic_physics in my upcoming PR to the ufs_public_release branch of the ufs-weather-model.
@KateFriedman-NOAA @ligiabernardet @pjpegion I tested different resolutions on Cheyenne and made some changes to the processor counts to fit the runs on 36-core nodes. The results are as follows:
resolution | layout | write_groups | write_tasks_per_group | total pe | result |
---|---|---|---|---|---|
C96 | 4x4 | 1 | 12 | 108 | working |
C192 | 4x6 | 1 | 36 | 180 | working |
C384 | 6x6 | 1 | 36 | 252 | working |
C768 | 12x8 | 3 | 36 | 648 | fails |
C768 | 16x16 | 3 | 36 | 1644 | fails |
I have a problem with the C768 case: it fails in both tests, and the log file contains only the following:
165: calculating slp kr value
176: calculating slp kr value
166: calculating slp kr value
178: calculating slp kr value
177: calculating slp kr value
167: calculating slp kr value
MPT: shepherd terminated: r9i2n33.ib0.cheyenne.ucar.edu - job aborting
All the tests were done without threading at this point. I could try to increase the number of cores further for the C768 case, but if you have any other suggestions, just let me know.
@KateFriedman-NOAA @ligiabernardet @pjpegion Now I am trying to increase the I/O pool from 3x36 to 7x36. If that fails, I'll double the number of processors used.
I was able to run C768 with threading support. It seems the failure was a memory issue. The following configuration works fine on Cheyenne:
layout = 12,8
write_groups = 3
write_tasks_per_group = 36
atmos_nthreads = 2
All four resolutions are now running. C768 requires threading. @uturuncoglu will clean up logic in buildnml to set PE counts based only on the resolution. Error checking needs to be added to ensure that total PE count for the atmosphere is consistent with layout + write task settings in user_nl_ufsatm.
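The consistency check mentioned above could look roughly like the following sketch. It assumes the compute tasks are 6 cubed-sphere tiles times the layout, plus `write_groups * write_tasks_per_group` write tasks, which reproduces the totals in the C96, C192, C384 and 16x16 rows of the table above. The function names are hypothetical and this is not the actual buildnml logic.

```python
# Assumption: a global FV3 grid has 6 cubed-sphere tiles, each decomposed
# by layout = (layout_x, layout_y); the write component adds
# write_groups * write_tasks_per_group tasks on top of that.
TILES = 6


def expected_pe_count(layout, write_groups, write_tasks_per_group):
    """Total MPI tasks implied by the layout and write-task settings."""
    layout_x, layout_y = layout
    compute_tasks = TILES * layout_x * layout_y
    write_tasks = write_groups * write_tasks_per_group
    return compute_tasks + write_tasks


def check_pe_count(total_pe, layout, write_groups, write_tasks_per_group):
    """Raise ValueError if total_pe disagrees with the namelist settings."""
    expected = expected_pe_count(layout, write_groups, write_tasks_per_group)
    if total_pe != expected:
        raise ValueError(
            f"total PE count {total_pe} does not match layout {layout} with "
            f"{write_groups} write group(s) of {write_tasks_per_group} "
            f"task(s): expected {expected}"
        )
```

A check like this would run after the user's settings in user_nl_ufsatm are merged, so that a mismatch is reported before the job is submitted.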
@climbfuji what resolutions are expected to work on a Mac laptop?
On a Mac, I would only want to run C96. I've tried running two C96 setups in parallel, and this drained the resources on my 16GB RAM machine, which makes me assume that C192 won't work. But users owning a Mac Pro (the development power station) will be able to run C192 for sure.
BTW I don't understand why C768 works only with threading turned on on Cheyenne, this seems to be suspicious to me.
I suspect that it runs out of memory; threading reduces the memory required per node. I can run additional tests to confirm.
Got it, this makes sense.
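The memory argument above can be illustrated with simple arithmetic. This is a sketch only, assuming each thread occupies one core of a 36-core Cheyenne node. The per-node memory is a parameter supplied by the caller because the thread does not state it, and both function names are hypothetical.

```python
def tasks_per_node(cores_per_node, threads_per_task):
    """MPI tasks that fit on one node when each thread gets its own core."""
    return cores_per_node // threads_per_task


def memory_per_task(node_memory_gb, cores_per_node, threads_per_task):
    """Share of node memory available to each MPI task, in GB."""
    return node_memory_gb / tasks_per_node(cores_per_node, threads_per_task)
```

With `atmos_nthreads = 2`, `tasks_per_node(36, 2)` gives 18 tasks per node instead of 36, so each task has twice the memory share it would have without threading.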
@rsdunlapiv can this ticket be closed?
Done