COSIMA / 01deg_jra55_iaf

Deprecated 0.1 degree ACCESS-OM2-01 configurations using JRA55-do IAF forcing.

cycle4 #11

Open hakaseh opened 3 years ago

hakaseh commented 3 years ago

@aekiss After setting up and testing master+bgc, I'm thinking of creating a branch cycle4 for our upcoming cycle 4 run. Let me know if you have a better branch name in mind.

For this, I'd like to create diag_table. Can you point me to the diag_table_source.yaml that has all the physical diagnostics for cycle 4? I can add BGC diagnostics to it.

Also, do you have any recommendation for how long a single 0.1-deg job should run? For physics only, it seems walltime is set to 5:00:00 for 3 months. If it's good to keep 3 months at a time, and BGC roughly doubles the cost, I could try 10:00:00 for 3 months?

hakaseh commented 3 years ago

I just noticed that there is already a branch for Cycle 4, https://github.com/COSIMA/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4.

Should I pull this and start adding BGC diagnostics and all other bgc settings?

aekiss commented 3 years ago

Yes, https://github.com/COSIMA/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4 is the branch intended for this purpose.

diag_table_source.yaml in that branch has the physical diagnostics from cycle3 so that is a good starting point for you to add BGC diags to.

There's a 5-hour walltime limit with this number of cores (https://opus.nci.org.au/display/Help/Queue+Limits), so I think we'll need to use 1mo runs, at least to start with.

aekiss commented 3 years ago

FYI the actual walltime for cycles 2 and 3 is in these tables:
https://github.com/COSIMA/01deg_jra55_iaf/blob/01deg_jra55v140_iaf_cycle2/run_summary_home_156_aek156_payu_01deg_jra55v140_iaf_cycle2.csv
https://github.com/COSIMA/01deg_jra55_iaf/blob/01deg_jra55v140_iaf_cycle3/run_summary_home_156_aek156_payu_01deg_jra55v140_iaf_cycle3.csv
It's under 3hr on average but gets close to 4hr when we need to reduce the timestep.

aekiss commented 3 years ago

If we have 1mo runs we'll need to set new_file_freq: 1 in the global section of diag_table_source.yaml

hakaseh commented 3 years ago

my test runs with master+bgc took just under 3 hours for a 1-month simulation, but cycle4 will take longer because of the substantially larger set of diagnostics.
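A back-of-envelope check of the 1-month choice, assuming the 5-hour queue limit quoted above and taking 3 hours per simulated month as an upper bound from the master+bgc test runs (the variable names are only illustrative):

```shell
# Rough estimate of how many whole simulated months fit in one job.
# Assumptions: 5-hour walltime limit, ~3 hours of walltime per simulated month.
walltime_limit_min=300     # 5:00:00 queue limit, in minutes
minutes_per_month=180      # "just under 3 hours" for 1 month, rounded up
months_per_job=$(( walltime_limit_min / minutes_per_month ))   # integer division
echo "months per job: ${months_per_job}"
```

With these numbers only one month fits in a job, and heavier diagnostics would shrink the margin further.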

aekiss commented 3 years ago

Wow, so 3x slower... What timestep did you use?

hakaseh commented 3 years ago

300s, i did not change it from the default. should i try increasing it?

aekiss commented 3 years ago

I guess you were starting from rest? If so, 300s is a good initial choice. 540s will probably only work after ~1.5yr, as it needs time to settle down from the IC. See timesteps in first cycle: https://github.com/COSIMA/01deg_jra55_iaf/blob/01deg_jra55v140_iaf/run_summary_home_156_aek156_payu_01deg_jra55v140_iaf.csv

So that makes it more like 1.7x slower if we take timestep into account. How many cores are you using?
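The 1.7x figure can be reproduced with a quick calculation, assuming the raw 3x slowdown was measured at a 300 s timestep and the physics-only runs it is compared against used 540 s (this is a reading of the numbers in the thread, not something stated explicitly):

```shell
# Slowdown per model timestep = raw walltime ratio scaled by the timestep ratio.
raw_slowdown=3     # BGC test vs physics-only walltime per simulated month
dt_bgc=300         # timestep (s) of the BGC test run
dt_physics=540     # timestep (s) assumed for the physics-only runs
adjusted=$(awk -v r="$raw_slowdown" -v a="$dt_bgc" -v b="$dt_physics" \
    'BEGIN { printf "%.2f", r * a / b }')
echo "timestep-adjusted slowdown: ${adjusted}x"
```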

hakaseh commented 3 years ago

yes, from rest, using the same number of cores as master of https://github.com/COSIMA/01deg_jra55_ryf

hakaseh commented 3 years ago

I haven't sorted out the walltime for Cycle 4 yet, but I have created a BGC-enabled branch for Cycle 4 in my fork: https://github.com/hakaseh/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4_bgc This should have all the changes needed to enable BGC runs.

The next step is to set up restart files for this run, which requires manually adding sea-ice BGC tracers to the sea-ice restart file. I have tested this with the 1-degree configuration in the past.

Will this be the restart directory used for the start of cycle 4?: /g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle3/restart731

aekiss commented 3 years ago

I've just merged master+bgc into 01deg_jra55v140_iaf_cycle4, so your https://github.com/hakaseh/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4_bgc could be merged into 01deg_jra55v140_iaf_cycle4 via a pull request when we're ready.

Yes, /g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle3/restart731 is the restart we'll start from. See https://github.com/COSIMA/01deg_jra55_iaf/blob/01deg_jra55v140_iaf_cycle3/run_summary_home_156_aek156_payu_01deg_jra55v140_iaf_cycle3.csv

metadata.yaml will need to be updated to document the BGC initial conditions used.

hakaseh commented 3 years ago

I actually created https://github.com/hakaseh/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4_bgc so that it can be merged into 01deg_jra55v140_iaf_cycle4 without conflicts. There will be conflicts in the BGC setup between master+bgc and cycle4, for example in diag_table.

I can do a pull request, but is it possible to try merging with the version of 01deg_jra55v140_iaf_cycle4 from before it was merged with master+bgc? I think this should be easier than resolving conflicts between master+bgc and https://github.com/hakaseh/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4_bgc.

I have updated metadata.yaml to include BGC initial conditions in https://github.com/hakaseh/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4_bgc.

hakaseh commented 3 years ago

Regarding the restart files for sea-ice BGC tracers, I added them and created the following new files based on /g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle3/restart731/ice/:

iced.2019-01-01-00000.nc, i2o.nc, o2i.nc.

The shell script to create these files is below (I couldn't attach *.sh here). Should I add this script to input_om2-bgc or create a separate github repo?

#!/bin/bash
# Add sea-ice BGC tracers to the restart files
set -euo pipefail

# Path to the directory containing the restart files to add BGC tracers to
path2restart=/g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle3/restart731/ice

# Name of the sea-ice restart file
filename=iced.2019-01-01-00000.nc

# Copy the restart file to the working directory
cp "${path2restart}/${filename}" .

# Add 2D fields, initialised to zero with the shape of iceumask.
# The ncap2 expression is quoted so the shell cannot glob-expand the "*".
for j in algalN nit
do
    ncap2 -O -s "${j}=iceumask*0" "${filename}" "${filename}"
done

# Add 3D fields, initialised to zero with the shape of aicen.
for j in bgc_N_sk bgc_Nit_sk
do
    ncap2 -O -s "${j}=aicen*0" "${filename}" "${filename}"
done

# Next, the coupling restart files i2o.nc and o2i.nc

filename=i2o.nc

# Copy the restart file to the working directory
cp "${path2restart}/${filename}" .

for j in wnd10_io nit_io alg_io
do
    ncap2 -O -s "${j}=licefh_io*0" "${filename}" "${filename}"
done

filename=o2i.nc

# Copy the restart file to the working directory
cp "${path2restart}/${filename}" .

for j in ssn_i ssalg_i
do
    ncap2 -O -s "${j}=sst_i*0" "${filename}" "${filename}"
done

aekiss commented 3 years ago

Re your previous post, I've already resolved the conflicts between master+bgc and 01deg_jra55v140_iaf_cycle4 when I merged them: https://github.com/COSIMA/01deg_jra55_iaf/network

The differences between 01deg_jra55v140_iaf_cycle4 and 01deg_jra55v140_iaf_cycle4_bgc are not that big: https://github.com/COSIMA/01deg_jra55_iaf/compare/01deg_jra55v140_iaf_cycle4..hakaseh:01deg_jra55v140_iaf_cycle4_bgc but there are a few conflicts that would need to be manually resolved.

You could try this in your fork by pulling the latest 01deg_jra55v140_iaf_cycle4 and either merging 01deg_jra55v140_iaf_cycle4 into 01deg_jra55v140_iaf_cycle4_bgc or vice-versa. Have you resolved merge conflicts before? I could do it if you prefer.
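One way to do this from a local clone of the fork is sketched below. The remote name upstream and the function name are assumptions for illustration; the branch names are the ones discussed above. The block only defines the function and does not run it.

```shell
# Merge the latest upstream cycle4 branch into the BGC branch of a fork.
# Run from inside a clone of the fork, where "origin" is the fork.
merge_cycle4_into_bgc() {
    # Register the COSIMA repo as "upstream" if it is not already a remote.
    git remote get-url upstream >/dev/null 2>&1 ||
        git remote add upstream https://github.com/COSIMA/01deg_jra55_iaf.git
    git fetch upstream &&
    git checkout 01deg_jra55v140_iaf_cycle4_bgc &&
    git merge upstream/01deg_jra55v140_iaf_cycle4
    # On conflicts, git stops here: edit the files it lists, then
    #   git add <resolved files> && git commit
    # and finally: git push origin 01deg_jra55v140_iaf_cycle4_bgc
}
```

Merging in this direction keeps the conflict resolution on the fork's branch, so the later pull request into 01deg_jra55v140_iaf_cycle4 should apply cleanly.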

aekiss commented 3 years ago

I think that script should be added to input_om2-bgc

hakaseh commented 3 years ago

I'm still not sure why we need to merge master+bgc into 01deg_jra55v140_iaf_cycle4 (and resolve conflicts) before merging 01deg_jra55v140_iaf_cycle4_bgc. Is it so that master+bgc will recognise 01deg_jra55v140_iaf_cycle4 and vice versa? Still learning and trying to understand the github workflow.

I have resolved the conflicts and pushed the commit. I will open a pull request now.

aekiss commented 3 years ago

No particular reason, it's just that I'd already merged master+bgc into 01deg_jra55v140_iaf_cycle4 before I realised you had a 01deg_jra55v140_iaf_cycle4_bgc branch.

hakaseh commented 3 years ago

I just added the script to https://github.com/COSIMA/input_om2-bgc. Could you test the script?
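A quick sanity check after running the script could look like the sketch below. It assumes ncdump (from the netCDF utilities) is on the PATH; the function name has_vars is made up for illustration, and the variable names in the usage comments are the ones the script adds.

```shell
# Return 0 if every named variable is declared in the header of a netCDF file.
has_vars() {
    file=$1; shift
    header=$(ncdump -h "$file") || return 2
    for v in "$@"; do
        # Variable declarations look like "<type> name(dims) ;" or "<type> name ;"
        printf '%s\n' "$header" | grep -Eq "[[:space:]]${v}(\(| ;)" || {
            echo "missing: ${v} in ${file}" >&2
            return 1
        }
    done
}
# Example usage after running the script:
# has_vars iced.2019-01-01-00000.nc algalN nit bgc_N_sk bgc_Nit_sk
# has_vars i2o.nc wnd10_io nit_io alg_io
# has_vars o2i.nc ssn_i ssalg_i
```

This only confirms that the variables exist; it does not check their values or dimensions.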

aekiss commented 3 years ago

It worked for me. I've just pushed up a few tweaks. finalise.sh now adds provenance info to the warm start files too.

hakaseh commented 3 years ago

great!

what's the next step? we need to create /g/data/ik11/inputs/access-om2/input_bgc_20211105/01deg-cycle4 and add restarts and we are ready for testing??

aekiss commented 3 years ago

Yes I think this is getting close! Did you want me to make the cycle 4 inputs?

Also, just to double-check (sorry, I probably already asked about most of these): did/should we resolve/do any of these?

hakaseh commented 3 years ago

yes, could you please make the cycle 4 input? (I'm not so familiar with the finalise.sh step, and since you made /g/data/ik11/inputs/access-om2/input_bgc_20211105/1deg, it would be good for consistency.) Note that only csiro_bgc.res.nc in the cycle 4 input differs from /g/data/ik11/inputs/access-om2/input_bgc_20211105/1deg.

hakaseh commented 3 years ago

Have a nice weekend 👋

aekiss commented 3 years ago

Thanks, you too - hope your weather there is better than here! I'm making csiro_bgc.res.nc for cycle 4 now.

aekiss commented 3 years ago

OK those initial conditions are now in /g/data/ik11/inputs/access-om2/input_bgc_20211105/01deg-cycle4. I made them with https://github.com/COSIMA/input_om2-bgc/tree/afad69d which has a slightly tweaked version of the notebook.

The restart files are here: /g/data/v45/aek156/input_om2-bgc/*.nc

hakaseh commented 3 years ago

Good morning @aekiss. Thanks for generating the input and restart files for cycle 4.

I've just pushed https://github.com/COSIMA/01deg_jra55_iaf/tree/01deg_jra55v140_iaf_cycle4 after adding the missing CICE diags in ice/cice_in.nml (https://github.com/COSIMA/access-om2/issues/250) and also setting f_fswthru_ai to 'md', which is the diag needed for Denisse's study.

hakaseh commented 3 years ago

@aekiss do you want to do the test run? i'm happy for you to, as you know 0.1-degree runs better than I do. but if you are busy, I can try.

aekiss commented 2 years ago

@hakaseh the 1-month test run seemed to work, and was quite quick, perhaps because I've removed the passive tracers:

======================================================================================
                  Resource Usage on 2021-11-19 13:17:18:
   Job Id:             31386601.gadi-pbs
   Project:            x77
   Exit Status:        0
   Service Units:      19693.52
   NCPUs Requested:    12144                  NCPUs Used: 12144
                                           CPU Time Used: 9253:30:09
   Memory Requested:   47.44TB               Memory Used: 12.08TB
   Walltime requested: 03:00:00            Walltime Used: 00:48:39
   JobFS requested:    24.71GB                JobFS used: 12.16MB
======================================================================================

It would be great if you could have a careful look at the way I've set up the run and also check that the BGC outputs look sensible.
control dir: /home/156/aek156/payu/01deg_jra55v140_iaf_cycle4
output: /scratch/v45/aek156/access-om2/archive/01deg_jra55v140_iaf_cycle4
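The Service Units figure in the resource usage summary above is consistent with NCPUs x walltime (hours) x a charge rate of 2 SU per core-hour. The rate is inferred from the numbers here rather than taken from NCI documentation, so treat it as an assumption:

```shell
# Reproduce the Service Units from the PBS summary: NCPUs x hours x charge rate.
ncpus=12144
walltime="00:48:39"
rate=2    # SU per core-hour; assumed, chosen because it matches the summary
su=$(echo "$walltime" | awk -F: -v n="$ncpus" -v r="$rate" \
    '{ hours = ($1 * 3600 + $2 * 60 + $3) / 3600; printf "%.2f", n * hours * r }')
echo "service units: ${su}"
```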

hakaseh commented 2 years ago

Awesome!! 😸

I don't seem to have access to the output directory.

(base) [hh0162@gadi-login-05 ~]$ ls /scratch/v45/aek156/access-om2/archive/01deg_jra55v140_iaf_cycle4/output732/
ls: cannot open directory '/scratch/v45/aek156/access-om2/archive/01deg_jra55v140_iaf_cycle4/output732/': Permission denied

aekiss commented 2 years ago

oops, try again, I've changed it to v45
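If "changed it to v45" refers to the group ownership of the archive directory (an assumption; the exact commands used are not given), the fix might look like this sketch. The helper name share_with_group is made up for illustration:

```shell
# Give a project group read access to a directory tree.
share_with_group() {
    dir=$1; group=$2
    chgrp -R "$group" "$dir" &&   # hand the tree to the project group
    chmod -R g+rX "$dir"          # group read; execute only on dirs/executables
}
# Example (path from this thread):
# share_with_group /scratch/v45/aek156/access-om2/archive/01deg_jra55v140_iaf_cycle4 v45
```

The capital X in g+rX adds execute permission only to directories and to files that are already executable, so data files do not become executable.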

hakaseh commented 2 years ago

good morning @aekiss i've looked at some of the ocean and sea ice diags using ncview and they all look reasonable to me 👍 i'll work on conservation check now and will get back to you later today.

hakaseh commented 2 years ago

the conservation check is still in progress, but while doing it i found an issue with the current implementation of the ice-to-ocean BGC flux: https://github.com/mom-ocean/MOM5/commit/d46ab0e604d9bdfb19ff1ce438e7de8d300073f8

This subroutine is called only when salt_restore_as_salt_flux = .true. Because the flux calculation sits inside this subroutine, the flux is not computed unless salt restoring is turned on. We usually have this on, but i'm fixing it now.

this should help with the conservation check too. i'm currently struggling with it, partly because i think salt restoring also restores BGC tracers, as formulated in the above subroutine, so switching it off should make the check easier.

aekiss commented 2 years ago

Well spotted. Looks like the added code will not be run even if the subroutine is called, since we use use_waterflux = .true. - see line 1347

hakaseh commented 2 years ago

oh wait, i was pointing to the wrong branch 💦 here is the correct one: https://github.com/hakaseh/mom/commit/3c34fd2b2708b51465aa0a0fbe3a9d2b8fb80704

aekiss commented 2 years ago

Ah ok, that makes more sense. So it looks like the old code didn't calculate ice-to-ocean flux, because we use use_waterflux = .true.? That should have been evident from conservation checks.

I've also commented on your commit: https://github.com/hakaseh/mom/commit/3c34fd2b2708b51465aa0a0fbe3a9d2b8fb80704#commitcomment-60659866

hakaseh commented 2 years ago

yes, correct. i noticed it when doing conservation checks previously.

my current attempt is to compute ice-to-ocean flux outside of virtual or salt restoring-related flux loops: https://github.com/hakaseh/mom/commit/dbc52a5aefa3bca893a64d978d90bfcc0b46a098

hakaseh commented 2 years ago

@aekiss The conservation check has been challenging, but I find that the difference in the drift of total mass (N+P+Z+D) in the ocean between runs with and without sea-ice BGC is negligible compared to the background level (attached). Therefore, I think it's ok to proceed as is.

[Attached image: Screen Shot 2021-11-24 at 9 56 56 am]

My test run indicates that this code change (https://github.com/hakaseh/mom/commit/dbc52a5aefa3bca893a64d978d90bfcc0b46a098) has no impact as long as salt_restore_as_salt_flux = .true., but for potential future application without salt restoring, could you please recompile the code reflecting that code change?

That should be it :)

aekiss commented 2 years ago

great, thanks @hakaseh! There have been some bug fixes on master, so before I compile this, can you merge master into iamip2-hh and push to github?

aekiss commented 2 years ago

when you've done that, please do a PR into mom-ocean:master so it will run the automatic checks

hakaseh commented 2 years ago

done it 👍

aekiss commented 2 years ago

thanks, all tests passed so I've compiled it here /g/data/ik11/inputs/access-om2/bin/fms_ACCESS-OM-BGC_13ea3a6_libaccessom2_0ab7295.x and updated config.yaml: https://github.com/COSIMA/01deg_jra55_iaf/commit/7c886f2cc9a5fc5ea98f7bfffdf9ddbe549310ab

hakaseh commented 2 years ago

awesome, thanks! i can update the master+bgc as well as cycle4.

hakaseh commented 2 years ago

oh, i see that you did it for cycle4. i'm happy to do it for master+bgc if you haven't done it.

aekiss commented 2 years ago

Hi @hakaseh, I've restarted cycle 4 with the new mom exe and it seems to work.
control dir: /home/156/aek156/payu/01deg_jra55v140_iaf_cycle4
output: /scratch/v45/aek156/access-om2/archive/01deg_jra55v140_iaf_cycle4

hakaseh commented 2 years ago

@aekiss i had a quick look and the BGC output looks reasonable. i compared stf07 between test-output732 and output732 and they are identical, so that's good (expected).

so we have officially started the run now? 🥳

PaulSpence commented 2 years ago

Hi Hakase and Andrew,

Exciting progress gentlemen :). I will kick off a 025 ryf run today as well.

Thank you Paul

hakaseh commented 2 years ago

@PaulSpence let me know if you need help setting up your run. master+bgc should be stable now.

The master+bgc update I offered to do earlier is done now (e.g. https://github.com/COSIMA/01deg_jra55_iaf/commit/3227e43d7724554938f65b32004568ea2fbced6c).

aekiss commented 2 years ago

Thanks @PaulSpence, I'm happy to help out. It should be set up and ready to go with

git clone https://github.com/COSIMA/025deg_jra55_ryf.git
cd 025deg_jra55_ryf
git checkout master+bgc

aekiss commented 2 years ago

@hakaseh the MOM master has been updated again https://github.com/mom-ocean/MOM5/pull/352 - if you could merge that into iamip2-hh I'll recompile it

hakaseh commented 2 years ago

@aekiss done it. do i need to do a PR?