NCAR / DART_CASES

DART CASE directories from CESM experiments.

Request to rerun 2019-09-30 and 2019-10 to replace lost files. #86

Closed kdraeder closed 4 years ago

kdraeder commented 4 years ago

Since this is not the normal workflow (I'm rerunning a month from archived ICs), it's worth some extra scrutiny. If you have time and interest, check out /glade/work/raeder/Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 and /glade/scratch/raeder/f.e21.FHIST_BGC.f09_025.CAM6assim.011/run and/or ask me questions about how I set it up. I'm running a single day (in premium) first so that the October run matches the standard whole-month format. It's also less likely that 32 days would finish in 1 job, and it would be a little messier to have a job with 1 day of Sep and only some of the days of Oct. I'll run Oct in economy.

env_batch.xml

Runs the last day of 2019-09 in premium to create the ICs for 2019-10, so that 2019-10 can be set up sooner rather than later.

repack_project.csh

Added a helpful comment.

stage_cesm_files

Added a section to check for the presence of inflation restart files.

timhoar commented 4 years ago

I think you are still requesting 12 hours of wall clock for a single day (4 cycles of 6 hours):

0[1997] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > ./xmlquery --partial CONTINUE

Results in group run_begin_stop_restart
    CONTINUE_RUN: TRUE
    RESUBMIT_SETS_CONTINUE_RUN: TRUE
0[1998] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > ./xmlquery --partial RESUBMIT

Results in group run_begin_stop_restart
    RESUBMIT: 0
    RESUBMIT_SETS_CONTINUE_RUN: TRUE
0[1999] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > ./xmlquery --partial DOUT

Results in group run_data_archive
    DOUT_S: TRUE
    DOUT_S_SAVE_INTERIM_RESTART_FILES: FALSE

Results in group run_dout
    DOUT_S_ROOT: /glade/scratch/raeder/f.e21.FHIST_BGC.f09_025.CAM6assim.011/archive
0[2000] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > ./xmlquery --partial STOP

Results in group run_begin_stop_restart
    STOP_DATE: -999
    STOP_N: 6
    STOP_OPTION: nhours
0[2001] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > ./xmlquery --partial ASSIM

Results in group external_tools
    DATA_ASSIMILATION: ['CPL:FALSE', 'ATM:TRUE', 'LND:FALSE', 'ICE:FALSE', 'OCN:FALSE', 'ROF:FALSE', 'GLC:FALSE', 'WAV:FALSE']
    DATA_ASSIMILATION_CYCLES: 4
    DATA_ASSIMILATION_SCRIPT: /glade/work/raeder/Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011/assimilate.csh
0[2002] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > ./xmlquery --partial WALL

Results in group case.run
    JOB_WALLCLOCK_TIME: 12:00:00
    USER_REQUESTED_WALLTIME: 12:00

Results in group case.st_archive
    JOB_WALLCLOCK_TIME: 6:00:00
    USER_REQUESTED_WALLTIME: 6:00
0[2003] cheyenne6:/<3>Exp/f.e21.FHIST_BGC.f09_025.CAM6assim.011 > 

But outside of that, I think it all looks good.

timhoar commented 4 years ago

One more thing ... are there too many inflation files?

0[2002] cheyenne6:/<3>f.e21.FHIST_BGC.f09_025.CAM6assim.011/run > ls -l *priorinf*
-rw-r--r-- 1 raeder p86850054 85312292 Jul  7 18:44 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_mean.2019-09-30-00000.nc
-rw-r--r-- 1 raeder p86850054 85311988 Aug 11 10:08 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_mean.2019-12-31-64800.nc
-rw-r--r-- 1 raeder p86850054 85311988 Aug 11 10:14 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_mean.2020-01-01-00000.nc
-rw-r--r-- 1 raeder p86850054 85312292 Jul  7 18:44 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_sd.2019-09-30-00000.nc
-rw-r--r-- 1 raeder p86850054 85311988 Aug 11 10:08 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_sd.2019-12-31-64800.nc
-rw-r--r-- 1 raeder p86850054 85311988 Aug 11 10:14 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_sd.2020-01-01-00000.nc
-rw-r--r-- 1 raeder p86850054 85311988 Aug 11 10:08 input_priorinf_mean.nc
-rw-r--r-- 1 raeder p86850054 85311988 Aug 11 10:08 input_priorinf_sd.nc
0[2003] cheyenne6:/<3>f.e21.FHIST_BGC.f09_025.CAM6assim.011/run > 

I am looking at the assimilate script to see which one it picks up ... might not be picking up the right date.

timhoar commented 4 years ago

You will need to move the other inflation files out of the way ... the assimilate script will pick up the wrong one.

5.CAM6assim.011/run > ls -rt1 f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_mean* | tail -n 1
f.e21.FHIST_BGC.f09_025.CAM6assim.011.dart.rh.cam_output_priorinf_mean.2020-01-01-00000.nc
0[2004] cheyenne6:/<3>f.e21.FHIST_BGC.f09_025.CAM6assim.011/run > 
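
A minimal sketch of moving the other files aside, assuming a hypothetical helper `stash_other_inflation` and a hypothetical holding directory `inflation_stash` (neither is part of the case scripts). It keeps only the prior-inflation restart files stamped with the wanted time, so that a "most recent file" listing like the one above can only find that one. The `input_priorinf_*.nc` files do not match the pattern and are left alone.

```shell
# Hypothetical helper: move every prior-inflation restart file except the
# ones stamped with the wanted time into a holding directory.
# Usage: stash_other_inflation <run_dir> <YYYY-MM-DD-SSSSS>
stash_other_inflation() {
    local run_dir="$1" keep="$2" f
    local hold="${run_dir}/inflation_stash"   # assumed directory name
    mkdir -p "$hold"
    for f in "${run_dir}"/*.dart.rh.cam_output_priorinf_*.nc; do
        [ -e "$f" ] || continue               # glob matched nothing
        case "$f" in
            *".${keep}.nc") ;;                # wanted time stamp: leave in place
            *) mv -- "$f" "$hold/" ;;         # any other time stamp: move aside
        esac
    done
}
```

For this case the call would be something like `stash_other_inflation /glade/scratch/raeder/f.e21.FHIST_BGC.f09_025.CAM6assim.011/run 2019-09-30-00000`. Moving rather than deleting keeps the 2019-12-31 and 2020-01-01 files available if they are needed later.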

kdraeder commented 4 years ago

Good catches! Ben, I'm running the first day in premium because there is a little pressure to finish this data set so that we can submit proposal(s) for publication. I plan to run the full month in economy. That might mean a few days of waiting in the queue, but the difference in core-hours is much bigger for a full month.

I'll change the wall clock from 12 hours to 1, which was my intention. In the past the presubmit script chose a good time automatically (Cheyenne performance permitting), and that's what was stuck in my brain.
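
For reference, a sketch of how that change might be made with the CIME case tools, assuming it is run from the case directory and that the `case.run` group shown in the `xmlquery` output above is the one to change. The guard is only there so the snippet does nothing outside a case directory.

```shell
# Run from the case directory (CASEROOT); does nothing anywhere else.
if [ -x ./xmlchange ] && [ -x ./xmlquery ]; then
    # Request 1 hour for the single-day case.run job only;
    # case.st_archive keeps its own wall clock setting.
    ./xmlchange --subgroup case.run JOB_WALLCLOCK_TIME=01:00:00
    # Confirm the new value.
    ./xmlquery --partial WALL
else
    echo "not in a CIME case directory; nothing changed" >&2
fi
```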

Yes, I should clean out the extra inflation files. but

kdraeder commented 4 years ago

Doh, I got distracted and didn't proof the message before sending. 'but' didn't belong. I removed the extra inflation files, and made a note in my assimilate.csh that it should probably be upgraded to look for inflation files with a date, rather than looking (just) for the most recent files.
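
A rough sketch of what such a date-based lookup could look like, written in POSIX-style sh rather than csh for brevity. The function name `pick_inflation_file` and its arguments are hypothetical and this is not what assimilate.csh currently does. It assumes the file naming seen in the listing above and relies on the fixed-width `YYYY-MM-DD-SSSSS` stamp sorting in time order as plain text. It returns the newest file whose stamp is not later than the requested time, instead of the newest file by modification time.

```shell
# Hypothetical helper: print the inflation restart file with the latest
# filename time stamp that is not after the requested time.
# Usage: pick_inflation_file <run_dir> <case> <mean|sd> <YYYY-MM-DD-SSSSS>
pick_inflation_file() {
    local run_dir="$1" case_name="$2" kind="$3" target="$4"
    local prefix="${case_name}.dart.rh.cam_output_priorinf_${kind}."
    local f stamp best_stamp
    best_stamp=$(
        for f in "${run_dir}/${prefix}"*.nc; do
            [ -e "$f" ] || continue        # glob matched nothing
            stamp=${f##*/}                 # strip the directory
            stamp=${stamp#"$prefix"}       # strip the case/kind prefix
            stamp=${stamp%.nc}             # strip the extension
            printf '%s\n' "$stamp"
        done | LC_ALL=C sort |
            awk -v t="$target" '$0 "" <= t ""' |   # string compare: stamp <= target
            tail -n 1
    )
    [ -n "$best_stamp" ] || return 1       # no suitable file found
    printf '%s\n' "${run_dir}/${prefix}${best_stamp}.nc"
}
```

With the three `priorinf_mean` files listed earlier in the run directory, asking for `2019-09-30-21600` would select the `2019-09-30-00000` file, whereas the `ls -rt1 ... | tail -n 1` approach selects `2020-01-01-00000`.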

timhoar commented 4 years ago

OK - so the wallclock is set to 1 hour, the correct inflation file will get used ... I think it's good to go.