noaa-oar-arl / UFS-Aerosol-Config

NOAA OAR repository of UFS-Aerosol configuration files and cases
MIT License

P7.1 preliminary experiments #14

Open rmontuoro opened 3 years ago

rmontuoro commented 3 years ago

Description:

Model Updates

As code changes are being reviewed and merged in the UFS weather model authoritative repository, I have created two feature branches in my ufs-weather-model fork including the latest model updates to begin running preliminary P7.1 experiments:

Both of these branches also include:

Please check out either of those branches as follows:

git clone -b feature/p7.1 --recursive https://github.com/rmontuoro/ufs-weather-model.git

or:

git clone -b feature/p7.1-gocart-dev --recursive https://github.com/rmontuoro/ufs-weather-model.git

Workflow Updates

To start testing the branches above with a preliminary (fully-coupled) P7.1 configuration, I've also created a new global workflow branch in my forked repository. This branch supersedes the current feature/aerosols and includes a draft P7.1 case file:

You may check out this global workflow update as:

git clone -b feature/p7.1 --recursive https://github.com/rmontuoro/global-workflow.git    

To use one of the branches above, please download the source code as follows:

cd sorc/
sh checkout.sh -c -r <branch> -u https://github.com/rmontuoro/ufs-weather-model.git

then build using option -c for either atmosphere-aerosol or fully-coupled (P7.1) configurations, or -a only for atmosphere-aerosols runs:

sh build_all.sh -c

Aerosol Configuration Files

Updated configuration files for GOCART 2.0 rc1 (ufs-weather-model branch feature/p7.1-gocart-dev above) are available in the rc.hera folder of this repository's branch feature/p7.1-gocart-dev.

These files can be used as a template for P7.1 experiments. Note that brown carbon (br) has been disabled in current configurations. However, the component can be turned on for testing purposes using the following settings in GOCART2G_GridComp.rc:

ACTIVE_INSTANCES_CA:   CA.oc  CA.bc  CA.br

and configuration file CA2G_instance_CA.br.rc. Two additional tracers also need to be added as done for the BC and OC components.
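Before a test run, one could confirm that CA.br is actually listed on the ACTIVE_INSTANCES_CA line. The sketch below is illustrative only: the helper name br_is_active and the two throwaway files are made up for the example, and it checks nothing beyond that one line (it does not verify CA2G_instance_CA.br.rc or the additional tracers).

```shell
#!/bin/sh
# Sketch only: succeed if CA.br appears on the ACTIVE_INSTANCES_CA line.
# br_is_active is a hypothetical helper, not part of GOCART or the workflow.
br_is_active() {
    grep -E '^[[:space:]]*ACTIVE_INSTANCES_CA:' "$1" | grep -q 'CA\.br'
}

# Demonstration on two throwaway files with illustrative content.
with_br="$(mktemp)"
without_br="$(mktemp)"
printf '%s\n' 'ACTIVE_INSTANCES_CA:   CA.oc  CA.bc  CA.br' > "$with_br"
printf '%s\n' 'ACTIVE_INSTANCES_CA:   CA.oc  CA.bc' > "$without_br"

for f in "$with_br" "$without_br"; do
    if br_is_active "$f"; then
        echo "brown carbon enabled"
    else
        echo "brown carbon disabled"
    fi
done
```

In practice the function would be pointed at the GOCART2G_GridComp.rc in the run directory instead of the temporary files.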

bbakernoaa commented 3 years ago

Amazing Raffaele!

I'll try to get a run going sometime this weekend, testing the before and after (both including the fengsha fix).

zhanglikate commented 3 years ago

@rmontuoro Thanks for the updates. I will try the feature/p7.1-gocart-dev branch. Do you have an example of the new *rc file folder? Thanks.

lipan-NOAA commented 3 years ago

Just a few questions about rc.hera:

  1. For fengsha, alpha is 0.4; the previous value was 0.7?
  2. For Seasalt, the emission_scale is 0.613 0.613 0.613 0.429 0.429 0.429; the previous value was 1.0 for all?
  3. In GOCART2G_GridComp.rc, wavelengths_for_profile_aop_in_nm: 470 550 870; should this be wavelengths_for_profile_aop_in_nm: 470 550 670 870?

Li

bbakernoaa commented 3 years ago

All, I’m putting together new rc files following Raffaele’s branch. I’ll push them today or tomorrow morning.


rmontuoro commented 3 years ago

@lipan-NOAA - Please use rc files provided in branch feature/p7.1-gocart-dev of this repository, as well as the latest model updates in branch feature/p7.1-gocart-dev, as mentioned above.

@bbakernoaa - thanks for adding new rc files to my branch. Everyone, please use these updated files from now on.

lipan-NOAA commented 3 years ago

I got this kind of error message when I ran gfs.forecast.highres:

FATAL from PE 199: check_nml_error in fms_mod: Unknown namelist, or mistyped namelist variable in namelist atmos_model_nml, (IOSTAT = 19 )

Rundir: /scratch2/NCEPDEV/stmp1/Li.Pan/chem202107/firex
Sourcedir: /scratch2/NCEPDEV/naqfc/Li.Pan/save/aerosol-workflow

Thanks,

Li

rmontuoro commented 3 years ago

@lipan-NOAA - I've updated branch feature/p7.1 of the global workflow to resolve this issue, which originated from namelist settings inconsistent with recent updates of FV3.

zhanglikate commented 3 years ago

All, I’m putting together new rc files following Raffaele’s branch. I’ll push them today or tomorrow morning.

@bbakernoaa Did you push your updates of alpha and gamma? What I see now is 0.7 and 1.0. Thanks.

lipan-NOAA commented 3 years ago

Hello Raffaele,

After updating, I encountered two problems:

  1. The fcst job failed because of:

    slurmstepd: error: STEP 21566043.0 ON h1c20 CANCELLED AT 2021-08-12T18:44:21
    application called MPI_Abort(comm=0x84000002, 1) - process 166
    application called MPI_Abort(comm=0x84000002, 1) - process 193
    srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
    srun: error: h1c28: tasks 40,55,72: Killed
    srun: launch/slurm: _step_signal: Terminating StepId=21566043.0

  2. The ics file failed because of:

    echo 'FATAL: Unable to copy /scratch2/NCEPDEV/stmp1/Li.Pan/CPC3Dvar/2019070100/ocn/025/MOM.nc to /scratch1/NCEPDEV/stmp4/Li.Pan/COM/firex/FV3ICS/2019070100/ocn/ (Error code 1)'
    FATAL: Unable to copy /scratch2/NCEPDEV/stmp1/Li.Pan/CPC3Dvar/2019070100/ocn/025/MOM.nc to /scratch1/NCEPDEV/stmp4/Li.Pan/COM/firex/FV3ICS/2019070100/ocn/ (Error code 1)

Thanks,

Li

lipan-NOAA commented 3 years ago

I tried prototype7.1: ics and gfs.prep succeeded, but the forecast failed. The log file is in /scratch1/NCEPDEV/stmp4/Li.Pan/COM/pro71/logs/2013040100.

Li

rmontuoro commented 3 years ago

/scratch1/NCEPDEV/stmp4/Li.Pan/COM/pro71/logs/2013040100

@lipan-NOAA - Your input emissions in AERO_ExtData.rc are defined on 64 layers while you are running the model with 127 layers. ExtData cannot vertically interpolate input data (see error below):

pe=00549 FAIL at line=01181    MAPL_ExtDataGridCompMod.F90              <Surface pressure not present for vertical interpolation>
pe=00549 FAIL at line=01840    MAPL_Generic.F90                         <needs informative message>
pe=00549 FAIL at line=00784    MAPL_CapGridComp.F90                     <status=1>

Please use 127-layer emissions (look in AERO_ExtData.rc and replace L64 with L127 and z64 with z127).
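The substitution described above can be sketched in shell as follows. This is a minimal illustration, not a tested workflow step: the helper name to_127_layers and the sample file content are hypothetical, and a plain text replacement would also touch any unrelated string that happens to contain L64 or z64, so the result should be reviewed against the backup with diff.

```shell
#!/bin/sh
# Sketch only: switch emission entries from 64 to 127 layers in an rc file.
# to_127_layers is a hypothetical helper name, not part of the workflow.
to_127_layers() {
    rc_file="$1"
    cp "$rc_file" "$rc_file.bak"      # keep a backup for review with diff
    # Replace L64 -> L127 and z64 -> z127, as suggested in this thread.
    sed -e 's/L64/L127/g' -e 's/z64/z127/g' "$rc_file.bak" > "$rc_file"
}

# Demonstration on a throwaway file with made-up content.
demo_rc="$(mktemp)"
printf '%s\n' 'EXAMPLE_VAR  /fake/path/emis_L64/file_z64.nc' > "$demo_rc"
to_127_layers "$demo_rc"
cat "$demo_rc"
```

For a real run the function would be called on the AERO_ExtData.rc in use, followed by `diff AERO_ExtData.rc.bak AERO_ExtData.rc` to confirm that only the intended entries changed.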

bbakernoaa commented 3 years ago

@zhanglikate I have not pushed them yet. I was testing them with the new dev branch and ran into the same issue you did. Once it is resolved I'll push them. I'll have AERO_ExtData.rc files for different emission scenarios too: HTAP, CEDS 2014 + HTAP, and CEDS 2019 + HTAP.

lipan-NOAA commented 3 years ago

@rmontuoro - I modified AERO_ExtData.rc and ran again. The fcst job failed again. The log file is in /scratch1/NCEPDEV/stmp4/Li.Pan/COM/pro71/logs/2013040100/gfs.forecast.highres.log.

zhanglikate commented 3 years ago

@rmontuoro

    In the ush/forecast_postdet.sh file, I added back the following lines from the old workflow:

    # If the appropriate resolution fix file is not present, use the highest resolution available (T1534)
    [[ ! -f $FNALBC ]] && FNALBC="$FIX_AM/global_snowfree_albedo.bosu.t1534.3072.1536.rg.grb"
    [[ ! -f $FNVETC ]] && FNVETC="$FIX_AM/global_vegtype.igbp.t1534.3072.1536.rg.grb"
    [[ ! -f $FNSOTC ]] && FNSOTC="$FIX_AM/global_soiltype.statsgo.t1534.3072.1536.rg.grb"
    [[ ! -f $FNABSC ]] && FNABSC="$FIX_AM/global_mxsnoalb.uariz.t1534.3072.1536.rg.grb"
    [[ ! -f $FNSMCC ]] && FNSMCC="$FIX_AM/global_soilmgldas.statsgo.t1534.3072.1536.grb"

The previous error is gone; however, the run still crashes with the output shown below:

The log file is /scratch1/BMC/gsd-fv3-dev/NCEPDEV/global/Kate.Zhang/fv3gfs/comrot/NASA_C96_fire/logs/2016071500/gfs.forecast.highres.log

dbgx --fixratio: F F F F
CA cubic mosaic domain decomposition
whalo = 1, ehalo = 1, shalo = 1, nhalo = 1
X-AXIS = 160 160 160 160 160 160
Y-AXIS = 120 120 120 120 120 120 120 120
end of assign_importdata
forrtl: severe (174): SIGSEGV, segmentation fault occurred
longjmp causes uninitialized stack frame : /scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model terminated
======= Backtrace: =========
/lib64/libc.so.6(fortify_fail+0x37)[0x2b99f06fc697]
/lib64/libc.so.6(+0x1185ad)[0x2b99f06fc5ad]
/lib64/libc.so.6(__longjmp_chk+0x29)[0x2b99f06fc509]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x4434602]
/lib64/libpthread.so.0(+0xf630)[0x2b99f01c1630]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x396c04a]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x396f942]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x396e84f]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x3960ebd]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x395a699]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x393845d]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x381d434]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x36b1058]
/scratch2/BMC/gsd-fv3-dev/NCEPDEV/stmp3/Kate.Zhang/RUNDIRS/NASA_C96_fire/2016071500/gfs/fcst.267402/ufs_model[0x36af301]
/apps/intel/parallel_studio_xe_2018.4.057/compilers_and_libraries_2018/linux/compiler/lib/intel64/libiomp5.so(kmp_invoke_microtask+0x93)[0x2b99ee5c9a43]

zhanglikate commented 3 years ago

The *.rc files for feature/p7.1 can be found in the develop branch at https://github.com/noaa-oar-arl/UFS-Aerosol-Config/tree/develop/rc.hera

bbakernoaa commented 3 years ago

Once we get the p7.1-gocart-dev working I'll merge the develop branch with it and add the additional speciation we need for AERO_ExtData.rc and other changes needed.

bbakernoaa commented 3 years ago

@rmontuoro @zhanglikate @lipan-NOAA

I have put together some rc files for the feature/p7.1-gocart-dev branch. These needed several updates but have been included here and have been tested to work with the newest branch.

Please use the feature/p7.1-gocart-dev branch of config files here.

@zhanglikate can you point me to your latest case.yaml file for Atom1? I'm going to put them together for each case for both fully coupled and atmosphere-aerosol only.

zhanglikate commented 3 years ago

@bbakernoaa Thanks.

zhanglikate commented 3 years ago

@bbakernoaa I never changed the case file in my runs when setting up the workflow job submission path. For the wet scavenging values, I just modified config.fcst; all of the values used in my recent experiments are summarized in https://docs.google.com/document/d/1EWZV4VOvvx5d4rpjXp5dGJlP7Pa67HuA_GaC9xuGugw/edit

lipan-NOAA commented 3 years ago

@bbakernoaa @zhanglikate @rmontuoro

Yesterday, I ran a test case for prototype7.1. The simulation start date is 2017-07-15. The run duration is 840 hours (35 days). The aerosol initial mixing ratio is set to zero. This run can provide initial conditions for 2017-08-01 and 2017-08-15.

The run failed after several tries, although it did get through about 15 days.

RUNDIR = /scratch2/NCEPDEV/stmp1/Li.Pan/chem202108/pro71
LOGDIR = /scratch2/NCEPDEV/stmp1/Li.Pan/COM/pro71/logs/2017071500/gfs.forecast.highres.log

UFS Aerosols: Advancing from 2017-07-30T09:30:00 to 2017-07-30T09:35:00
in fcst run phase 2, na= 4434
zeroing coupling accumulated fields at kdt= 4435
PASS: fcstRUN phase 2, na = 4434 time is 66.8428111076355
n fv3_cap,in model run, advance,na= 4435

-->Advancing WAV from: 2017 7 30 9 35 0 0
-----------------> to: 2017 7 30 9 40 0 0
srun: error: h4c49: task 473: Killed
srun: launch/slurm: _step_signal: Terminating StepId=22782480.0
slurmstepd: error: STEP 22782480.0 ON h4c03 CANCELLED AT 2021-09-08T02:13:26
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 0000000005A4F9DF Unknown Unknown Unknown
libpthread-2.17.s 00002B11E19B8630 Unknown Unknown Unknown
libpthread-2.17.s 00002B11E19B5570 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 0000000005A4F9DF Unknown Unknown Unknown
libpthread-2.17.s 00002B75B65F4630 Unknown Unknown Unknown
libmpi.so.12.0 00002B75B5D031D9 Unknown Unknown Unknown
libmpi.so.12.0 00002B75B5D0102E Unknown Unknown Unknown
libmpi.so.12.0 00002B75B5CCABD0 Unknown Unknown Unknown
libmpi.so.12 00002B75B5A8EBC8 PMPIDI_CH3I_Progr Unknown Unknown
libmpi.so.12.0 00002B75B5CCB8FD Unknown Unknown Unknown
libmpi.so.12 00002B75B5D7CC42 PMPI_Probe Unknown Unknown
ufs_model 000000000082752A _ZN5ESMCI3VMK4rec 4206 ESMCI_VMKernel.C
ufs_model 0000000000F10109 _ZN5ESMCI3XXE4exe 4067 ESMCI_DELayout.C
ufs_model 0000000000F0E878 _ZN5ESMCI3XXE4exe 5391 ESMCI_DELayout.C
ufs_model 0000000000F0E878 _ZN5ESMCI3XXE4exe 5391 ESMCI_DELayout.C
ufs_model 000000000133C0B9 _ZN5ESMCI11ArrayB 1676 ESMCI_ArrayBundle.C
ufs_model 0000000000A27852 c_esmc_arraybundl 717 ESMCI_ArrayBundle_F.C
ufs_model 000000000074FF82 esmf_arraybundlem 2945 ESMF_ArrayBundle.F90
ufs_model 000000000071276A esmf_fieldbundlem 16559 ESMF_FieldBundle.F90
ufs_model 0000000000DE10B7 nuopc_connector_m 6250 NUOPC_Connector.F90
ufs_model 000000000096EF8E _ZN5ESMCI6FTable1 2036 ESMCI_FTable.C
ufs_model 0000000000972BD6 ESMCI_FTableCallE 765 ESMCI_FTable.C
ufs_model 000000000075700A _ZN5ESMCI2VM5ente 1211 ESMCI_VM.C
ufs_model 0000000000970627 c_esmc_ftablecall 922 ESMCI_FTable.C
ufs_model 00000000007DF411 esmf_compmod_mp_e 1214 ESMF_Comp.F90
ufs_model 0000000000B5AD44 esmf_cplcompmod_m 1641 ESMF_CplComp.F90
ufs_model 000000000076ADC6 nuopc_driver_mp_r 3368 NUOPC_Driver.F90
ufs_model 000000000076C9CA nuopc_driver_mp_e 3565 NUOPC_Driver.F90
ufs_model 0000000001124A3F _ZNK5ESMCI13Metho 333 ESMCI_MethodTable.C
ufs_model 00000000011249C2 _ZN5ESMCI11Method 519 ESMCI_MethodTable.C
ufs_model 0000000001122EC6 c_esmc_methodtabl 273 ESMCI_MethodTable.C
ufs_model 0000000000961750 esmf_attachmethod 1284 ESMF_AttachMethods.F90
ufs_model 000000000076A3DF nuopc_driver_mp_r 3234 NUOPC_Driver.F90
ufs_model 000000000096EF8E _ZN5ESMCI6FTable1 2036 ESMCI_FTable.C
ufs_model 0000000000972BD6 ESMCI_FTableCallE 765 ESMCI_FTable.C
ufs_model 000000000075700A _ZN5ESMCI2VM5ente 1211 ESMCI_VM.C
ufs_model 0000000000970627 c_esmc_ftablecall 922 ESMCI_FTable.C
ufs_model 00000000007DF411 esmf_compmod_mp_e 1214 ESMF_Comp.F90
ufs_model 0000000000B62834 esmfgridcompmod 1889 ESMF_GridComp.F90
ufs_model 000000000041D016 MAIN 441 MAIN_NEMS.F90
ufs_model 000000000041BB1E Unknown Unknown Unknown
libc-2.17.so 00002B75B6A39555 libc_start_main Unknown Unknown
ufs_model 000000000041BA29 Unknown Unknown Unknown
forrtl: error (78): process killed (SIGTERM)

zhanglikate commented 3 years ago

@rmontuoro @bbakernoaa Is Li Pan's test the one you asked us to run at last Friday's meeting (in the table)? If so, can you help fix this issue? Also, can you give me more details about these cases, for instance the dates? I think it would be better to have a table listing who is responsible for which part, so that we do not duplicate testing. Thanks.

lipan-NOAA commented 3 years ago

@bbakernoaa @zhanglikate @rmontuoro I am not sure which model causes this crash. I will set up a similar simulation with the "atmosphere only" case to see whether the model can complete successfully.

zhanglikate commented 3 years ago

@lipan-NOAA For this fully coupled run, which case file did you use?

zhanglikate commented 3 years ago

@rmontuoro For the P7 testing that we discussed last Friday, which case file do you suggest we use?

lipan-NOAA commented 3 years ago

@zhanglikate

/scratch2/NCEPDEV/naqfc/Li.Pan/save/global-workflow/workflow/cases/prototype7.1.yaml

lipan-NOAA commented 3 years ago

@rmontuoro @bbakernoaa @zhanglikate

To figure out why the model died in prototype7.1, I ran a test using the firex settings (without coupling, no cycles) for 35 days. The forecast job failed after 5 days. The log file is /scratch2/NCEPDEV/stmp1/Li.Pan/COM/firexs2s/logs/2019060100.

A similar error message:

CA cubic mosaic domain decomposition
whalo = 1, ehalo = 1, shalo = 1, nhalo = 1
X-AXIS = 320 320 320 320 320 320 320 320 320 320 320 320
Y-AXIS = 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240
end of assign_importdata
PASS: fcstRUN phase 1, na = 1665 time is 20.0535998344421
UFS Aerosols: Advancing from 2019-06-06T18:45:00 to 2019-06-06T18:50:00
srun: error: h20c24: task 698: Killed
srun: launch/slurm: _step_signal: Terminating StepId=22813362.0
slurmstepd: error: STEP 22813362.0 ON h5c17 CANCELLED AT 2021-09-08T21:29:16
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 0000000005A4F9DF Unknown Unknown Unknown
libpthread-2.17.s 00002AF4E3A55630 Unknown Unknown Unknown
libmpi.so.12 00002AF4E2EEDC9E PMPIDI_CH3I_Progr Unknown Unknown
libmpi.so.12.0 00002AF4E312C8FD Unknown Unknown Unknown
libmpi.so.12 00002AF4E31DDC42 PMPI_Probe Unknown Unknown
ufs_model 000000000082752A _ZN5ESMCI3VMK4rec 4206 ESMCI_VMKernel.C
ufs_model 0000000000F10109 _ZN5ESMCI3XXE4exe 4067 ESMCI_DELayout.C
ufs_model 0000000000F0E878 _ZN5ESMCI3XXE4exe 5391 ESMCI_DELayout.C
ufs_model 000000000133C0B9 _ZN5ESMCI11ArrayB 1676 ESMCI_ArrayBundle.C
ufs_model 0000000000A27852 c_esmc_arraybundl 717 ESMCI_ArrayBundle_F.C
ufs_model 000000000074FF82 esmf_arraybundlem 2945 ESMF_ArrayBundle.F90
ufs_model 000000000071276A esmf_fieldbundlem 16559 ESMF_FieldBundle.F90
ufs_model 00000000007120F0 esmf_fieldbundlem 15308 ESMF_FieldBundle.F90
ufs_model 000000000284610D fv3gfs_cap_mod_mp 1116 fv3_cap.F90
ufs_model 0000000001124A3F _ZNK5ESMCI13Metho 333 ESMCI_MethodTable.C
ufs_model 00000000011249C2 _ZN5ESMCI11Method 519 ESMCI_MethodTable.C
ufs_model 0000000001125020 c_esmc_methodtabl 303 ESMCI_MethodTable.C
ufs_model 0000000000961669 esmf_attachmethod 1277 ESMF_AttachMethods.F90
ufs_model 00000000059A0FC9 nuopc_modelbase_m 2788 NUOPC_ModelBase.F90
ufs_model 000000000096EF8E _ZN5ESMCI6FTable1 2036 ESMCI_FTable.C
ufs_model 0000000000972BD6 ESMCI_FTableCallE 765 ESMCI_FTable.C
ufs_model 000000000075700A _ZN5ESMCI2VM5ente 1211 ESMCI_VM.C
ufs_model 0000000000970627 c_esmc_ftablecall 922 ESMCI_FTable.C
ufs_model 00000000007DF411 esmf_compmod_mp_e 1214 ESMF_Comp.F90
ufs_model 0000000000B62834 esmfgridcompmod 1889 ESMF_GridComp.F90
ufs_model 000000000076DC99 nuopc_driver_mp_r 3313 NUOPC_Driver.F90
ufs_model 000000000076D4F5 nuopc_driver_mp_e 3583 NUOPC_Driver.F90
ufs_model 0000000001124A3F _ZNK5ESMCI13Metho 333 ESMCI_MethodTable.C

zhanglikate commented 3 years ago

@rmontuoro For the ATM-only case, did you try a free run longer than 10 days for C384L127 when you released the P7.1 version? I can try a quick 10-day ATM test for C96L64, similar to Li Pan's test. It will be very quick.

lipan-NOAA commented 3 years ago

@zhanglikate my run is C384L127

zhanglikate commented 3 years ago

@lipan-NOAA There is no problem for C96L64 in a 10-day free run. I will try C384L127.

zhanglikate commented 3 years ago

@rmontuoro @bbakernoaa @lipan-NOAA, @bbakernoaa did your run finish? My ATM-only run with the aerosol case file crashed on the 7th day with an error similar to the one in Li Pan's run; see below. My log file is in /scratch2/NCEPDEV/naqfc/Kate.Zhang/fv3gfs/comrot/DEV_C384_fire/logs/2016060100/gfs.forecast.highres.log

whalo = 1, ehalo = 1, shalo = 1, nhalo = 1
X-AXIS = 320 320 320 320 320 320 320 320 320 320 320 320
X-AXIS = 320 320 320 320 320 320 320 320 320 320 320 320
Y-AXIS = 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240
Y-AXIS = 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240
end of assign_importdata
PASS: fcstRUN phase 1, na = 1170 time is 2.72574901580811
UFS Aerosols: Advancing from 2016-06-07T02:15:00 to 2016-06-07T02:22:30
srun: error: Node failure on h25c01
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: STEP 22854424.0 ON h13c03 CANCELLED AT 2021-09-09T21:44:08 DUE TO NODE FAILURE, SEE SLURMCTLD LOG FOR DETAILS
slurmstepd: error: JOB 22854424 ON h13c03 CANCELLED AT 2021-09-09T21:44:08 DUE TO NODE FAILURE, SEE SLURMCTLD LOG FOR DETAILS
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 00000000043733FF Unknown Unknown Unknown
libpthread-2.17.s 00002B3A73A98630 Unknown Unknown Unknown
libpthread-2.17.s 00002B3A73A95573 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 00000000043733FF Unknown Unknown Unknown
libpthread-2.17.s 00002B338010E630 Unknown Unknown Unknown
libpthread-2.17.s 00002B338010B573 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 00000000043733FF Unknown Unknown Unknown
libpthread-2.17.s 00002B4F0755E630 Unknown Unknown Unknown
libpthread-2.17.s 00002B4F0755B573 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
ufs_model 00000000043733FF Unknown Unknown Unknown
libpthread-2.17.s 00002B084F260630 Unknown Unknown Unknown

bbakernoaa commented 3 years ago

All, I think this has been resolved with the latest coupled model revision that @rmontuoro completed. I'm going to keep this issue open unless @rmontuoro thinks we should close it.

bbakernoaa commented 3 years ago

@zhanglikate @rmontuoro @lipan-NOAA

P7.1 Runs

All, as we discussed I have created a single YAML file for the P7.1 runs. It can be found here:

/scratch2/NAGAPE/arl/Barry.Baker/gwp71/workflow/cases/prototype7.1.yaml

My workflow is the latest feature/p7.1 branch: commit 241bf58077

I'm also using the latest feature/p7.1-update branch of the ufs_coupled.fd: commit 021e91f34db

I have already stored the ICs and modified the atmosphere ICs to include the MERRA2 realtime data. They can be found here:

/scratch2/NCEPDEV/naqfc/Barry.Baker/ICs

Test case

Please try to run this for the exact dates in the yaml file as a test. I ran it under the debug queue 1 day and 11 hours.

The output is here:

/scratch2/NCEPDEV/naqfc/Barry.Baker/DATA/test/2016070100/gfs/fcst.275881

RUNS to be completed

Once we get computer resources we should be able to do this quickly. Currently, the ICs do not exist for dates after March 2018. I suggest that we not worry about completing these runs if they are not available. Let's just get done what we can.

So these are the RUNS that we agreed to....

YEAR  MONTH  DAY  Assigned Person
2013  01     01   @zhanglikate
2013  07     01   @zhanglikate
2016  01     01   @bbakernoaa
2016  07     01   @bbakernoaa
2018  01     01   @lipan-NOAA

@gjfrost just an FYI

zhanglikate commented 3 years ago

@bbakernoaa Just to confirm again: I think we only need to do the free runs from Jan 1 and July 1, because this is an S2S run, which requires a 35-day free run. So I don't think we need to do the Jan 15 and July 15 runs for cycling, as we discussed yesterday, right?

bbakernoaa commented 3 years ago

@zhanglikate It was my understanding that we are not cycling at all. We are doing free forecasts on the 1st and 15th of Jan and July of the specified years. @rmontuoro please correct me if I'm wrong and we will follow that.

zhanglikate commented 3 years ago

@bbakernoaa I think we only need to do the free run from the 1st at the current stage; we don't need to do it from the 15th given the urgent schedule. However, we had better set the restart file output at the 14th, since the run may not finish all 35 days. The 15th run is for others who may be interested in the future in a 15-day cycling run using the restart files output at the 14th.

bbakernoaa commented 3 years ago

@zhanglikate As we discussed, right now we cannot restart due to reproducibility issues with the WAV and OCN components. We are just going to proceed as far as the run permits.

zhanglikate commented 3 years ago

@bbakernoaa @rmontuoro @lipan-NOAA Given the limited computing resources and schedule, let's just focus for now on the free runs from the 1st of Jan and July for the 3 years. Also, there are no ICs for 2018. Can we do another year instead, such as 2017 or 2015?

zhanglikate commented 3 years ago

@bbakernoaa Yes, we will not restart. The run from the 15th is what @rmontuoro just mentioned for the future, for those who may be interested in the cycling run. I think it is better not to cause confusion here. Let's forget the 15th run at the current stage.

bbakernoaa commented 3 years ago

@zhanglikate I've modified the table

zhanglikate commented 3 years ago

@bbakernoaa @lipan-NOAA @rmontuoro Also, for the AOD output, to save space, I would suggest using a 6-hour frequency to be consistent with the model output. If you all agree, @bbakernoaa, can you update the history rc file? We will all use the same rc file as yours. Thanks.

#
# Radiation-related diagnostics
#
inst_aod.format: 'CFIO' ,
inst_aod.template: '%y4%m2%d2_%h2%n2z.nc4' ,
inst_aod.archive: '%c/Y%y4' ,
inst_aod.mode: 'instantaneous'
inst_aod.frequency: 010000,
inst_aod.duration: 010000,
inst_aod.ref_time: 000000,
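A change to 6-hourly AOD output could be sketched as below. This is only an illustration under stated assumptions: it assumes the frequency and duration values are encoded as HHMMSS (so 060000 would mean 6 hours), that duration should follow frequency, and the helper name set_aod_interval is hypothetical. The sample file contains just a few of the inst_aod lines quoted in this thread.

```shell
#!/bin/sh
# Sketch only: set the inst_aod output interval in a history rc file.
# set_aod_interval is a hypothetical helper; HHMMSS encoding is an assumption.
set_aod_interval() {
    rc_file="$1"
    hhmmss="$2"
    cp "$rc_file" "$rc_file.bak"      # keep a backup for review with diff
    # On the two matching lines, replace the first run of digits (the value).
    sed -e "/inst_aod\.frequency:/ s/[0-9][0-9]*/$hhmmss/" \
        -e "/inst_aod\.duration:/ s/[0-9][0-9]*/$hhmmss/" \
        "$rc_file.bak" > "$rc_file"
}

# Demonstration on a throwaway file.
demo_rc="$(mktemp)"
cat > "$demo_rc" <<'EOF'
  inst_aod.mode: 'instantaneous'
  inst_aod.frequency: 010000,
  inst_aod.duration: 010000,
  inst_aod.ref_time: 000000,
EOF
set_aod_interval "$demo_rc" 060000
cat "$demo_rc"
```

Only the frequency and duration lines are touched; ref_time and the other settings are left as they are.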

bbakernoaa commented 3 years ago

Kate, I think this is very small in comparison to everything else, but I can modify it.

zhanglikate commented 3 years ago

@bbakernoaa It may not occupy much space, but it may save some time in the ESMF interpolation for the output frequency. Thank you very much. I will just link to your rc file folder when creating the workflow; this helps all of us use the same settings.

bbakernoaa commented 3 years ago

Yes you should be able to just change the date in the yaml file and run. We shouldn’t have to link anything.


zhanglikate commented 3 years ago

How about the ICS path that we discussed yesterday? We need to modify that ourselves, right? Thanks.

Kate


bbakernoaa commented 3 years ago

No. You should not have to do anything. I have already processed them and pointed to them in the yaml file


zhanglikate commented 3 years ago

@bbakernoaa @rmontuoro @lipan-NOAA I just remembered something we may need to consider: the UPP currently used in the post step does not include our chemical part. My question is whether we need to run the post (UPP) step at all; I see it is enabled by default now.

bbakernoaa commented 3 years ago

Let’s not modify anything. Let’s keep what we have and go with it because this is part of the prototype configuration. Let’s not add more confusion.


zhanglikate commented 3 years ago

@bbakernoaa @lipan-NOAA Just in case they need our chemical output to be included in the P7.1 UPP grib2 output, I have updated the P7.1 UPP to include it. You can get the code from: https://github.com/zhanglikate/EMC_post/tree/ufs-aerosols . Rerunning the post (UPP) step is very cheap, so it is not a big deal whether they need it or not.

bbakernoaa commented 3 years ago

Kate, this should have been included in the workflow prior to this. Let’s not add any additional code changes at this late stage. We cannot continue to add more complexity. If this is a problem, I can do the runs to ensure all the runs are identical.


zhanglikate commented 3 years ago

@rmontuoro @bbakernoaa @lipan-NOAA I just finished my runs for 2013. However, I ran into some issues while trying to upload all of the output to HPSS. I think we need to resolve these before uploading the data:

1) There is no ocean output (ocean-related nc files) in our coupled run, and I am not sure why. Is it correct that we do not see any ocean output? As a result, none of the ocean post tasks are working; please see my log file at:

/scratch2/NCEPDEV/naqfc/Kate.Zhang/fv3gfs/comrot/CPL_C384_aero/logs/2013100100/gfs.post.ocnpost.p_000.log

2) I don't have permission for the HPSS path /NCEPDEV/emc-naqfc/5year/, so I cannot create any folder there. Also, I did not see the path "/NCEPDEV/emc-naqfc/5year/role.ufscpara/HERA/prototype7.1/" on HPSS that Raffaele mentioned before, so I cannot upload any output there.

3) Here is a command for uploading the chemical output manually, which I tested in my own HPSS path. We can use it to upload the chem output later, once the path "/NCEPDEV/emc-naqfc/5year/role.ufscpara/HERA/prototype7.1/" has been set up with the right permissions.

 htar -P -cvf /ESRL/BMC/fim/1year/lzhang/gocart.inst_aod.tar gocart.inst_aod*
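To reuse the same command once the shared HPSS path exists, the target directory and file prefix could be parameterized. The sketch below is a dry run that only prints the command; the helper name hpss_upload_cmd is hypothetical, and the htar flags are copied from the command above without further verification.

```shell
#!/bin/sh
# Sketch only: print the htar command from this thread for a given target
# directory and local file prefix. hpss_upload_cmd is a hypothetical helper;
# it does not call htar itself.
hpss_upload_cmd() {
    hpss_dir="$1"     # target HPSS directory (a trailing slash is tolerated)
    prefix="$2"       # local file prefix, e.g. gocart.inst_aod
    printf 'htar -P -cvf %s/%s.tar %s*\n' "${hpss_dir%/}" "$prefix" "$prefix"
}

# Reproduces the command tested above.
hpss_upload_cmd /ESRL/BMC/fim/1year/lzhang gocart.inst_aod
```

After checking the printed command, it can be run by hand from the directory that holds the output files.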

Thanks.