ESMCI / cmeps-cime

This is a "fork" of the cime repository that contains the development version of the NUOPC CMEPS driver and mediator.

Jedwards/med profile #51

Closed jedwards4b closed 5 years ago

jedwards4b commented 6 years ago

Adds a med profile phase to print time and memory info to the med log.

Test suite: hand testing

Test baseline:

Test namelist changes:

Test status: bit for bit

Fixes

User interface changes?:

Update gh-pages html (Y/N)?:

Code review:

rsdunlapiv commented 6 years ago

@jedwards4b is working on making sure that med_phases_profile is not skipped in the run sequence (see the last line of this buildnml output):

Calling /gpfs/u/home/dunlap/UFSCOMP.oct17/cime/src/drivers/nuopc/cime_config/buildnml
Writing nuopc_runseq will skip components ['ROF', 'GLC', 'WAV', 'ESP']
Writing nuopc_runseq, skipping             MED med_phases_prep_wav
Writing nuopc_runseq, skipping             MED med_connectors_prep_med2wav
Writing nuopc_runseq, skipping             MED -> WAV :remapMethod=redist
Writing nuopc_runseq, skipping             MED med_phases_prep_rof
Writing nuopc_runseq, skipping             MED med_connectors_prep_med2rof
Writing nuopc_runseq, skipping             MED -> ROF :remapMethod=redist
Writing nuopc_runseq, skipping             ROF
Writing nuopc_runseq, skipping      WAV
Writing nuopc_runseq, skipping             ROF -> MED :remapMethod=redist
Writing nuopc_runseq, skipping             MED med_connectors_post_rof2med
Writing nuopc_runseq, skipping             MED med_phases_prep_glc
Writing nuopc_runseq, skipping             MED med_connectors_prep_med2glc
Writing nuopc_runseq, skipping             MED -> GLC :remapMethod=redist
Writing nuopc_runseq, skipping      GLC
Writing nuopc_runseq, skipping             WAV -> MED :remapMethod=redist
Writing nuopc_runseq, skipping             MED med_connectors_post_wav2med
Writing nuopc_runseq, skipping             GLC -> MED :remapMethod=redist
Writing nuopc_runseq, skipping             MED med_connectors_post_glc2med
Writing nuopc_runseq, skipping      MED med_phases_profile

jedwards4b commented 6 years ago

@mvertens you should be aware of this issue: because the word "profile" contains "rof", med_phases_profile was being deleted from the runseq whenever ROF was skipped. I changed the logic to look for _rof or 2rof instead of just rof.
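To illustrate the pitfall described in this comment, here is a minimal Python sketch. It is not the actual buildnml code: the function names `naive_skip` and `token_skip` are hypothetical, and the whole-word match for bare component entries such as `ROF` or `MED -> ROF` is an assumption added so the example is self-contained.

```python
import re

# Components reported as skipped in the buildnml output above.
SKIP = ["ROF", "GLC", "WAV", "ESP"]


def naive_skip(line, skip_comps):
    """Buggy check: a plain substring match on the component name.

    'rof' is a substring of 'profile', so med_phases_profile is dropped.
    """
    lowered = line.lower()
    return any(comp.lower() in lowered for comp in skip_comps)


def token_skip(line, skip_comps):
    """Stricter check, in the spirit of the fix described in the comment.

    Matches '_rof' or '2rof' (as the comment says), plus the component
    name as a whole word (an assumption, to catch lines such as 'ROF'
    or 'MED -> ROF :remapMethod=redist').
    """
    lowered = line.lower()
    for comp in skip_comps:
        c = re.escape(comp.lower())
        if re.search(rf"_{c}|2{c}|\b{c}\b", lowered):
            return True
    return False


if __name__ == "__main__":
    for entry in ("MED med_phases_profile",
                  "MED med_phases_prep_rof",
                  "MED med_connectors_post_rof2med",
                  "ROF -> MED :remapMethod=redist"):
        print(f"{entry!r}: naive={naive_skip(entry, SKIP)} "
              f"token={token_skip(entry, SKIP)}")
```

With the naive check, `MED med_phases_profile` is treated as a ROF entry and skipped. The stricter check keeps it while still skipping the real ROF phases and connectors.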

rsdunlapiv commented 5 years ago

Failures to look into:

dunlap@cheyenne2:/glade/scratch/dunlap> ./cs.status.20181107_152500_li9sod | grep "FAIL " | grep -v NLCOMP
    FAIL ERR_Vmct_Ld5.f19_g16.BMOM.cheyenne_intel.allactive-nuopc_cap_io COMPARE_base_rest
    FAIL ERR_Vnuopc_Ld5.f19_g16.BMOM.cheyenne_intel.allactive-nuopc_cap_io RUN time=30
    FAIL ERS_Vmct_Ld5.T62_g16.CMOM.cheyenne_intel.mom-nuopc_cap COMPARE_base_rest
    FAIL ERS_Vmct_Ld5.T62_g16.GMOM.cheyenne_intel.mom-nuopc_cap COMPARE_base_rest
    FAIL ERS_Vmct_Ln9.f19_g17_rx1.A.cheyenne_intel BASELINE oct17: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/oct17/ERS_Vmct_Ln9.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL ERS_Vnuopc_Ld5.T62_g16.GMOM.cheyenne_intel.mom-nuopc_cap BASELINE exception
    FAIL ERS_Vnuopc_Ln5.f19_g16.F2000Nuopc.cheyenne_intel.cam-nuopc_cap RUN time=26
    FAIL ERS_Vnuopc_Ln5.f45_f45_mg37.I2000Clm50SpNuopc.cheyenne_intel.clm-nuopc_cap RUN time=28
    FAIL ERS_Vnuopc_Ln9_N3.f19_g17_rx1.A.cheyenne_intel BASELINE oct17: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/oct17/ERS_Vnuopc_Ln9_N3.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL SMS_Vmct_Ld1.f19_g17_rx1.A.cheyenne_intel BASELINE oct17: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/oct17/SMS_Vmct_Ld1.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL SMS_Vmct_Ld5.T62_g37.DTEST.cheyenne_intel.cice-nuopc_cap MEMCOMP Error: Memory usage increase > 10% from baseline
    FAIL SMS_Vnuopc.f19_g16.X.cheyenne_intel MEMCOMP exception
    FAIL SMS_Vnuopc_Ld1_N3.f19_g17_rx1.A.cheyenne_intel BASELINE oct17: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/oct17/SMS_Vnuopc_Ld1_N3.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL SMS_Vnuopc_Ld5.f19_g16.BMOM.cheyenne_intel.allactive-nuopc_cap_io RUN time=29
    FAIL SMS_Vnuopc_Ld5.T62_g16.CMOM.cheyenne_intel.mom-nuopc_cap MEMCOMP exception
    FAIL SMS_Vnuopc_Ld5.T62_g16.GMOM.cheyenne_intel.mom-nuopc_cap MEMCOMP exception
    FAIL SMS_Vnuopc_Ld5.T62_g37.DTEST.cheyenne_intel.cice-nuopc_cap MEMCOMP exception
    FAIL SMS_Vnuopc_Ln5.f19_g16.F2000Nuopc.cheyenne_intel.cam-nuopc_cap RUN time=26
    FAIL SMS_Vnuopc_Ln5.f45_f45_mg37.I2000Clm50SpNuopc.cheyenne_intel.clm-nuopc_cap RUN time=28

In particular, from SMS_Vnuopc_Ln5.f19_g16.F2000Nuopc.cheyenne_intel.cam-nuopc_cap.C.20181107_152500_li9sod:

20181107 161447.410 ERROR            PET00 src/addon/NUOPC/src/NUOPC_Comp.F90:524   Invalid argument  - Attribute not present
20181107 161447.411 ERROR            PET00 lnd_comp_nuopc.F90:492   Invalid argument  - Passing error in return code
20181107 161447.411 ERROR            PET00 ESM0001:src/addon/NUOPC/src/NUOPC_Driver.F90:1765   Invalid argument  - Phase 'IPDv01p3' Initialize for modelComp 4: LND did not return ESMF_SUCCESS
20181107 161447.411 ERROR            PET00 ESM0001:src/addon/NUOPC/src/NUOPC_Driver.F90:1255   Invalid argument  - Passing error in return code
20181107 161447.411 ERROR            PET00 ensemble:src/addon/NUOPC/src/NUOPC_Driver.F90:1765   Invalid argument  - Phase 'IPDv02p3' Initialize for modelComp 1: ESM0001 did not return ESMF_SUCCESS
20181107 161447.411 ERROR            PET00 ensemble:src/addon/NUOPC/src/NUOPC_Driver.F90:1259   Invalid argument  - Passing error in return code
20181107 161447.411 ERROR            PET00 ensemble:src/addon/NUOPC/src/NUOPC_Driver.F90:333   Invalid argument  - Passing error in return code
20181107 161447.411 ERROR            PET00 esmApp.F90:87   Invalid argument  - Passing error in return code

@jedwards4b can you take a look at the error above?

rsdunlapiv commented 5 years ago

Tests running:

dunlap@cheyenne2:~/UFSCOMP.nov13/cime/scripts> qcmd -l walltime=04:00:00 -- ./create_test --xml-machine cheyenne --xml-testlist testlist_cmeps.xml --baseline-root /glade/scratch/dunlap/BASELINES --compare nov13

rsdunlapiv commented 5 years ago

Tests pass except for some new baselines that need to be generated.

dunlap@cheyenne2:/glade/scratch/dunlap> ./cs.status.20181113_155149_rjf8yf | grep "FAIL " | grep -v NLCOMP
    FAIL ERR_Vmct_Ld5.f19_g16.BMOM.cheyenne_intel.allactive-nuopc_cap_io COMPARE_base_rest
    FAIL ERS_Vmct_Ld5.T62_g16.CMOM.cheyenne_intel.mom-nuopc_cap COMPARE_base_rest
    FAIL ERS_Vmct_Ld5.T62_g16.GMOM.cheyenne_intel.mom-nuopc_cap COMPARE_base_rest
    FAIL ERS_Vmct_Ln9.f19_g17_rx1.A.cheyenne_intel BASELINE nov13: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/nov13/ERS_Vmct_Ln9.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL ERS_Vnuopc_Ln9_N3.f19_g17_rx1.A.cheyenne_intel BASELINE nov13: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/nov13/ERS_Vnuopc_Ln9_N3.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL SMS_Vmct_Ld1.f19_g17_rx1.A.cheyenne_intel BASELINE nov13: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/nov13/SMS_Vmct_Ld1.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL SMS_Vmct_Ld5.T62_g16.CMOM.cheyenne_intel.mom-nuopc_cap MEMCOMP Error: Memory usage increase > 10% from baseline
    FAIL SMS_Vnuopc.f19_g17.X.cheyenne_intel MEMCOMP exception
    FAIL SMS_Vnuopc_Ld1_N3.f19_g17_rx1.A.cheyenne_intel BASELINE nov13: ERROR BFAIL baseline directory '/glade/scratch/dunlap/BASELINES/nov13/SMS_Vnuopc_Ld1_N3.f19_g17_rx1.A.cheyenne_intel' does not exist
    FAIL SMS_Vnuopc_Ld5.f19_g16.BMOM.cheyenne_intel.allactive-nuopc_cap_io MEMCOMP exception
    FAIL SMS_Vnuopc_Ld5.T62_g16.CMOM.cheyenne_intel.mom-nuopc_cap MEMCOMP exception
    FAIL SMS_Vnuopc_Ld5.T62_g16.GMOM.cheyenne_intel.mom-nuopc_cap MEMCOMP exception
    FAIL SMS_Vnuopc_Ld5.T62_g37.DTEST.cheyenne_intel.cice-nuopc_cap MEMCOMP exception
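When triaging a listing like the one above, it can help to group the failing tests by phase (BASELINE, MEMCOMP, COMPARE_base_rest, RUN) so that missing baselines are easy to separate from other failures. The following is a small Python sketch, not part of CIME. The function name `group_failures` is hypothetical, and it assumes each line has the form `FAIL <test name> <phase> [details]`, as in the pasted output. It ignores lines by phase name rather than by substring, so it is slightly stricter than `grep -v NLCOMP`.

```python
from collections import defaultdict


def group_failures(status_text, ignore_phases=("NLCOMP",)):
    """Group FAIL lines from cs.status-style output by phase name.

    Assumes whitespace-separated fields: FAIL <test name> <phase> [details].
    Lines that do not start with FAIL, or whose phase is listed in
    ignore_phases, are skipped.
    """
    groups = defaultdict(list)
    for raw in status_text.splitlines():
        parts = raw.split()
        if len(parts) < 3 or parts[0] != "FAIL":
            continue
        test_name, phase = parts[1], parts[2]
        if phase in ignore_phases:
            continue
        groups[phase].append(test_name)
    return dict(groups)


if __name__ == "__main__":
    import sys

    # Usage: ./cs.status.<id> | python group_failures.py
    for phase, tests in sorted(group_failures(sys.stdin.read()).items()):
        print(f"{phase}: {len(tests)} failing")
        for name in tests:
            print(f"    {name}")
```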