GEOS-ESM / GEOSgcm

GEOS Earth System Model GEOSgcm Fixture

fvcore_layout.rc >> input.nml #34

Open wmputman opened 5 years ago

wmputman commented 5 years ago

In the gcm_run.j and the gcm_forecast.tmpl there are occasions when the fvcore_layout.rc does not properly cat to input.nml, leaving input.nml as an empty file. The most prevalent symptom at the moment is that the FMS stack size does not get set properly and the model fails due to exceeding stack limits.
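
The operation in question is just the plain append done in the run scripts, i.e. something like (sketch of the step the title refers to; the surrounding script lines are not reproduced here):

    # in gcm_run.j / gcm_forecast.tmpl: append the FV3 layout settings
    # to the FMS namelist file
    cat fvcore_layout.rc >> input.nml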

mathomp4 commented 5 years ago

@wmputman I cannot see how that could fail to work. I mean, that's basic Linux 101 there.

I could easily belt-and-suspender it with file existence checks, using /bin/cat, checking status, etc., which I suppose we should do everywhere, but it's cat, which is pretty boring and doesn't do much.
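
Something like the following is what I have in mind, assuming the csh syntax used in gcm_run.j (sketch only; the messages and exit handling here are made up):

    # hypothetical belt-and-suspenders version of the append
    if (! -e fvcore_layout.rc) then
       echo "ERROR: fvcore_layout.rc not found"
       exit 1
    endif
    /bin/cat fvcore_layout.rc >> input.nml
    if ($status != 0) then
       echo "ERROR: failed to append fvcore_layout.rc to input.nml"
       exit 1
    endif
    if (-z input.nml) then
       echo "ERROR: input.nml is empty after the append"
       exit 1
    endif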

Is this only happening on, say, SLES10? That system does have an older cat (v. 8.12) versus that on SLES12 (v. 8.25). I can't imagine cat having bugs, but I can check the GNU coreutils changelogs and see.

mathomp4 commented 5 years ago

I looked at the changelogs and the only mention of cat was in the entry for 8.17:

cp,mv,install,cat,split: now read and write a minimum of 64KiB at a time. This was previously 32KiB and increasing to 64KiB was seen to increase throughput by about 10% when reading cached files on 64 bit GNU/Linux.

That's not really a bug fix, though.

wmputman commented 5 years ago

This has happened on SLES11 (once) and SLES12 (frequently) over the last month.

mathomp4 commented 5 years ago

The only other thing I can see is perhaps in FV3. For example, fv_control.F90 has this bit:

   f_unit=open_namelist_file()
   rewind (f_unit)
! Read Main namelist
   read (f_unit,fv_grid_nml,iostat=ios)
   ierr = check_nml_error(ios,'fv_grid_nml')
   call close_file(f_unit)

gfdl_cloud_microphys.F90 has:

        nlunit=open_namelist_file()
        rewind (nlunit)
     ! Read Main namelist
        read (nlunit,gfdl_cloud_microphysics_nml,iostat=ios)
        ierr = check_nml_error(ios,'gfdl_cloud_microphysics_nml')
        call close_file(nlunit)

Fairly similar. Open, rewind, read, check, close. However, later on in fv_control.F90 there is this:

      if (size(Atm) == 1) then
         f_unit = open_namelist_file()
      else if (n == 1) then
         f_unit = open_namelist_file('input.nml')
      else
         write(nested_grid_filename,'(A10, I2.2, A4)') 'input_nest', n, '.nml'
         f_unit = open_namelist_file(nested_grid_filename)
      endif

   ! Read FVCORE namelist
      read (f_unit,fv_core_nml,iostat=ios)
      ierr = check_nml_error(ios,'fv_core_nml')

   ! Read Test_Case namelist
      rewind (f_unit)
      read (f_unit,test_case_nml,iostat=ios)
      ierr = check_nml_error(ios,'test_case_nml')
      call close_file(f_unit)

This is about the only place I could find where there is an open_namelist_file() without a rewind() right after it. But the check_nml_error() call should catch any iostat issues.

Now, FMS itself seems to read namelist files a bit differently than FV3. From fms.F90:

    if (file_exist('input.nml')) then
       unit = open_namelist_file ( )
       ierr=1; do while (ierr /= 0)
          read  (unit, nml=fms_nml, iostat=io, end=10)
          ierr = check_nml_error(io,'fms_nml')  ! also initializes nml error codes
       enddo
 10    call mpp_close (unit)
    endif

but it's equivalent, I think (close_file() is essentially a wrapper around mpp_close()).

mathomp4 commented 5 years ago

I am trying one thing now. Per Rusty in the source:

!-----------------------------------------------------------------------
! subroutine READ_INPUT_NML
!
!
! Reads an existing input.nml into a character array and broadcasts
! it to the non-root mpi-tasks. This allows the use of reads from an
! internal file for namelist settings (requires 2003 compliant compiler)
!
! read(input_nml_file, nml=<name_nml>, iostat=status)

This seems to be the codepath enabled by the INTERNAL_FILE_NML macro. It would save file opens and closes if it works.

I'm building with the appropriate macro set and we'll see if it can even run.

mathomp4 commented 5 years ago

Or wait, maybe not. FMS code is fun to read...

mathomp4 commented 5 years ago

Well, that wasn't too hard. I can definitely activate the INTERNAL_FILE_NML path. You have to change a few CMakeLists.txt files and edit the two microphysics files, but it seems zero-diff. Whether or not it helps is another ball of wax, but if one of @wmputman or @sdrabenh hits the error more consistently, it could be something to try.

NOTE: I didn't change the compilation of MOM5, which of course would need the same ifdef activated, but baby steps first.
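
For reference, the edit to each microphysics file is roughly the following sketch (the names follow the gfdl_cloud_microphys.F90 snippet above; the actual change may differ in detail). With the macro defined, the read comes from the broadcast character array instead of a per-rank file open:

    ! Sketch of guarding an existing namelist read with INTERNAL_FILE_NML.
    ! Requires:  use mpp_mod, only: input_nml_file
    #ifdef INTERNAL_FILE_NML
          ! root reads input.nml once; every rank reads the broadcast copy
          read (input_nml_file, nml=gfdl_cloud_microphysics_nml, iostat=ios)
          ierr = check_nml_error(ios,'gfdl_cloud_microphysics_nml')
    #else
          nlunit = open_namelist_file()
          rewind (nlunit)
          read (nlunit, gfdl_cloud_microphysics_nml, iostat=ios)
          ierr = check_nml_error(ios,'gfdl_cloud_microphysics_nml')
          call close_file(nlunit)
    #endif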

mathomp4 commented 5 years ago

@wmputman et al, I sent an email to Rusty about INTERNAL_FILE_NML:

My question is regarding namelist reading in FMS. Bill Putman seems to be having intermittent issues with it, so I took a look. I noticed your name in the INTERNAL_FILE_NML codepath.

My reading is that instead of say, all 96 processors reading the namelist every time it's processed in FMS, FV3, microphysics, etc., only root would read it and then broadcast the results to an internal file. Is that correct?

He replied with:

Your interpretation is correct. To use the internal file, you only need:

use mpp_mod,   only:  input_nml_file

read (input_nml_file, <namelist>, iostat=io)
ierr = check_nml_error (io, '<namelist>')

If you are using multiple namelist files, you can clear and re-read a new namelist as needed using the mpp_mod::read_input_nml subroutine.

As you run on many, many processors, @wmputman, it might be worth moving to INTERNAL_FILE_NML for some testing. It might cost some MPI time to broadcast the character array, but it's probably better than 1000s of processes all opening the same file.