E3SM-Project / scream

Fork of E3SM used to develop exascale global atmosphere model written in C++
https://e3sm-project.github.io/scream/

Incompatible domain grid coordinate with bi-grid ne1024pg2 #1140

Open wlin7 opened 3 years ago

wlin7 commented 3 years ago

A run using grid ne1024pg2_oRRS19to6v3 (available in branch wlin/atm/ne1024pg2_oRRS18to6v3, PR #1131) failed with:

 ERROR: (seq_domain_check_grid)  incompatible domain grid coordinates
12273: seq_domain_check_grid - n:482 d1:  231.582371 d2:  231.582371 diff:  0.00000000000142 eps:  0.00000000000100

The grids seen by the atm and lnd domains differ in lat/lon starting at the 12th decimal place. Below is the backtrace from the point where the run aborted.

shr_abort_mod_mp_         114  shr_abort_mod.F90
seq_domain_mct_mp         693  seq_domain_mct.F90
seq_domain_mct_mp         378  seq_domain_mct.F90
cime_comp_mod_mp_        1989  cime_comp_mod.F90

The differences are flagged by the two calls in the following if-block:

if (atm_present .and. lnd_present .and. samegrid_al) then
       if (iamroot) write(logunit,F00) ' --- checking atm/land domains ---'
       call seq_domain_check_grid(atmdom_a%data, lnddom_a%data, 'lat' , eps=eps_axgrid, mpicom=mpicom_cplid, mask=maskl)
       call seq_domain_check_grid(atmdom_a%data, lnddom_a%data, 'lon' , eps=eps_axgrid, mpicom=mpicom_cplid, mask=maskl)
endif
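To make the failure mode concrete, here is a minimal Python sketch of a masked, absolute-tolerance comparison like the one the log output suggests. It is not the actual `seq_domain_check_grid` implementation: the function name `check_grid`, the list-based inputs, and the sample coordinates are assumptions for illustration. Only the 231.582371 value, the 1.42e-12 difference, and the 1.0e-12 tolerance come from the log above.

```python
def check_grid(d1, d2, eps, mask=None):
    """Return (index, v1, v2, diff) for each unmasked cell where |v1 - v2| > eps.

    Hypothetical stand-in for the coupler's domain check; assumes an
    absolute (not relative) tolerance and that mask == 0 means "skip".
    """
    failures = []
    for n, (v1, v2) in enumerate(zip(d1, d2)):
        if mask is not None and mask[n] == 0:
            continue  # cell is masked out, so it is not compared
        diff = abs(v1 - v2)
        if diff > eps:
            failures.append((n, v1, v2, diff))
    return failures


# Illustrative coordinates: the middle cell differs by about 1.42e-12.
atm_lon = [0.0, 231.582371, 300.0]
lnd_lon = [0.0, 231.582371 + 1.42e-12, 300.0]

for n, v1, v2, diff in check_grid(atm_lon, lnd_lon, eps=1.0e-12):
    print(f"n:{n} d1:{v1:12.6f} d2:{v2:12.6f} diff:{diff:.14f}")
```

Printed to six decimal places the two values look identical, which is why the log shows the same d1 and d2 even though the check fails.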

This is likely because different versions of the grid and/or map files were used at runtime than were used to generate certain input files. It would help to know which files supply the domain/grid data for atmdom_a%data and lnddom_a%data above.

jonbob commented 3 years ago

@wlin7 - you could try changing the tolerance in env_run, since it is so close to passing. I think that setting is:

    <entry id="EPS_AGRID" value="1.0e-12">
      <type>real</type>
      <desc>Error tolerance for differences in atm/land lat/lon in domain checking</desc>
    </entry>

and it looks like you just need to bump it up slightly

wlin7 commented 3 years ago

Thanks, @jonbob. Relaxing the tolerance is useful to see how far this first test with the grid can go. The diff shown above is just one example cell; the maximum difference can be much larger. With EPS_AGRID=1e-12, the check caught differences as large as 9e-12. With EPS_AGRID increased to 2e-11, it caught differences as large as 8e-11. After tentatively bumping it to 1e-7, no larger difference is detected and the model can run. I will run further tests to scale it back once the run with 1e-7 completes.
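Rather than raising the tolerance step by step, one could compute the largest masked difference once and derive a tolerance from it. The following is a rough Python sketch under that assumption. The helper names `max_abs_diff` and `suggest_eps` are hypothetical, and the inputs are plain lists standing in for the lat/lon arrays of the two domains.

```python
import math


def max_abs_diff(d1, d2, mask=None):
    """Largest |d1[n] - d2[n]| over unmasked cells (0.0 if none are compared)."""
    largest = 0.0
    for n, (v1, v2) in enumerate(zip(d1, d2)):
        if mask is not None and mask[n] == 0:
            continue  # assumed convention: mask == 0 means "skip this cell"
        largest = max(largest, abs(v1 - v2))
    return largest


def suggest_eps(max_diff):
    """Smallest power of ten that is >= max_diff, assuming the check
    fails only when diff > eps."""
    if max_diff <= 0.0:
        raise ValueError("max_diff must be positive")
    exponent = math.ceil(math.log10(max_diff))
    return float(f"1e{exponent}")


# Using the largest differences quoted in this comment:
print(suggest_eps(9e-12))
print(suggest_eps(8e-11))
```

This only picks a value that lets the check pass; it does not address why the two domains differ in the first place.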

brhillman commented 3 years ago

I've run into this before. I think the cause was a different algorithm being used in one of the mapping files that was used to generate the domain files. I believe the atmosphere-to-ocean or ocean-to-atmosphere flux maps are the relevant ones here, so you might double-check that the versions you're specifying in config_grids.xml are identical to those used when generating the domain files.

wlin7 commented 3 years ago

It can run with EPS_AGRID=1e-10.

AaronDonahue commented 1 year ago

@wlin7 , has this been addressed?

AaronDonahue commented 4 weeks ago

@brhillman , do you know if this has been addressed?