Closed yantosca closed 5 years ago
Thanks for this deep diagnosis.
If the core issue is
It seems that the issue is happening somewhere in MAPL, as reading in the Olson data uses a new feature of MAPL to return the fraction of the grid box that is covered by land type N, where N is an integer.
Some dirty fixes to correct this:
MAPL_HorzTransOrderFraction
This would be a last resort. It is indeed best to get gfortran working properly.
I am going to do a benchmark run with gfortran 8.2 on Odyssey in January. There may be other problems beyond this one that we don't know about, since all GCHP benchmarks have been with ifort.
I poked around in MAPL_RegridConservative Run a bit. There is this code:
    FF = 0.0
    do N = 1, size(TT%OUT%II)
       II = TT%IN%II(N)
       JI = TT%IN%JJ(N)
       !@ if (II<1 .or. II>size(INPUT,1) .or.JI<1 .or. JI>size(INPUT,2)) then
       !@    print *,'we have a problem'
       !@ endif
       VALI = INPUT(II,JI)
       if(VALI/=MAPL_UNDEF) then
          IO = TT%OUT%II(N)
          JO = TT%OUT%JJ(N)
          W  = TT%OUT%W(N)
          if (uSAMPLE) then
             if (W > FF(IO,JO)) then
                OUTPUT(IO,JO) = VALI
                FF (IO,JO) = W
             end if
          ! %%% THIS IS WHERE THE FRACTION OF LANDTYPE PER GRIDBOX IS COMPUTED
          ! %%% GETFRAC = the desired land type value (0..72)
          ! %%% VALI    = the land type read in from the file
          ! %%% OUTPUT  = the fraction of grid box with land type GETFRAC
          ! %%% FF      = contains the sum of the mapping weights on the output grid
          else if (doFrac) then
             if (VALI .gt. GetFrac_-eps .and. VALI .lt. GetFrac_+eps) then
                OUTPUT(IO,JO) = OUTPUT(IO,JO)+W
             end if
             FF(IO,JO)=FF(IO,JO)+W
          else
             OUTPUT(IO,JO) = OUTPUT(IO,JO) + W * VALI
             FF (IO,JO) = FF (IO,JO) + W
          end if
       endif
    end do
    if(.not. uSAMPLE) then
       where(FF /= 0.0)
          OUTPUT = OUTPUT / FF
       elsewhere
          OUTPUT = MAPL_Undef
       end where
    end if
But this code seems to be working OK. I put in some debug print statements, and as far as I can tell the output array is as expected.
The only thing I would do differently would be to replace the
WHERE( FF /= 0.0 )
with
WHERE( FF > 0.0 .or. FF < 0.0 )
since sometimes an equality test for zero will fail due to roundoff. But the values of FF always seem to be close to 1, so I don't think it's an issue. The error may be upstream from there.
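For reference, here is a minimal sketch of how the mask forms behave on a few sample weight sums. It assumes IEEE arithmetic (no -ffast-math style flags), and tol is a hypothetical tolerance, not a MAPL parameter. Under those assumptions the first two masks select the same elements for every ordinary value and can differ only for NaN. A tolerance test is the form that actually treats near-zero sums differently.

```fortran
program where_mask_demo
  use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan
  implicit none

  real, parameter :: tol = 1.0e-6   ! hypothetical tolerance, not from MAPL
  real    :: ff(4)
  logical :: mask_ne(4), mask_gtlt(4), mask_tol(4)

  ! Sample weight sums: exact zero, roundoff-sized value, typical value, NaN
  ff(1) = 0.0
  ff(2) = 1.0e-12
  ff(3) = 1.0
  ff(4) = ieee_value(1.0, ieee_quiet_nan)

  mask_ne   = (ff /= 0.0)                 ! form used in the MAPL snippet
  mask_gtlt = (ff > 0.0 .or. ff < 0.0)    ! proposed replacement
  mask_tol  = (abs(ff) > tol)             ! tolerance-based alternative

  print *, 'FF /= 0.0                :', mask_ne
  print *, 'FF > 0.0 .or. FF < 0.0   :', mask_gtlt
  print *, 'abs(FF) > tol            :', mask_tol
end program where_mask_demo
```

Since FF was observed to stay close to 1, none of these forms should change the result for the Olson data.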
Maybe in January we can forward this to the MAPL dev team.
I am going to do a benchmark run with gfortran 8.2 on Odyssey in January.
@lizziel Consider checking the 4 cases: 1) ifort standard 2) gfortran standard 3) ifort with DryDep turned off (to remove landmap effect) 4) gfortran with DryDep turned off
Look at surface ozone. (3) and (4) should be very close if there are no extra bugs besides this landmap one.
Reading land type binary masks instead of the land map should bypass this issue. I generated a new file and confirmed that the two methods give identical results when using ifort. Tests with gfortran are in progress.
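The equivalence of the two methods can be sketched on a toy grid. This is an illustration only: it assumes equal-area cells so every weight is identical (real MAPL mapping weights vary), and target_type, eps, and the 4x4 land map are made-up values.

```fortran
program landtype_fraction_demo
  implicit none

  integer, parameter :: nf = 4, nc = 2      ! fine and coarse grid sizes (toy values)
  integer, parameter :: ratio = nf / nc     ! fine cells per coarse cell, per dimension
  integer, parameter :: target_type = 2     ! hypothetical land type to extract
  real,    parameter :: eps = 0.1           ! hypothetical matching tolerance

  real    :: olson(nf,nf)        ! land type index stored as a real
  real    :: mask(nf,nf)         ! 1.0 where olson matches target_type, else 0.0
  real    :: frac_direct(nc,nc)  ! fraction from summing weights of matching cells
  real    :: frac_mask(nc,nc)    ! fraction from averaging the binary mask
  real    :: wsum(nc,nc)         ! sum of weights per coarse cell
  real    :: w
  integer :: i, j, io, jo

  olson = reshape([ 1., 2., 2., 2., &
                    1., 2., 3., 2., &
                    1., 1., 3., 3., &
                    2., 1., 3., 3. ], [nf, nf])

  ! Method 2 input: a binary mask for the chosen land type
  mask = merge(1.0, 0.0, abs(olson - real(target_type)) < eps)

  frac_direct = 0.0
  frac_mask   = 0.0
  wsum        = 0.0
  w = 1.0 / real(ratio**2)       ! equal weight for every fine cell

  do j = 1, nf
     do i = 1, nf
        io = (i - 1) / ratio + 1
        jo = (j - 1) / ratio + 1
        ! Method 1: add the weight only when the land type matches
        if (abs(olson(i,j) - real(target_type)) < eps) then
           frac_direct(io,jo) = frac_direct(io,jo) + w
        end if
        ! Method 2: ordinary weighted average of the mask
        frac_mask(io,jo) = frac_mask(io,jo) + w * mask(i,j)
        wsum(io,jo) = wsum(io,jo) + w
     end do
  end do

  where (wsum > 0.0)
     frac_direct = frac_direct / wsum
     frac_mask   = frac_mask / wsum
  end where

  print '(a,4f6.2)', 'direct: ', frac_direct
  print '(a,4f6.2)', 'mask:   ', frac_mask
  if (any(abs(frac_direct - frac_mask) > 1.0e-6)) error stop 'methods disagree'
end program landtype_fraction_demo
```

Because the mask route only needs the plain weighted-average branch of the regridder, it avoids the fraction-specific code path entirely.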
Using an alternative regridding method does not correct this issue. Further testing is in progress.
I have created a better error trap (one that only requires modifications to GCHP/Chem_GridCompMod.F90). We now get this error message if the Olson values are all zeroes:
### Reading OLSON01 from imports
... etc ...
### Reading OLSON72 from imports
State_Met%LandTypeFrac contains all zeroes! This error is a known issue in MAPL
when using gfortran. This should not happen if you compiled with ifort.
GIGCchem::Run_ 1857
GIGCchem::Run2 1277
GCHP::Run 420
MAPL_Cap 792
Investigation of MAPL CFIO shows that the reading and regridding of the Olson land map file is correct for all regridding methods. The issue thus occurs somewhere between reading/regridding and retrieving the pointer to the import.
Until this issue is fixed, we recommend not using gfortran with GCHP. The documentation has been updated accordingly, and there is now also a forced stop if the error occurs (see Bob's comment above).
This issue will not be fixed in GCHP 12.2.0. I will post here again when a fix is found and slated for an upcoming version.
GMAO tested our Olson land map file with their current integration branch of GEOS using gfortran. They did not encounter the problem we have using the old MAPL in GCHP. Updating GCHP to use a new version of MAPL, even newer than the one GMAO tested with, is in progress. I am currently working under the assumption that it will resolve this bug, but it is too soon to verify this.
@lizziel Thanks for the update! That's great to know and seems like new MAPL will solve a lot of crucial problems (gfortran bug, inefficient I/O).
I have confirmed that implementation of the new MAPL version in GCHP resolves this issue. The land mask imports are no longer all zero. It is still uncertain which version of GCHP the new MAPL will go into but I will announce it here once it is decided.
I have confirmed that implementation of the new MAPL version in GCHP resolves this issue.
This is wonderful news! We should finally switch to the GNU stack for benchmarks to see if there are still undiscovered issues like this.
I am closing this issue since it is fixed by the MAPL update going into 12.5.
Long story short, the Olson land map data seems to be coming in as all zeroes from the Import State for GCHP simulations that use gfortran. This issue is most certainly the root cause of previously-mentioned issues #13 and #14.
My GC-classic and GCHP code were at the following commits:
Note that I modified the code to put in an error trap that will exit if all elements of State_Met%LandTypeFrac are zero (or more precisely, when the variable maxFracInd is zero).
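A trap of this kind might look roughly like the sketch below. It is not the code committed to GCHP: the real trap keys on maxFracInd, whereas this stand-in uses a simple maxval test, and the array sizes, labels, and the check_land_frac routine are hypothetical. It prints instead of stopping so both cases can be seen in one run.

```fortran
program landfrac_trap_demo
  implicit none

  integer, parameter :: nx = 2, ny = 2, ntypes = 3   ! toy dimensions
  real :: good(nx,ny,ntypes), bad(nx,ny,ntypes)

  ! A healthy import: fractions sum to 1 in every grid box
  good = 0.0
  good(:,:,1) = 0.75
  good(:,:,3) = 0.25

  ! The failure mode described in this issue: everything is zero
  bad = 0.0

  call check_land_frac(good, 'good')
  call check_land_frac(bad,  'bad')

contains

  subroutine check_land_frac(frac, label)
    real,             intent(in) :: frac(:,:,:)   ! (I, J, land type)
    character(len=*), intent(in) :: label
    integer :: i, j

    ! Debug print: sum over land types for each grid box
    do j = 1, size(frac, 2)
       do i = 1, size(frac, 1)
          print '(a,1x,a,2i4,f8.3)', '%%%LTF', label, i, j, sum(frac(i,j,:))
       end do
    end do

    ! Stand-in for the trap: no land type has a positive fraction anywhere
    if (maxval(frac) <= 0.0) then
       print *, label, ': land type fractions are all zero, would stop here'
    else
       print *, label, ': land type fractions look populated'
    end if
  end subroutine check_land_frac

end program landfrac_trap_demo
```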
I have narrowed down the issue to this code section of GCHP/Chem_GridCompMod.F90, where the Olson land map data is obtained from the Import State and copied into State_Met%LandTypeFrac.
Also not shown above are some debug print statements.
I compiled and ran a C24 GCHP Rn-Pb-Be simulation with gfortran 8.2, using 6 cores of Odyssey. The modules were:
And I got the following output in the gchp.log file:
The error message is from the new error trap that I committed to the bugfix/GCHP_issues branches of the gchp and geos-chem repos on Github. The numbers in each line beginning with "%%%LTF" indicate the core number, I & J value, and the sum of State_Met%LandTypeFrac for that I,J and Olson type. As you can see all of the Olson values are coming into State_Met%LandTypeFrac as zeroes.
It seems that the issue is happening somewhere in MAPL, as reading in the Olson data uses a new feature of MAPL to return the fraction of the grid box that is covered by land type N, where N is an integer.
In MAPL_ExtDataGridCompMod.F90, there is this snippet, which calls down to MAPL_CFIORead:
The relevant calls are the ones where trans == MAPL_HorzTransOrderFraction. In MAPL_CFIO, there are further calls to MAPL_HorzTransformRun, which is where I suspect the error may be happening. MAPL_HorzTransformRun is an overloaded interface for several other module procedures
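For readers less familiar with overloaded interfaces in Fortran, the sketch below shows the general mechanism. All names here (transform_demo, transform_run, transform_run_2d, transform_run_3d) are hypothetical and do not correspond to MAPL's actual specific procedures. The compiler picks one specific procedure at compile time from the argument types and ranks.

```fortran
module transform_demo
  implicit none
  private
  public :: transform_run

  ! Generic name that resolves to one specific procedure by argument rank
  interface transform_run
     module procedure transform_run_2d
     module procedure transform_run_3d
  end interface transform_run

contains

  subroutine transform_run_2d(qin, qout)
    real, intent(in)  :: qin(:,:)
    real, intent(out) :: qout(:,:)
    qout = qin     ! placeholder for the real regridding work
    print *, 'dispatched to the 2-D version'
  end subroutine transform_run_2d

  subroutine transform_run_3d(qin, qout)
    real, intent(in)  :: qin(:,:,:)
    real, intent(out) :: qout(:,:,:)
    qout = qin     ! placeholder for the real regridding work
    print *, 'dispatched to the 3-D version'
  end subroutine transform_run_3d

end module transform_demo

program interface_demo
  use transform_demo, only: transform_run
  implicit none
  real :: a2(2,2), b2(2,2), a3(2,2,2), b3(2,2,2)

  a2 = 1.0
  a3 = 2.0
  call transform_run(a2, b2)   ! resolves to transform_run_2d
  call transform_run(a3, b3)   ! resolves to transform_run_3d
end program interface_demo
```

When debugging through a generic interface like this, the first step is to work out which specific procedure a given call resolves to, since a debug print placed in the wrong one will never fire.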
Unfortunately, at this point my knowledge of the innards of MAPL is not very comprehensive. If anyone has any other suggestions to try, then please let me know. My guess is that deep into MAPL there is some code that gfortran isn't parsing properly, or for which an unexpected side-effect is occurring.
NOTE: This could potentially be caused by the ESMF version which is 5.2. ESMF 5.2 pre-dates the newest versions of gfortran, so there could conceivably be some incompatibility. But who knows.
THE UPSHOT: Until we find & fix this issue, we should not use gfortran for GCHP simulations. While GCHP can run on the AWS cloud in tutorial mode, the error is still present and you will get erroneous output.
I verified that GCHP compiled and run with ifort 17 correctly reads the Olson land map values from the Import State into State_Met%LandTypeFrac. So this issue only happens with GNU Fortran.
Also, I will mark #13 and #14 as closed, as this issue is the root cause.