geoschem / GCHP

The "superproject" wrapper repository for GCHP, the high-performance instance of the GEOS-Chem chemical-transport model.
https://gchp.readthedocs.io

Meteorology for GCHP advection #342

Open lizziel opened 1 year ago

lizziel commented 1 year ago

Name and Institution

Name: Lizzie Lundgren Institution: Harvard University

New GCHP feature or discussion

This issue is to discuss current work related to meteorology used in GCHP advection. There are several things that I hope to get into version 14.3.0.

  1. Validation of GCHP runs using hourly mass fluxes. All official benchmarks use 3-hourly winds instead. Hourly mass fluxes are available for GEOS-FP at C720 (limited time range) and GEOS-IT at C180 (met option to be available in 14.3.0). Mass fluxes are not available for MERRA2.
  2. Implement mass flux regridding update in MAPL. This update from @sdeastham is currently a MAPL PR pending review. The same update needs to be put into our MAPL fork which is an older version of MAPL than what the PR is based on.
  3. Document resource constraints when using mass fluxes. See https://github.com/GEOS-ESM/MAPL/issues/2118.
  4. The algorithm for computing dry pressure level edges for advection in the GCHPctmEnv gridded component needs an overhaul. We currently (1) sum moisture-corrected total pressure delta across all levels to get surface dry pressure and then (2) construct the 3D dry pressures from the surface dry pressure using Ap and Bp. This method should be compared with a direct computation of 3D dry pressure from 3D total pressure (no reconstruction from surface pressure using Ap/Bp).
  5. Add pressure diagnostics in advection. These will appear in HISTORY.rc for gridded component DYNAMICS instead of GCHPchem.
  6. Add budget transport diagnostics and/or vertical flux diagnostics per species.
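The two approaches in item 4 can be sketched as follows. This is a minimal NumPy sketch with hypothetical array names (`delp_total`, `q`, `ap`, `bp` are assumptions for illustration); the actual GCHPctmEnv code is Fortran inside a MAPL gridded component.

```python
import numpy as np

# Hypothetical inputs (level index 0 = model top):
#   delp_total : (nlev,)   total pressure thickness per level [Pa]
#   q          : (nlev,)   specific humidity [kg/kg]
#   ap, bp     : (nlev+1,) hybrid-sigma edge coefficients [Pa, unitless]
#   ptop       : model-top pressure [Pa]

def edges_via_surface_reconstruction(delp_total, q, ap, bp, ptop):
    """Current method: sum moisture-corrected pressure deltas to get
    surface dry pressure, then rebuild the 3D edges from Ap/Bp."""
    delp_dry = delp_total * (1.0 - q)   # moisture correction
    ps_dry = ptop + delp_dry.sum()      # surface dry pressure
    return ap + bp * ps_dry             # (nlev+1,) dry pressure edges

def edges_via_direct_sum(delp_total, q, ptop):
    """Proposed comparison: accumulate dry pressure deltas downward
    from the model top directly, with no Ap/Bp reconstruction."""
    delp_dry = delp_total * (1.0 - q)
    return ptop + np.concatenate(([0.0], np.cumsum(delp_dry)))
```

The two methods agree only to the extent that the moisture-corrected level thicknesses remain consistent with the Ap/Bp hybrid coordinate; comparing them directly is the point of the proposed overhaul.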

Pinging @sdeastham and @1Dandan who will help with this work.

lizziel commented 6 months ago

To clarify, we would expect numerical noise differences for different compilers, e.g. intel versus GNU. But there should not be systematic bias and diffs should be very small.
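That acceptance criterion can be illustrated with a small sketch (the function name and thresholds here are illustrative assumptions, not official GCST tolerances): the maximum relative difference captures numerical noise, while the mean difference captures systematic bias.

```python
import numpy as np

def compare_builds(field_a, field_b):
    """Compare the same diagnostic field from two builds (e.g. Intel
    vs. GNU). Returns (max_rel, mean_rel): max_rel reflects numerical
    noise and should be tiny; mean_rel reflects systematic bias and
    should be ~0."""
    scale = np.abs(field_a).max()
    diff = field_a - field_b
    return np.abs(diff).max() / scale, diff.mean() / scale

# Example: symmetric round-off noise passes; a one-sided offset would not.
a = np.ones(10)
b = a + 1e-9 * np.array([1.0, -1.0] * 5)  # alternating noise, zero mean
max_rel, mean_rel = compare_builds(a, b)
```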

lizziel commented 6 months ago

I should also note that there is a known small memory leak in GCHP that seems to be from MAPL. I created an issue a couple years ago on this at https://github.com/GEOS-ESM/MAPL/issues/1793. It is small enough that it has not been addressed yet.

lizziel commented 6 months ago

@yuanjianz, could you put your raw wind output into subdirectories OutputDir and Restarts? I need the restart files as well as the diagnostics. Thanks!

yuanjianz commented 6 months ago

@lizziel, sure. They are now there. The simulation is now into November, so I expect the raw wind run to finish today and the raw mass flux run tomorrow or Thursday.

As for the environment, it is difficult to list the full software stack because we are using Docker+Spack and did not generate module files for all dependencies.

The GNU environment with the memory leak (ubuntu20.04):

```
spack find --loaded
-- linux-ubuntu20.04-skylake_avx512 / gcc@9.4.0 -----------------
gcc@10.2.0

-- linux-ubuntu20.04-skylake_avx512 / gcc@10.2.0 ----------------
cmake@3.26.3  esmf@8.4.2  gettext@0.22.3  hdf5@1.14.3  netcdf-c@4.9.2  netcdf-fortran@4.5.3  openmpi@4.1.1
==> 8 loaded packages
---
spack find
-- linux-ubuntu20.04-skylake_avx512 / gcc@9.4.0 -----------------
autoconf@2.69                bzip2@1.0.8    gdbm@1.23       libiconv@1.17    m4@1.4.18    perl@5.38.0    texinfo@7.0.3
autoconf-archive@2023.02.20  diffutils@3.7  gettext@0.22.3  libsigsegv@2.14  mpc@1.3.1    pkgconf@1.9.5  xz@5.4.1
automake@1.16.5              gawk@5.2.2     gmake@4.2.1     libtool@2.4.7    mpfr@4.2.0   readline@8.2   zlib-ng@2.1.4
berkeley-db@18.1.40          gcc@10.2.0     gmp@6.2.1       libxml2@2.10.3   ncurses@6.4  tar@1.30       zstd@1.5.5

-- linux-ubuntu20.04-skylake_avx512 / gcc@10.2.0 ----------------
autoconf@2.69                       flex@2.6.3        libnl@3.3.0           openssh@8.2p1         snappy@1.1.10
automake@1.16.5                     gdbm@1.23         libpciaccess@0.17     openssl@3.1.3         sqlite@3.43.2
berkeley-db@18.1.40                 gettext@0.22.3    libtool@2.4.7         parallelio@2.6.2      tar@1.30
bison@3.8.2                         gmake@4.2.1       libxcrypt@4.4.35      perl@5.38.0           ucx@1.14.1
bzip2@1.0.8                         hdf5@1.14.3       libxml2@2.10.3        pkgconf@1.9.5         util-linux-uuid@2.38.1
c-blosc@1.21.5                      hwloc@2.9.1       lz4@1.9.4             pmix@4.2.2            util-macros@1.19.3
ca-certificates-mozilla@2023-05-30  libaec@1.0.6      m4@1.4.18             py-docutils@0.20.1    xz@5.4.1
cmake@3.26.3                        libbsd@0.11.7     ncurses@6.4           py-pip@23.1.2         zlib-ng@2.1.4
curl@8.4.0                          libevent@2.1.12   netcdf-c@4.9.2        py-setuptools@68.0.0  zstd@1.5.5
diffutils@3.7                       libfabric@1.14.0  netcdf-fortran@4.5.3  py-wheel@0.41.2
esmf@8.4.2                          libffi@3.4.4      nghttp2@1.57.0        python@3.11.6
expat@2.5.0                         libiconv@1.17     numactl@2.0.14        rdma-core@41.0
findutils@4.7.0                     libmd@1.0.4       openmpi@4.1.1         readline@8.2
==> 89 installed packages
```

The working Intel environment (CentOS 7):

```
spack find --loaded
==> 8 loaded packages
-- linux-centos7-skylake_avx512 / intel@2020 --------------------
cmake@3.17.5  hdf5@1.12.2  intel-mpi@2020  m4@1.4.16  netcdf-c@4.8.1  netcdf-fortran@4.5.4  pkgconf@0.27.1  zlib@1.2.12
---
spack find
==> 17 installed packages
-- linux-centos7-skylake_avx512 / intel@2020 --------------------
antlr@2.7.7   expat@2.4.8  hdf5@1.12.2     libmd@1.0.4  netcdf-c@4.8.1        udunits@2.2.28
bison@3.0.4   flex@2.5.37  intel-mpi@2020  m4@1.4.16    netcdf-fortran@4.5.4  zlib@1.2.12
cmake@3.17.5  gsl@2.7.1    libbsd@0.11.5   nco@4.9.3    pkgconf@0.27.1
```

Note that the GNU environment used `spack external find` to register some system dependencies, while the Intel environment did not. Also, the Intel environment is built with Mellanox OFED for MPI, while the GNU environment uses libfabric. If you are interested in the detailed setup, the GNU environment is the official Docker image maintained by @yidant with slight modifications (the +hl and +fortran variants for hdf5 and netcdf-c). @1Dandan's Intel Docker image is built from a Compute1-supported base.

If I have time, I will try installing OFED in the GNU environment to see whether it fixes the problem. GNU.txt

1Dandan commented 6 months ago

To add for the intel environment, the ESMF version is v8.3.1.

lizziel commented 6 months ago

Not sure if this is related, but former GCST member Will Downs reported a bug in GCHP related to memory registration when using libfabric: https://github.com/geoschem/GCHP/issues/47

yuanjianz commented 6 months ago

Hi @lizziel, the 1-year GEOS-FP transport tracer run with raw winds is ready: http://geoschemdata.wustl.edu/ExternalShare/tt-geosfp-c24-raw-wind/

I noticed that after switching to the Intel environment, the memory leak disappears and the run is fast at first, but the simulation then slows to half the speed of Dandan's previous runs. From the timing diagnostics, Bracket in ExtData takes most of the time. I am not sure why this is happening.

lizziel commented 6 months ago

Hi @yuanjianz, I am looking at the results and the diagnostics look off, for both of our runs. The passive tracer restarts compare well, with differences on the order of 1e-6, but I think the diagnostic is getting corrupted. This may explain the slow-down.

lizziel commented 6 months ago

Strangely, I cannot reproduce the issue. I am doing another run with the new diagnostics turned off.

lizziel commented 6 months ago

I ran 1-month simulations using 14.2.2 and 14.3.1 with GEOS-FP processed files and get identical results except for st80_25 (as expected). I do not see a slow-down. I am trying to determine whether a constant value in every grid box of the monthly-mean passive tracer concentration makes sense; we also see this in version 14.2.2, and I am skeptical given the values in the internal state.

lizziel commented 6 months ago

Separate from this issue of constant values for passive tracer, I do see that the raw versus processed bug is fixed.

yuanjianz commented 6 months ago

Hi @lizziel, thanks for the update. You said you found corrupted diagnostics in your run as well. Do you think the new diagnostics caused the performance degradation on my end? It also seems to happen only for raw files, because my GEOS-IT preprocessed-wind fullchem benchmark using 14.3.1 with the new GCHPctmLevel* diagnostics did not show performance issues.

lizziel commented 6 months ago

Hi @yuanjianz, we expect the run with raw files to perform worse than with preprocessed files because there are many more files to read, and at high frequency. Do you see the same performance issue using 14.3.0 instead of 14.3.1?

yuanjianz commented 6 months ago

Hi @lizziel, thanks for the explanation. I haven't done the one with 14.3.0. I am just curious about the diagnostic corruption you mentioned above. What does it mean? Do you think I should turn off the new diagnostics in 14.3.1 and then rerun a performance test between the two versions?

lizziel commented 6 months ago

See https://github.com/geoschem/GCHP/issues/399 for discussion of the suspected Passive Tracer diagnostic issue. I am not going to worry about it much for now since it does not impact mass conservation tests (those use restart files) and is not recently introduced.

lizziel commented 6 months ago

Here is the global mass table for passive tracer for @yuanjianz's 2022 GEOS-FP run with raw GMAO fields and using winds in advection:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Global Mass of Passive Tracer in 14.3.1_GEOS-FP_raw_wind
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 Date        Mass [Tg]
 ----------  ----------------
 2022-01-01   17.6562799006358
 2022-02-01   17.6527063054427
 2022-03-01   17.6527047860219
 2022-04-01   17.6527058698120
 2022-05-01   17.6527059098902
 2022-06-01   17.6527058070735
 2022-07-01   17.6527057721245
 2022-08-01   17.6527057131759
 2022-09-01   17.6527056042564
 2022-10-01   17.6527059860906
 2022-11-01   17.6527059325285
 2022-12-01   17.6527059325285

 Summary
 ------------------------------
 Max mass =   17.6562799006358 Tg
 Min mass =   17.6527047860219 Tg
 Abs diff =    3575114613.909 g
 Pct diff =      0.0202525033 %

NOTE: The last month was not available so I copied Nov.
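The summary quantities in these tables follow directly from the monthly masses. Here is a small sketch confirming the arithmetic (1 Tg = 1e12 g; the percent difference appears to be taken relative to the minimum mass, inferred from the numbers above):

```python
def mass_summary(masses_tg):
    """Recompute the Abs diff and Pct diff summary lines from a list
    of monthly global masses in Tg."""
    max_tg, min_tg = max(masses_tg), min(masses_tg)
    abs_diff_g = (max_tg - min_tg) * 1e12           # Tg -> g
    pct_diff = (max_tg - min_tg) / min_tg * 100.0   # relative to min
    return abs_diff_g, pct_diff

# Using the Jan (max) and Mar (min) values from the raw-wind table:
abs_g, pct = mass_summary([17.6562799006358, 17.6527047860219])
```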

For comparison, here are results for the same run using processed winds. Note that both of these runs use dry pressure in advection.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Global Mass of Passive Tracer in 14.3.1_GEOS-FP_processed_wind
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 Date        Mass [Tg]
 ----------  ----------------
 2022-01-01   17.6562799006358
 2022-02-01   17.6527063301804
 2022-03-01   17.6527047587778
 2022-04-01   17.6527058170920
 2022-05-01   17.6527058567409
 2022-06-01   17.6527058000323
 2022-07-01   17.6527057656179
 2022-08-01   17.6527056823411
 2022-09-01   17.6527056193937
 2022-10-01   17.6527059680841
 2022-11-01   17.6527059053356
 2022-12-01   17.6527059056348

 Summary
 ------------------------------
 Max mass =   17.6562799006358 Tg
 Min mass =   17.6527047587778 Tg
 Abs diff =    3575141858.008 g
 Pct diff =      0.0202526576 %

yuanjianz commented 5 months ago

Hi @lizziel, the GEOS-FP raw mass flux run is ready now. Please check the link here: http://geoschemdata.wustl.edu/ExternalShare/tt-geosfp-raw-csmf/

lizziel commented 5 months ago

Thanks @yuanjianz. Here is the mass conservation table for your mass flux run:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Global Mass of Passive Tracer in 14.3.1_GEOS-FP_raw_mass_fluxes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 Date        Mass [Tg]
 ----------  ----------------
 2022-01-01   17.6562799006358
 2022-02-01   17.6527118519604
 2022-03-01   17.6527119102029
 2022-04-01   17.6527118174852
 2022-05-01   17.6527117827595
 2022-06-01   17.6527118234161
 2022-07-01   17.6527118121272
 2022-08-01   17.6527117025911
 2022-09-01   17.6527117019211
 2022-10-01   17.6527120035089
 2022-11-01   17.6527118178688
 2022-12-01   17.6527118540502

 Summary
 ------------------------------
 Max mass =   17.6562799006358 Tg
 Min mass =   17.6527117019211 Tg
 Abs diff =    3568198714.657 g
 Pct diff =      0.0202133178 %

Looks like the mass conservation issue with mass fluxes is fixed with the raw GMAO fields bug fix.

yuanjianz commented 3 days ago

Hi @lizziel @sdeastham, my recent mass flux fullchem benchmark is showing unreasonably higher surface aerosol concentrations than the wind run.

Looking back at Lizzie's previous GEOS-IT C180 mass flux vs. wind transport tracer simulations, it seems to be due to much weaker advection in the mass flux runs. Taking SF6 and Rn222 as examples (plots from Lizzie's comparison above):

[image: SF6 and Rn222 comparison plots]

*Plots are the annual mean of massflux − wind or massflux/wind.

My instinct is that a shift from winds to mass fluxes alone should not have such a large effect. As I recall, the Martin et al. (2022, GMD) GCHPv13 paper indicates that mass fluxes should be less damped than winds. I would appreciate your opinions on this, thanks!

sdeastham commented 3 days ago

Thanks @yuanjianz ! In your last post, are you saying that you think the shift from wind to mass flux should be having a smaller effect than this? That would be my expectation too - but I want to be sure we're on the same page. It does look to me like there has been a substantial reduction in vertical mixing, but the interesting thing is that this is exactly what we would expect. I'm curious - how do the horizontal mass fluxes compare between the wind and mass-flux simulations?