geoschem / geos-chem

GEOS-Chem "Science Codebase" repository. Contains GEOS-Chem science routines, run directory generation scripts, and interface code. This repository is used as a submodule within the GCClassic and GCHP wrappers, as well as in other modeling contexts (external ESMs).
http://geos-chem.org

[BUG/ISSUE] Parallelization issues in GEOS-Chem Classic simulations #1637

Open yantosca opened 1 year ago

yantosca commented 1 year ago

What institution are you from?

GCST

Description of the problem

The new GEOS-Chem Classic parallelization test capability (see PR #1565 and PR #1636) has revealed that some GEOS-Chem Classic simulations have bugs in parallel loops. The most likely cause is that certain loop variables are not declared !$OMP PRIVATE.
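For readers unfamiliar with this class of bug, here is a toy model of why a scratch variable that is not thread-private makes results depend on the thread count. It is a deterministic Python simulation, not GEOS-Chem or OpenMP code: the function `run_loop`, its arguments, and the round-robin interleaving are all assumptions made for illustration.

```python
def run_loop(n_items, n_threads, private):
    """Toy model of a parallel loop that computes out[i] = 2*i via a scratch variable.

    Each simulated thread handles iterations tid, tid + n_threads, ...
    Threads are stepped round-robin, one statement at a time, to mimic
    interleaved execution. This is an illustrative assumption, not how
    a real OpenMP runtime schedules threads.
    """
    shared = {"tmp": 0}
    out = [None] * n_items

    def worker(tid):
        # With private=True each thread gets its own scratch storage,
        # which is what declaring the variable !$OMP PRIVATE achieves.
        scratch = {"tmp": 0} if private else shared
        for i in range(tid, n_items, n_threads):
            scratch["tmp"] = 2 * i   # statement 1: write the scratch variable
            yield                    # another thread may run here
            out[i] = scratch["tmp"]  # statement 2: read it back
            yield

    workers = [worker(t) for t in range(n_threads)]
    while workers:
        for w in list(workers):
            try:
                next(w)
            except StopIteration:
                workers.remove(w)
    return out


# Same thread counts as the two runs in the test below (24 and 13).
shared_24 = run_loop(50, 24, private=False)
shared_13 = run_loop(50, 13, private=False)
private_24 = run_loop(50, 24, private=True)
private_13 = run_loop(50, 13, private=True)
```

In this model the private runs agree with each other and with the serial answer, while the shared-scratch runs differ from one another. That difference between two thread counts is the kind of discrepancy a parallelization test is designed to catch.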

==============================================================================
GEOS-Chem Classic: Parallelization Test Results

GCClassic #dbc82b4 GEOS-Chem submodule updates: Add operational example run scripts for UCI Australia supercomputer Gadi
GEOS-Chem #75eccc02c Further edits in GCClasic and GCHP README.md files for PR #1565
HEMCO     #7de865f Merge branch 'feature/retire_wli_metfield' into dev/3.6.0

1st run uses 24 OpenMP threads
2nd run uses 13 OpenMP threads
Number of parallelization tests: 22

Submitted as SLURM job: 40894606
==============================================================================

Parallelization tests:
------------------------------------------------------------------------------
gc_05x0625_NA_47L_merra2_CH4........................Execute Simulation....PASS
gc_4x5_47L_merra2_fullchem..........................Execute Simulation....PASS
gc_4x5_47L_merra2_fullchem_TOMAS15..................Execute Simulation....FAIL
gc_4x5_merra2_aerosol...............................Execute Simulation....PASS
gc_4x5_merra2_CH4...................................Execute Simulation....PASS
gc_4x5_merra2_fullchem..............................Execute Simulation....PASS
gc_4x5_merra2_fullchem_aciduptake...................Execute Simulation....PASS
gc_4x5_merra2_fullchem_APM..........................Execute Simulation....FAIL
gc_4x5_merra2_fullchem_benchmark....................Execute Simulation....PASS
gc_4x5_merra2_fullchem_complexSOA...................Execute Simulation....PASS
gc_4x5_merra2_fullchem_complexSOA_SVPOA.............Execute Simulation....PASS
gc_4x5_merra2_fullchem_LuoWd........................Execute Simulation....FAIL
gc_4x5_merra2_fullchem_marinePOA....................Execute Simulation....PASS
gc_4x5_merra2_fullchem_RRTMG........................Execute Simulation....PASS
gc_4x5_merra2_Hg....................................Execute Simulation....FAIL
gc_4x5_merra2_metals................................Execute Simulation....PASS
gc_4x5_merra2_POPs_BaP..............................Execute Simulation....PASS
gc_4x5_merra2_tagCH4................................Execute Simulation....PASS
gc_4x5_merra2_tagCO.................................Execute Simulation....PASS
gc_4x5_merra2_tagO3.................................Execute Simulation....PASS
gc_4x5_merra2_TransportTracers......................Execute Simulation....PASS
gc_4x5_merra2_TransportTracers_LuoWd................Execute Simulation....PASS

Summary of test results:
------------------------------------------------------------------------------
Parallelization tests passed: 18
Parallelization tests failed: 4
Parallelization tests not yet completed: 0
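The counts in the summary can be cross-checked against the per-test lines. The following is a minimal sketch that assumes the log format shown above (test name, a run of dots, "Execute Simulation", more dots, then PASS or FAIL). The names `RESULT_LINE`, `tally`, and `SAMPLE` are hypothetical and are not part of the GEOS-Chem test scripts.

```python
import re

# Matches lines such as
# "gc_4x5_merra2_Hg......Execute Simulation....PASS"
RESULT_LINE = re.compile(r"^(\S+?)\.+Execute Simulation\.+(PASS|FAIL)\s*$")


def tally(log_text):
    """Return (passed, failed) lists of test names found in log_text."""
    passed, failed = [], []
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            name, status = match.groups()
            (passed if status == "PASS" else failed).append(name)
    return passed, failed


# Three lines copied from the results above, used as sample input.
SAMPLE = """\
gc_05x0625_NA_47L_merra2_CH4........................Execute Simulation....PASS
gc_4x5_47L_merra2_fullchem_TOMAS15..................Execute Simulation....FAIL
gc_4x5_merra2_Hg....................................Execute Simulation....FAIL
"""

passed, failed = tally(SAMPLE)
print(f"passed: {len(passed)}, failed: {len(failed)}")
for name in failed:
    print("FAIL:", name)
```

Running `tally` on the full results text lists the failing simulations by name, which is easier to act on than the totals alone.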

We will try to correct these issues for 14.1.1.

yantosca commented 1 year ago

The differences in the Hg simulation were traced to an improperly-parallelized DO loop in GeosCore/ocean_mercury_mod.F90. This is now fixed in commit 426dde3d8.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent the Stale bot from closing this issue.

stale[bot] commented 1 year ago

Closing due to inactivity

yantosca commented 1 year ago

Reopened this issue, which was closed by the stale bot, because there are still parallelization issues to solve.