geoschem / gcpy

Python toolkit for GEOS-Chem. Contains basic plotting scripts, plus the suite of GEOS-Chem benchmarking utilities.
https://gcpy.readthedocs.io
Other
51 stars 24 forks source link

Added example script to validate GEOS-Chem photolysis reactions #338

Closed yantosca closed 1 month ago

yantosca commented 2 months ago

Name and Institution (Required)

Name: Bob Yantosca Institution: Harvard + GCST

Describe the update

This PR adds a new script gcpy/examples/geoschem/check_photolysis.py, which validates the GEOS-Chem photolysis mechanism by checking the species_database.yml, FJX_j2j.dat, and fullchem.eqn (or custom.eqn) files.

Expected changes

When calling the script with:

$ python -m gcpy.examples.geoschem.check_photolysis    \
   --spc-db-path  /path/to/species_database.yml        \
   --fjx-j2j-path /path/to/FJX_j2j.dat                 \
   --kpp-eqn-path /path/to/fullchem.eqn (or custom.eqn)

The following output is generated:

Using configuration file /n/holyscratch01/jacob_lab/ryantosca/tests/14.5.0/gc2457/GCClassic/src/GEOS-Chem/run/shared/species_database.yml
Number of entries in FJX_j2j.dat     : 177
Number of photo rxns in fullchem.eqn : 177
O3         uses PHOTOL(2  ) and has Is_Photolysis=True
O3         uses PHOTOL(3  ) and has Is_Photolysis=True
O2         uses PHOTOL(1  ) and has Is_Photolysis=True
NO2        uses PHOTOL(11 ) and has Is_Photolysis=True
H2O2       uses PHOTOL(9  ) and has Is_Photolysis=True
MP         uses PHOTOL(10 ) and has Is_Photolysis=True
CH2O       uses PHOTOL(7  ) and has Is_Photolysis=True
CH2O       uses PHOTOL(8  ) and has Is_Photolysis=True
HNO3       uses PHOTOL(16 ) and has Is_Photolysis=True
HNO2       uses PHOTOL(15 ) and has Is_Photolysis=True
HNO4       uses PHOTOL(17 ) and has Is_Photolysis=True
HNO4       uses PHOTOL(18 ) and has Is_Photolysis=True
NO3        uses PHOTOL(12 ) and has Is_Photolysis=True
NO3        uses PHOTOL(13 ) and has Is_Photolysis=True
N2O5       uses PHOTOL(14 ) and has Is_Photolysis=True
ALD2       uses PHOTOL(61 ) and has Is_Photolysis=True
ALD2       uses PHOTOL(62 ) and has Is_Photolysis=True
PAN        uses PHOTOL(59 ) and has Is_Photolysis=True
APAN       uses PHOTOL(59 ) and has Is_Photolysis=True
ACR        uses PHOTOL(66 ) and has Is_Photolysis=True
AROMCHO    uses PHOTOL(70 ) and has Is_Photolysis=True
RCHO       uses PHOTOL(70 ) and has Is_Photolysis=True
ACET       uses PHOTOL(76 ) and has Is_Photolysis=True
ACET       uses PHOTOL(77 ) and has Is_Photolysis=True
MEK        uses PHOTOL(69 ) and has Is_Photolysis=True
GLYC       uses PHOTOL(68 ) and has Is_Photolysis=True
GLYX       uses PHOTOL(72 ) and has Is_Photolysis=True
GLYX       uses PHOTOL(73 ) and has Is_Photolysis=True
GLYX       uses PHOTOL(74 ) and has Is_Photolysis=True
MGLY       uses PHOTOL(71 ) and has Is_Photolysis=True
MVK        uses PHOTOL(63 ) and has Is_Photolysis=True
MVK        uses PHOTOL(64 ) and has Is_Photolysis=True
MVK        uses PHOTOL(65 ) and has Is_Photolysis=True
MACR       uses PHOTOL(66 ) and has Is_Photolysis=True
HAC        uses PHOTOL(75 ) and has Is_Photolysis=True
PRPN       uses PHOTOL(79 ) and has Is_Photolysis=True
ETP        uses PHOTOL(80 ) and has Is_Photolysis=True
RA3P       uses PHOTOL(81 ) and has Is_Photolysis=True
RB3P       uses PHOTOL(82 ) and has Is_Photolysis=True
R4P        uses PHOTOL(83 ) and has Is_Photolysis=True
R7P        uses PHOTOL(83 ) and has Is_Photolysis=True
ALK4P      uses PHOTOL(83 ) and has Is_Photolysis=True
PP         uses PHOTOL(84 ) and has Is_Photolysis=True
RP         uses PHOTOL(85 ) and has Is_Photolysis=True
R4N2       uses PHOTOL(98 ) and has Is_Photolysis=True
R7N2       uses PHOTOL(98 ) and has Is_Photolysis=True
RNO3       uses PHOTOL(98 ) and has Is_Photolysis=True
ALK4N2     uses PHOTOL(98 ) and has Is_Photolysis=True
MAP        uses PHOTOL(99 ) and has Is_Photolysis=True
Br2        uses PHOTOL(23 ) and has Is_Photolysis=True
BrO        uses PHOTOL(28 ) and has Is_Photolysis=True
HOBr       uses PHOTOL(32 ) and has Is_Photolysis=True
BrNO3      uses PHOTOL(29 ) and has Is_Photolysis=True
BrNO3      uses PHOTOL(30 ) and has Is_Photolysis=True
BrNO2      uses PHOTOL(31 ) and has Is_Photolysis=True
CHBr3      uses PHOTOL(56 ) and has Is_Photolysis=True
CH2Br2     uses PHOTOL(55 ) and has Is_Photolysis=True
CH3Br      uses PHOTOL(50 ) and has Is_Photolysis=True
CH3Cl      uses PHOTOL(43 ) and has Is_Photolysis=True
CH2Cl2     uses PHOTOL(45 ) and has Is_Photolysis=True
BrCl       uses PHOTOL(33 ) and has Is_Photolysis=True
Cl2        uses PHOTOL(22 ) and has Is_Photolysis=True
ClO        uses PHOTOL(27 ) and has Is_Photolysis=True
OClO       uses PHOTOL(25 ) and has Is_Photolysis=True
Cl2O2      uses PHOTOL(26 ) and has Is_Photolysis=True
ClNO2      uses PHOTOL(21 ) and has Is_Photolysis=True
ClNO3      uses PHOTOL(19 ) and has Is_Photolysis=True
ClNO3      uses PHOTOL(20 ) and has Is_Photolysis=True
HOCl       uses PHOTOL(24 ) and has Is_Photolysis=True
CH3CCl3    uses PHOTOL(44 ) and has Is_Photolysis=True
CCl4       uses PHOTOL(42 ) and has Is_Photolysis=True
CFC11      uses PHOTOL(37 ) and has Is_Photolysis=True
CFC12      uses PHOTOL(38 ) and has Is_Photolysis=True
CFC113     uses PHOTOL(39 ) and has Is_Photolysis=True
CFC114     uses PHOTOL(40 ) and has Is_Photolysis=True
CFC115     uses PHOTOL(41 ) and has Is_Photolysis=True
HCFC123    uses PHOTOL(47 ) and has Is_Photolysis=True
HCFC141b   uses PHOTOL(48 ) and has Is_Photolysis=True
HCFC142b   uses PHOTOL(49 ) and has Is_Photolysis=True
HCFC22     uses PHOTOL(46 ) and has Is_Photolysis=True
H1301      uses PHOTOL(53 ) and has Is_Photolysis=True
H1211      uses PHOTOL(51 ) and has Is_Photolysis=True
H2402      uses PHOTOL(54 ) and has Is_Photolysis=True
ClOO       uses PHOTOL(101) and has Is_Photolysis=True
I2         uses PHOTOL(114) and has Is_Photolysis=True
HOI        uses PHOTOL(115) and has Is_Photolysis=True
IO         uses PHOTOL(116) and has Is_Photolysis=True
OIO        uses PHOTOL(117) and has Is_Photolysis=True
INO        uses PHOTOL(118) and has Is_Photolysis=True
IONO       uses PHOTOL(119) and has Is_Photolysis=True
IONO2      uses PHOTOL(120) and has Is_Photolysis=True
I2O2       uses PHOTOL(121) and has Is_Photolysis=True
CH3I       uses PHOTOL(122) and has Is_Photolysis=True
CH2I2      uses PHOTOL(123) and has Is_Photolysis=True
CH2ICl     uses PHOTOL(124) and has Is_Photolysis=True
CH2IBr     uses PHOTOL(125) and has Is_Photolysis=True
I2O4       uses PHOTOL(126) and has Is_Photolysis=True
I2O3       uses PHOTOL(127) and has Is_Photolysis=True
IBr        uses PHOTOL(128) and has Is_Photolysis=True
ICl        uses PHOTOL(129) and has Is_Photolysis=True
MPN        uses PHOTOL(103) and has Is_Photolysis=True
MPN        uses PHOTOL(104) and has Is_Photolysis=True
ATOOH      uses PHOTOL(97 ) and has Is_Photolysis=True
N2O        uses PHOTOL(36 ) and has Is_Photolysis=True
OCS        uses PHOTOL(34 ) and has Is_Photolysis=True
SO4        uses PHOTOL(100) and has Is_Photolysis=True
NO         uses PHOTOL(6  ) and has Is_Photolysis=True
PIP        uses PHOTOL(105) and has Is_Photolysis=True
ETHLN      uses PHOTOL(107) and has Is_Photolysis=True
MONITS     uses PHOTOL(111) and has Is_Photolysis=True
MONITU     uses PHOTOL(112) and has Is_Photolysis=True
HONIT      uses PHOTOL(113) and has Is_Photolysis=True
NITs       uses PHOTOL(130) and has Is_Photolysis=True
NITs       uses PHOTOL(131) and has Is_Photolysis=True
NIT        uses PHOTOL(132) and has Is_Photolysis=True
NIT        uses PHOTOL(133) and has Is_Photolysis=True
MENO3      uses PHOTOL(134) and has Is_Photolysis=True
ETNO3      uses PHOTOL(135) and has Is_Photolysis=True
IPRNO3     uses PHOTOL(136) and has Is_Photolysis=True
NPRNO3     uses PHOTOL(137) and has Is_Photolysis=True
HMHP       uses PHOTOL(86 ) and has Is_Photolysis=True
HPETHNL    uses PHOTOL(87 ) and has Is_Photolysis=True
PYAC       uses PHOTOL(88 ) and has Is_Photolysis=True
PROPNN     uses PHOTOL(89 ) and has Is_Photolysis=True
MVKHC      uses PHOTOL(90 ) and has Is_Photolysis=True
MVKHCB     uses PHOTOL(91 ) and has Is_Photolysis=True
MVKHP      uses PHOTOL(92 ) and has Is_Photolysis=True
MVKPC      uses PHOTOL(93 ) and has Is_Photolysis=True
MCRENOL    uses PHOTOL(94 ) and has Is_Photolysis=True
MCRHP      uses PHOTOL(95 ) and has Is_Photolysis=True
MACR1OOH   uses PHOTOL(96 ) and has Is_Photolysis=True
MVKN       uses PHOTOL(108) and has Is_Photolysis=True
MCRHN      uses PHOTOL(109) and has Is_Photolysis=True
MCRHNB     uses PHOTOL(110) and has Is_Photolysis=True
RIPA       uses PHOTOL(138) and has Is_Photolysis=True
RIPB       uses PHOTOL(139) and has Is_Photolysis=True
RIPC       uses PHOTOL(140) and has Is_Photolysis=True
RIPD       uses PHOTOL(141) and has Is_Photolysis=True
HPALD1     uses PHOTOL(142) and has Is_Photolysis=True
HPALD2     uses PHOTOL(143) and has Is_Photolysis=True
HPALD3     uses PHOTOL(144) and has Is_Photolysis=True
HPALD4     uses PHOTOL(145) and has Is_Photolysis=True
IHN1       uses PHOTOL(146) and has Is_Photolysis=True
IHN2       uses PHOTOL(147) and has Is_Photolysis=True
IHN3       uses PHOTOL(148) and has Is_Photolysis=True
IHN4       uses PHOTOL(149) and has Is_Photolysis=True
INPB       uses PHOTOL(150) and has Is_Photolysis=True
INPD       uses PHOTOL(151) and has Is_Photolysis=True
INPD       uses PHOTOL(152) and has Is_Photolysis=True
ICN        uses PHOTOL(106) and has Is_Photolysis=True
IDN        uses PHOTOL(78 ) and has Is_Photolysis=True
ICPDH      uses PHOTOL(153) and has Is_Photolysis=True
ICPDH      uses PHOTOL(154) and has Is_Photolysis=True
IDHDP      uses PHOTOL(155) and has Is_Photolysis=True
IDHPE      uses PHOTOL(156) and has Is_Photolysis=True
IDCHP      uses PHOTOL(157) and has Is_Photolysis=True
ITHN       uses PHOTOL(158) and has Is_Photolysis=True
ITHN       uses PHOTOL(159) and has Is_Photolysis=True
ITCN       uses PHOTOL(160) and has Is_Photolysis=True
ITCN       uses PHOTOL(161) and has Is_Photolysis=True
ETHP       uses PHOTOL(162) and has Is_Photolysis=True
BALD       uses PHOTOL(163) and has Is_Photolysis=True
BZCO3H     uses PHOTOL(164) and has Is_Photolysis=True
BENZP      uses PHOTOL(165) and has Is_Photolysis=True
NPHEN      uses PHOTOL(166) and has Is_Photolysis=True
PPN        uses PHOTOL(167) and has Is_Photolysis=True
APINP      uses PHOTOL(168) and has Is_Photolysis=True
PINAL      uses PHOTOL(169) and has Is_Photolysis=True
PINO3H     uses PHOTOL(170) and has Is_Photolysis=True
PINONIC    uses PHOTOL(171) and has Is_Photolysis=True
C96O2H     uses PHOTOL(172) and has Is_Photolysis=True
BPINP      uses PHOTOL(173) and has Is_Photolysis=True
BPINOOH    uses PHOTOL(174) and has Is_Photolysis=True
LIMO3H     uses PHOTOL(175) and has Is_Photolysis=True
LIMO2H     uses PHOTOL(176) and has Is_Photolysis=True
PIP        uses PHOTOL(105) and has Is_Photolysis=True
LIMAL      uses PHOTOL(177) and has Is_Photolysis=True

Output such as this denotes a succesfully-configured photolysis mechanism. All of the photo species should have Is_Photolysis: true in the species_database.yml file. Mismatches of species will generate an error message.

lizziel commented 2 months ago

@yantosca, thanks for putting this together. I have a few questions about the output.

  1. Why is number of entries in FJX_j2j.dat equal to the number of photo reactions in fullchem.eqn? There are cases where a photolyzing species in FJX_j2j.dat has one entry in that file but two equations in fullchem.eqn, e.g. PIP. There are also photolysis equations in fullchem.eqn that reuse J-values for a different species and are not in FJX_j2j.dat. I would think this would make "Number of entries in FJX_j2j.dat" not equal to "Number of photo rxns in fullchem.eqn" except by coincidence.

  2. Why are species that are not in FJX_j2j.dat included as Is_Photolysis=True. For example, R7P and ALK4P below are not in FJX_j2j.dat as shown by their reuse of R4P's J-values.

    R4P        uses PHOTOL(83 ) and has Is_Photolysis=True
    R7P        uses PHOTOL(83 ) and has Is_Photolysis=True
    ALK4P      uses PHOTOL(83 ) and has Is_Photolysis=True

    Any species that have Is_Photolysis=True but are not in FJX_j2j.dat will have output J-value diagnostics that are all zero.

  3. I thought we were going to remove Is_Photolysis from the species database since it is redundant information with FJX_j2j.dat. We could read the dat-file upon initialization and extract the unique number of species.

yantosca commented 2 months ago
  1. Why is number of entries in FJX_j2j.dat equal to the number of photo reactions in fullchem.eqn? There are cases where a photolyzing species in FJX_j2j.dat has one entry in that file but two equations in fullchem.eqn, e.g. PIP. There are also photolysis equations in fullchem.eqn that reuse J-values for a different species and are not in FJX_j2j.dat. I would think this would make "Number of entries in FJX_j2j.dat" not equal to "Number of photo rxns in fullchem.eqn" except by coincidence.

I will double check this. Like you said, it may just be coinicidence.

  1. Why are species that are not in FJX_j2j.dat included as Is_Photolysis=True. For example, R7P and ALK4P below are not in FJX_j2j.dat as shown by their reuse of R4P's J-values.
    R4P        uses PHOTOL(83 ) and has Is_Photolysis=True
    R7P        uses PHOTOL(83 ) and has Is_Photolysis=True
    ALK4P      uses PHOTOL(83 ) and has Is_Photolysis=True

    Any species that have Is_Photolysis=True but are not in FJX_j2j.dat will have output J-value diagnostics that are all zero.

You are correct. We should remove those.

  1. I thought we were going to remove Is_Photolysis from the species database since it is redundant information with FJX_j2j.dat. We could read the dat-file upon initialization and extract the unique number of species.

We will but this is a temporary solution for now.

lizziel commented 2 months ago

I made changes in GEOS-Chem to fix the J-value diagnostics resulting from mismatches in the files. I wrote my own script to figure out the mismatches to keep it simple. My approach is more command line style, amenable to notebooks. I wonder if we are going to put an example for this whether we should do that as well, so it is easier to read and do data analysis with.

import numpy as np
import yaml

# Get content of FJX_j2j.dat files (both fullchem and Hg)                                                                   
with open('FJX_j2j.dat') as file_object:
    data_j2j_fullchem = file_object.readlines()
with open('FJX_j2j.Hg.dat') as file_object:
    data_j2j_hg = file_object.readlines()
data_j2j = data_j2j_fullchem + data_j2j_hg

# Get unique species names across both files                                                                                
data_j2j_allcaps =[x.upper() for x in data_j2j]
j2j_species = [i.split()[1] for i in data_j2j_allcaps if "PHOTON" in i]
j2j_unique_species = np.unique(j2j_species)
print('\nNumber of unique species in FJX_j2j.dat (fullchem and Hg):',len(j2j_unique_species))

# Get species in species_data.yml with Is_Photolysis equal to True                                                          
with open('species_database.yml', 'r') as stream:
    data_db = yaml.safe_load(stream)
db_is_photol = [key for key, value in data_db.items() if value.get('Is_Photolysis') is True]
db_is_photol_allcaps = [x.upper() for x in db_is_photol]

# Store as sets                                                                                                             
j2j_unique_species_set = set(j2j_unique_species)
db_is_photol_set = set(db_is_photol_allcaps)

# Print FJX_j2j.dat species not is_photolysis=true in species_database.yml.                                                 
in_j2j_but_not_db = j2j_unique_species_set.difference(db_is_photol_set)
print('\nThe following species are in FJX_j2j.dat but do not have Is_Photolysis=True in species_database.yml. J-value diagn\ ostics will not be created for them. Check that these species are not photolyzed in the KPP equation file fullchem.eqn and other *.eqn files. If they are not then consider removing them from `FJX_j2j.dat`. If they are then `Is_Photolysis=True` must be added to the species database to output J-value diagnostics for them.')
for i in in_j2j_but_not_db:
    print(i)

# Print species that have is_photolysis=true but are not in FJX_j2j.dat.                                                    
in_db_but_not_j2j = db_is_photol_set.difference(j2j_unique_species_set)
print('\nThe following species are Is_Photolysis=True in species_database.yml but are not species in FJX_j2j.dat for Hg or 
fullchem. J-value diagnostics will be created for them but will be filled with all zeros. The species database should be updated to remove Is_Photolysis for these species.')
for i in in_db_but_not_j2j:
    print(i)

# Ideas for further checks:                                                                                                 
# More checks: Open fullchem.eqn and check that:                                                                            
#   - every index in FJX_j2j.dat is used as index in PHOTOL array                                                           
#   - every index in fullchem.eqn PHOTOL array is in FJX_j2j.dat                                                            
#   - Make a printout of each FJX_j2j.dat equation and which equation(s) in                                                 
#     fullchem.eqn use it. 

The output looks like this:

Number of unique species in FJX_j2j.dat (fullchem and Hg): 175

The following species are in FJX_j2j.dat but do not have Is_Photolysis=True in species_database.yml.
J-value diagnostics will not be created for them. Check that these species are not photolyzed in the
KPP equation file fullchem.eqn and other *.eqn files. If they are not then consider removing them from 
`FJX_j2j.dat`. If they are then `Is_Photolysis=True` must be added to the species database to output 
J-value diagnostics for them.
H2O
SO2
HO2
CF3I
H12O2

The following species are Is_Photolysis=True in species_database.yml but are not species in
FJX_j2j.dat for Hg or fullchem. J-value diagnostics will be created for them but will be filled
with all zeros. The species database should be updated to remove Is_Photolysis for these species.
HGCL
R7N2
O3ATBL
O3MT
R7P
ACR
O3USA
O3INIT
HG_OTHER_PROP
O3ASBL
O3_PROP
O3AFBL
O3NABL
APAN
ALK4N2
O3UT
AROMCHO
RNO3
O3STRAT
ALK4P
O3PCBL
O3EUBL
O3ROW
yantosca commented 1 month ago

Thanks @lizziel. If you want I can close this PR and have you add a new PR with your script.

lizziel commented 1 month ago

Okay, I am going to let the dust settle on the J-values diagnostics and species database before adding any example scripts to GCPy. We are still figuring out what changes will be made for the diagnostic and whatever is done will affect the checks needed. GCST can use local scripts to check and I don't think we need to share the scripts on GCPy just yet.