geoschem / GCHP

The "superproject" wrapper repository for GCHP, the high-performance instance of the GEOS-Chem chemical-transport model.
https://gchp.readthedocs.io

Scaling emissions based on a regional map file via HEMCO in GCHP v14 #299

Closed Twize closed 1 year ago

Twize commented 1 year ago

Name: Tyler Wizenberg Institution: University of Toronto

Confirm you have reviewed the following documentation

Description of your issue or question

Hi! I am currently trying to scale the CO biomass burning emissions from GFAS (using emission ratios in molar units) to approximate the emissions of some species not included in the GFAS inventory (e.g., formic acid, acetylene). To do this, I am trying to follow the first example on this page: https://hemco.readthedocs.io/en/latest/hco-ref-guide/more-examples.html. I have generated a file on a 0.25° x 0.25° lat-lon grid containing the 'basis regions' used by GFED4.1, which I intend to use for masking/applying the various scale factors in GFAS.
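For reference, building an integer basis-region mask like this can be sketched in Python with NumPy. The region rectangles and IDs below are purely illustrative, not the real GFED4.1 basis regions:

```python
import numpy as np

# Global 0.25 x 0.25 degree grid of cell centers (720 lat x 1440 lon).
lat = np.arange(-89.875, 90.0, 0.25)    # 720 cell centers
lon = np.arange(-179.875, 180.0, 0.25)  # 1440 cell centers

# Integer region-ID mask, 0 = Ocean everywhere by default.
regions = np.zeros((lat.size, lon.size), dtype=np.int32)

lon2d, lat2d = np.meshgrid(lon, lat)
# Illustrative rectangles standing in for BONA (ID 1) and TENA (ID 2):
regions[(lat2d >= 50) & (lat2d <= 75) & (lon2d >= -170) & (lon2d <= -50)] = 1
regions[(lat2d >= 25) & (lat2d <  50) & (lon2d >= -130) & (lon2d <= -60)] = 2

# The 2-D integer array would then be written to netCDF (e.g. with xarray
# or netCDF4), with the variable named to match the mask container that
# HEMCO_Config.rc refers to, here "basis_regions".
```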

(Attached image: basis_regions variable in GFED4.1_basis_regions_0.25x0.25.nc)

The values in the basis_regions variable in the file correspond to the different 'basis regions', for example, 0 = Ocean, 1 = Boreal North America, 2 = Temperate North America etc. GCHP seems to be able to read in this file without any issues. I then have a corresponding .txt file with the various scale factors that I want to apply to each of the regions (based on the 'scalefactors.txt' file from the HEMCO example):

HCOOH_GFAS_scalefactors.txt

However, when I add this HCOOH_GFAS_scalefactors.txt file to HEMCO_Config.rc, I get the following error at runtime:

HEMCO ERROR [0007]: Cannot open $ROOT/MASKS/HCOOH_GFAS_scalefactors.txt
 --> LOCATION: HCOIO_ReadCountryValues (hcoio_util_mod.F90)

HEMCO ERROR [0007]: ERROR 9
 --> LOCATION: HCOIO_ReadOther (HCOIO_UTIL_MOD.F90)

HEMCO ERROR [0007]: Error in HCOIO_ReadOther called from HEMCO ReadList_Fill: HCOOH_SCALEFACTORS
 --> LOCATION: ReadList_Fill (HCO_ReadList_Mod.F90)

HEMCO ERROR [0007]: Error in ReadList_Fill (1) called from HEMCO ReadList_Read
 --> LOCATION: ReadList_Read (HCO_ReadList_Mod.F90)
 Error in ReadList_Read called from hco_run

I don't believe it's a permissions issue, because I gave the file full r-w-x permissions. Is this functionality of HEMCO not available in GCHP? Or is GCHP/HEMCO/MAPL simply unable to read .txt files? If so, is there a simple workaround I could use?

I can split up the basis regions into their own masks if needed, but I figured that if this approach was possible, it would probably be the easiest.

Many thanks in advance!

GCHP version and relevant files

GCHP v14.1.1

GFED4.1 basis region file: GFED4.1_basis_regions_0.25x0.25.nc.zip
gchp.log: gchp.log
HEMCO Config: HEMCO_Config.rc.txt
ExtData.rc: ExtData.rc.txt

lizziel commented 1 year ago

Hi Tyler, MAPL does not have the ability to open text files for scaling. Try changing that file to netCDF.

lizziel commented 1 year ago

Also note you will need to add the new netCDF file to ExtData.rc. You can add it directly after the entry for GFAS_EMITL.

Twize commented 1 year ago

Hi @lizziel, thanks for the quick reply! I assume I will have to structure the new .nc file in a specific way such that HEMCO can interpret it and appropriately distribute the scale factors to the corresponding regions.

In the HEMCO example they state that the first line of the .txt file has to be the name of the shapefile container; the columns that follow are the names (although these are probably just to make the file easier for people to read), the IDs/reference values, and then the scale factors. I'm just a bit unsure of how this would translate into the data structure of a typical .nc file. Do the variables in the .nc file have to be named something specific so that HEMCO knows to look for them?
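To make the format concrete, a scale factors text file along the lines of the HEMCO example might look like the following. The container name and region names/IDs here match what later shows up in the log; the scale factor values themselves are invented for illustration:

```
# First non-comment line: container name of the gridded ID mask
GFED4_BASIS_REGION_MASK
OCEAN 0 1.0
BONA  1 1.8
TENA  2 1.5
```

Each subsequent line pairs a human-readable region name, the integer ID used in the mask file, and the scale factor(s) to apply where the mask equals that ID.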

yantosca commented 1 year ago

@Twize: I am not sure the country scaling has ever been tested in GCHP. It certainly works for GCClassic, but in GCHP all inputs come through ExtData, so it is more involved.

Twize commented 1 year ago

Hi @yantosca, ah that makes sense. I knew it seemed a bit too straightforward...

I guess the next best approach would probably be to separate the regions into their own individual masks and assign the scaling factors that way?

lizziel commented 1 year ago

Hi @Twize, sorry, I think I read through your issue a little quickly and didn't realize your text file was not gridded, and also that it is being read by HEMCO and not MAPL. I'm taking a look at HEMCO since it is curious it is failing during a subroutine that specifically reads the file as a text file (HCOIO_ReadOther), and which is also enabled within GCHP. We do read text files in GEOS-Chem and HEMCO in GCHP, e.g. the config files, so I don't think this should be failing as long as it is going down the proper logic paths to not use ExtData, and it seems it is.

Could you run with verbose and warnings set to 3 in HEMCO_Config.rc? You can also use fewer cores (6) to reduce the number of identical prints to the log.
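Assuming the usual layout of the settings section near the top of HEMCO_Config.rc in v14.1, that change would look something like:

```
Verbose:  3
Warnings: 3
```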

lizziel commented 1 year ago

Adding on to this, I wonder if $ROOT is not being evaluated in GCHP and hence it doesn't know where the text file is. That would make sense since we skip the file path entry with the assumption that all reads will be in GCHP. I can investigate that further.

Twize commented 1 year ago

> Adding on to this, I wonder if $ROOT is not being evaluated in GCHP and hence it doesn't know where the text file is. That would make sense since we skip the file path entry with the assumption that all reads will be in GCHP. I can investigate that further.

Unfortunately, I can't run with fewer cores than this, because on SciNet Niagara the nodes are 40 cores each, so using 1 or 2 nodes leaves the total number of cores not divisible by 6. I can try specifying the absolute file path instead of the path relative to $ROOT and see if that helps.

lizziel commented 1 year ago

Good idea. Let us know if that works. Regarding Scinet Niagara, you should be able to run a job requesting a full node but only use a subset of those in GCHP. Let me know if you would like some guidance on that. I see that the compute cluster uses SLURM so it is similar to our system at Harvard.
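A sketch of the underscription approach on a SLURM system: request the full 40-core node but launch GCHP on only a subset of cores whose count is divisible by 6 (the launcher command and executable name vary by site and run directory setup):

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40   # request the full Niagara node

# Launch GCHP on only 36 of the 40 cores so the total is divisible
# by 6, matching the core count configured for the run.
mpirun -np 36 ./gchp
```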

Twize commented 1 year ago

@lizziel I tried specifying the absolute file path to the HCOOH_GFAS_scalefactors.txt file, and it does seem to find the file now, but I get the following error at runtime:

At line 2345 of file /scratch/d/dylan/tylerw/GCHP_v14.1.1_full/src/GCHP_GridComp/HEMCO_GridComp/HEMCO/src/Core/hcoio_util_mod.F90 Fortran runtime error: Bad integer for item 1 in list input

So it would seem it may have issues parsing the .txt file and its contents. I tried commenting out the HEMCO_Config.rc line where I specify the HCOOH_GFAS_scalefactors.txt file, and the model runs to completion, so the error does seem to be related to this file.

Twize commented 1 year ago

Hi @lizziel, just a quick update: the previous Bad integer for item 1 in list input error was caused by me having the IDs specified as floats in the HCOOH_GFAS_scalefactors.txt file (I originally did this because I thought MAPL was reading the file, not HEMCO). Changing them back to integers lets me progress past this.

Just to see more info I switched the HEMCO verbose flag to 3, and started a new test simulation. Now, judging from the logs, it does actually seem to be reading the HCOOH_GFAS_scalefactors.txt file and pulling the values from it (and I assume applying the scale factors?), which is shown in the gchp.log:

Use country-specific values for HCOOH_SCALEFACTORS
- Source file: /project/d/dylan/ctm/ExtData/HEMCO/MASKS/HCOOH_GFAS_scalefactors.txt
- Use ID mask GFED4_BASIS_REGION_MASK
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for OCEAN ==> ID:           0
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for BONA ==> ID:           1
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for TENA ==> ID:           2
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for CEAM ==> ID:           3
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for NHSA ==> ID:           4
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for SHSA ==> ID:           5
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for EURO ==> ID:           6
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for MIDE ==> ID:           7
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for NHAF ==> ID:           8
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for SHAF ==> ID:           9
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for BOAS ==> ID:          10
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for CEAS ==> ID:          11
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for SEAS ==> ID:          12
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for EQAS ==> ID:          13
 Data was in units of count - unit conversion factor is    1.0000000000000000
 - Obtained values for AUST ==> ID:          14

It appears to be running and is producing output files. Whether or not it's correctly applying the scale factors to the various regions, I am not sure yet; I will probably have to do a longer simulation to check, but this does seem promising!
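As an offline sanity check, the per-region scaling HEMCO should be applying amounts to a masked multiply of the base emissions by the scale factor looked up from the region ID. A toy version (all numbers invented) looks like:

```python
import numpy as np

# Toy 2x3 grid: region IDs and base CO biomass-burning emissions.
region_id = np.array([[0, 1, 1],
                      [2, 2, 0]])
co_emis = np.array([[0.0, 10.0, 20.0],
                    [5.0,  5.0,  0.0]])

# Invented per-region emission ratios (HCOOH/CO), keyed by region ID.
scale = {0: 0.0, 1: 0.02, 2: 0.05}

# Build a scale-factor field from the ID mask, then scale the emissions.
sf = np.vectorize(scale.get)(region_id).astype(float)
hcooh_emis = co_emis * sf
```

Comparing a field computed this way against the model's scaled output for a short run would confirm the factors are landing in the right regions.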

lizziel commented 1 year ago

Excellent! I will close out this issue. If you find that the scale factors are applied incorrectly, you can create a new issue on the HEMCO GitHub.

lizziel commented 1 year ago

I updated the HEMCO docs for 14.2.0 to specify that (1) the path to the country scale factors file must be absolute in GCHP, and (2) the IDs in the file must be integers.