geoschem / geos-chem

GEOS-Chem "Science Codebase" repository. Contains GEOS-Chem science routines, run directory generation scripts, and interface code. This repository is used as a submodule within the GCClassic and GCHP wrappers, as well as in other modeling contexts (external ESMs).
http://geos-chem.org

[FEATURE REQUEST] Adding HTAPv3 as a global emission inventory #1301

Closed 1Dandan closed 1 year ago

1Dandan commented 2 years ago

Hi, I am writing to request that HTAPv3 be added to the GEOS-Chem model.

Original data and processing

The original HTAPv3 data were downloaded from https://edgar.jrc.ec.europa.eu/dataset_htap_v3

Readme file for HTAPv3: README.txt

HTAPv3 was processed using the sector definitions in the README file. NOx emissions have been converted to equivalent NO emission fluxes.

Processing code: Cal_edgar-HTAPv3_lumped_sectors_MonMean_2000-2018.py.txt chunking.sh.txt

Validation of HTAPv3 emissions

The HTAPv3 emissions are evaluated against the default global emission inventory, CEDS v2, using annual mean emission fluxes for 2018. For the comparison, the HTAPv3 emission fluxes are conservatively regridded to 0.5x0.5 resolution.
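For readers unfamiliar with conservative regridding of fluxes, here is a minimal sketch of the idea. It is not the tool used for the comparison (the post does not say which one was used). It assumes a regular lat-lon grid whose resolution is an integer multiple finer than the target, so that each coarse cell is the area-weighted mean of a block of fine cells; going from a 0.1 degree grid to 0.5 degree would correspond to factor=5, where the 0.1 degree native resolution is an assumption. The function names are hypothetical.

```python
import math


def cell_area_weights(lat_edges):
    """Relative area of each latitude band on a sphere.

    The area between two latitudes is proportional to
    sin(north edge) - sin(south edge).
    """
    return [
        math.sin(math.radians(lat_edges[j + 1])) - math.sin(math.radians(lat_edges[j]))
        for j in range(len(lat_edges) - 1)
    ]


def coarsen_flux(flux, lat_edges, factor):
    """Area-weighted mean of factor x factor blocks of a [lat][lon] flux grid.

    flux      : nested list, flux[j][i], in per-area units (e.g. kg/m2/s)
    lat_edges : latitude edges in degrees, length = number of rows + 1
    factor    : how many fine cells make up one coarse cell in each direction
    """
    nlat, nlon = len(flux), len(flux[0])
    if nlat % factor or nlon % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    w = cell_area_weights(lat_edges)
    coarse = []
    for j0 in range(0, nlat, factor):
        # Total relative area of one coarse cell in this latitude block
        block_area = sum(w[j0:j0 + factor]) * factor
        row = []
        for i0 in range(0, nlon, factor):
            block_mass = sum(
                flux[j][i] * w[j]
                for j in range(j0, j0 + factor)
                for i in range(i0, i0 + factor)
            )
            row.append(block_mass / block_area)
        coarse.append(row)
    return coarse
```

Because each coarse value is the block's total (flux times area) divided by the block's area, the global emission total is unchanged by the regridding, which is the property that matters when comparing inventories.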

Comparison results: https://wustl.box.com/s/ae916nc7sc1wusf87ms4cvhslb0jl0fe

Implementation of HTAPv3 in GCHP and species concentrations evaluation

The HTAPv3 inventory is used as the emission input in GCHP v13.2.1 at C48. I added a soft link to the HTAPv3 data in my run directory and modified the configuration files below accordingly.

Configuration files: ExtData.rc.txt HEMCO_Config.rc.txt HEMCO_Diagn.rc.txt

To evaluate species concentrations with HTAPv3, a second GCHP v13.2.1 C48 simulation was run with the default CEDSv2 emissions. The standard fullchem mechanism is used for both simulations, with GEOS-FP meteorology for the year 2018. The only difference between them is the global emission inventory. All other settings are left at their defaults.

Comparison results for annual mean species concentrations in 2018 from two inventories: https://wustl.box.com/s/ol6aw5ji071o5f8rkxgwgbivdphmplov

msulprizio commented 2 years ago

Thanks @1Dandan. I've marked this item as "Delivered and in the queue" on the Model development priorities page. We can aim to get this into 14.1.0.

Would it be possible to put the processed emissions files directly on the WashU data server (http://geoschemdata.wustl.edu/ExtData/HEMCO/)? @LiamBindle or @YanshunLi-washu may be able to help you.

1Dandan commented 2 years ago

Thanks. The HTAPv3 emission inventory should be accessible at: http://geoschemdata.wustl.edu/ExtData/HEMCO/HTAPv3/v2022-07/

msulprizio commented 2 years ago

Excellent. I see it there now. Thank you!

deepangsu commented 2 years ago

Do you have any idea why I get the following error when I use the HEMCO and ExtData files listed here?

HEMCO ERROR [0005]: Error encountered in "HCOX_Volcano_Init"! --> LOCATION:
-> at HCOX_INIT (in module HEMCO/Extensions/hcox_driver_mod.F90)

GEOS-Chem ERROR [0005]: Error encountered in "HCOX_Init"! --> LOCATION:
-> at HCOI_GC_Init (in module GeosCore/hco_interface_gc_mod.F90)

GEOS-Chem ERROR [0005]: Error encountered in "HCOI_GC_Init"! --> LOCATION:
-> at Emissions_Init (in module GeosCore/emissions_mod.F90)
pe=00005 FAIL at line=00533 gchp_chunk_mod.F90
pe=00005 FAIL at line=01857 Chem_GridCompMod.F90
pe=00005 FAIL at line=01844 MAPL_Generic.F90 <Error during the 'Initialize' stage of the gridded component 'GCHPchem'>
pe=00005 FAIL at line=01506 MAPL_Generic.F90
pe=00005 FAIL at line=00313 GCHP_GridCompMod.F90
pe=00005 FAIL at line=01844 MAPL_Generic.F90 <Error during the 'Initialize' stage of the gridded component 'GCHP'>
pe=00005 FAIL at line=00627 MAPL_CapGridComp.F90
pe=00005 FAIL at line=00920 MAPL_CapGridComp.F90
pe=00005 FAIL at line=00245 MAPL_Cap.F90
pe=00005 FAIL at line=00211 MAPL_Cap.F90
pe=00005 FAIL at line=00154 MAPL_Cap.F90
pe=00005 FAIL at line=00129 MAPL_Cap.F90
pe=00005 FAIL at line=00031 GCHPctm.F90

deepangsu commented 2 years ago

The entry

Volcano_Climatology : $ROOT/VOLCANO/v2021-09/so2_volcanic_emissions_CARN_v202005.degassing_only.rc

is missing from the HEMCO_Config.rc file. Including this entry makes it work.

LiamBindle commented 2 years ago

@deepangsu Please keep discussion to the titled topic. You should open a new issue. We keep discussions to 1 topic for the sake of archiving and keeping track of open issues.

yantosca commented 1 year ago

@1Dandan: I have taken over this feature request. Am pulling the data to Harvard. We might be able to compress the data so it takes up less space. I will also let you know if the netCDF files have any issues. Thanks for preparing this and sorry it took us a while to get to this.

yantosca commented 1 year ago

@Jourdan-He, @SaptSinha, @1Dandan: I have placed the HTAPv3 data in the file path: http://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO/HTAPv3/v2022-12, which is now subdivided into directories (2000 through 2018). You can now copy these files back to geoschemdata.wustl.edu at your convenience and remove HTAPv3/v2022-07 data.

Also, I noticed that the "calendar" attribute was misspelled (i.e. time:calender instead of time:calendar). I've fixed these files on the Harvard server.

Integration tests are now running.

yantosca commented 1 year ago

All GCHP integration tests passed:

==============================================================================
GCHP: Execution Test Results

Number of execution tests: 3
==============================================================================

Execution tests:
------------------------------------------------------------------------------
gchp_fullchem_benchmark_merra2_c48...............Execute Simulation.....PASS
gchp_fullchem_standard_merra2_c24................Execute Simulation.....PASS
gchp_TransportTracers_geosfp_c24.................Execute Simulation.....PASS

Summary of execution test results:
------------------------------------------------------------------------------
Execution tests passed:        3
Execution tests failed:        0
Execution tests not completed: 0

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%  All execution tests passed!  %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1Dandan commented 1 year ago

Great! Thanks @yantosca for implementing and testing it. Are the files the same as those under /HEMCO/HTAPv3/v2022-12 on the Globus endpoint of GEOS-Chem Data (Harvard)? I have finished transferring them through Globus to http://geoschemdata.wustl.edu/ExtData/HEMCO/HTAPv3/v2022-12/.

yantosca commented 1 year ago

Yes, those are the right files @1Dandan. Sorry for the late reply, I didn't see your comment right away.

yantosca commented 1 year ago

@1Dandan @Jourdan-He @SaptSinha @msulprizio @lizziel: I was updating the isCoards script and I discovered that the HTAPv3 files have a couple of issues that would make them not able to be read into GCHP:

The following items DO NOT ADHERE to the COARDS standard:
---------------------------------------------------------------------------
-> time[0] != 0 (this is required for GCHP)
-> "time:units" contains decimals (problem for GCHP)

I can take care of updating the files but we'll need to download them to WashU again. I will let you know when they are ready.

1Dandan commented 1 year ago

Thanks @yantosca for checking the files.

I am setting the files with reference time at 2000-01-01 00:00 (the value for time of 2000-01-01 00:00 is 0) for all files from year 2000 to 2018. By time[0] != 0, do you mean I need to set every file with time[0] = 0? Say for each file for year 2001, I need to set time 0 at 2001-01-01?

I am just wondering what is the correct way to do it for the sake of future file processing.

yantosca commented 1 year ago

Thanks @1Dandan. Yes, for GCHP, the first time in the file must always be zero. For example, if the time:units = hours since 2010-01-01 00:00:00, then the first time value in the file must be 0 (so that the first time point is 2010-01-01 00:00:00).
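As an illustration of the two GCHP-specific checks listed above (first time value equal to zero, no decimals in time:units), here is a minimal sketch in plain Python. It is not the isCoards script and does not read netCDF files: it only inspects a units string and a list of time values that the caller has already extracted. The function name is hypothetical, and treating "contains decimals" as "a decimal point between two digits in the units string" is an assumption about what that check means.

```python
import re


def gchp_time_issues(units, times):
    """Return a list of problems found; an empty list means none were found.

    units : the time:units attribute, e.g. "days since 2018-01-01 00:00:00"
    times : the values of the time variable, in file order
    """
    issues = []
    if not times:
        issues.append("time axis is empty")
    elif times[0] != 0:
        issues.append("time[0] != 0")
    # Assumed meaning of "decimals": a fractional part in the reference
    # timestamp, such as "00:00:00.0"
    if re.search(r"\d\.\d", units):
        issues.append('"time:units" contains decimals')
    return issues


# A 2018 file whose times are counted from 2000-01-01 is flagged
print(gchp_time_issues("days since 2000-01-01 00:00:00", [6575.0, 6606.0]))
# The same months counted from the start of 2018 pass both checks
print(gchp_time_issues("days since 2018-01-01 00:00:00", [0.0, 31.0]))
```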

I've just updated the isCoards script in the https://github.com/geoschem/netcdf-scripts repo to check for issues that would render a netCDF file unreadable by GCHP.

Also, I re-uploaded the ExtData/HEMCO/HTAPv3/v2022-12 folder to Harvard. The files should now be OK for GCHP; all you have to do is grab them.

yantosca commented 1 year ago

@1Dandan: Also note: GCClassic doesn't care if the first time point in a netCDF file is zero. But because GCHP uses the MAPL library, that is an extra restriction.

However, it is probably good practice for the first time point in the file to match the time:units value, or else it could potentially lead to confusion.

1Dandan commented 1 year ago

Hi @yantosca, thanks for the explanation, but I am still confused. As the unit I used for time in all files is days since 2000-01-01 00:00:00, the value of the time variable at 2000-01-01 is 0. For files in year 2018, the value of time[0] is the number of days after 2000-01-01. I also observed a similar practice for CEDSv2 (v2021-06), where the unit for the time variable is days since 1950-1-1 00:00:00 and only the value for 1950-1-1 is zero. So for files in other years say 2019, the value for time variable time[0] is not zero but the actual days after 1950-1-1.

yantosca commented 1 year ago

"So for files in other years say 2019, the value for time variable time[0] is not zero but the actual days after 1950-1-1."

And this is what GCHP will not allow.

For 2018-01-01, time:units should be: time:units = days since 2018-01-01 00:00:00
For 2017-01-01, time:units should be: time:units = days since 2017-01-01 00:00:00
...
For 2000-01-01, time:units should be: time:units = days since 2000-01-01 00:00:00
...
For 1950-01-01, time:units should be: time:units = days since 1950-01-01 00:00:00
... etc.
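One way to bring an existing time axis into this form is to move the reference date forward to the first time point and subtract that offset from every value. The sketch below shows the arithmetic only, under these assumptions: the units are of the form "days since YYYY-MM-DD hh:mm:ss", the calendar is standard (Gregorian), and the values have already been read from the file. The function name is hypothetical, and writing the result back to the netCDF file is left out.

```python
from datetime import datetime, timedelta


def rebase_days(units, times):
    """Shift a 'days since <reference>' axis so that its first value is 0.

    Returns (new_units, new_times). The absolute dates are unchanged:
    the reference moves forward by times[0] days and every value is
    reduced by the same amount.
    """
    prefix = "days since "
    if not units.startswith(prefix):
        raise ValueError("expected units like 'days since YYYY-MM-DD hh:mm:ss'")
    ref = datetime.strptime(units[len(prefix):], "%Y-%m-%d %H:%M:%S")
    new_ref = ref + timedelta(days=times[0])
    new_units = prefix + new_ref.strftime("%Y-%m-%d %H:%M:%S")
    new_times = [t - times[0] for t in times]
    return new_units, new_times


# First three monthly time points of 2018, counted from 2000-01-01
new_units, new_times = rebase_days("days since 2000-01-01 00:00:00",
                                   [6575.0, 6606.0, 6634.0])
print(new_units)  # days since 2018-01-01 00:00:00
print(new_times)  # [0.0, 31.0, 59.0]
```

The same function also accepts a reference written without zero padding, such as the "days since 1950-1-1 00:00:00" used in the CEDSv2 files, and writes the new reference date zero-padded.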

1Dandan commented 1 year ago

But isn't CEDSv2 already implemented and used properly in current GCHP simulations? Or is this a restriction in newer versions of GCHP?

One emission file example for CEDSv2 on WashU Compute1 cluster:

netcdf file:/Volumes/rvmartin/Active/GEOS-Chem-shared/ExtData/HEMCO/CEDS/v2021-06/2019/ALD2-em-anthro_CMIP_CEDS_2019.nc {
  dimensions:
    time = UNLIMITED;   // (12 currently)
    lon = 720;
    lat = 360;
  variables:
    double time(time=12);
      :standard_name = "time";
      :long_name = "time";
      :units = "days since 1950-1-1 00:00:00";
      :calendar = "standard";
      :axis = "T";
      :_ChunkSizes = 1U; // uint

and the values of the time variable: (image)

msulprizio commented 1 year ago

The PR for this update is now merged into dev/14.1.0.