JGCRI / gcam-core

GCAM -- The Global Change Analysis Model
http://jgcri.github.io/gcam-doc/

Getting v7 running with gcambreakout #393

Closed dobrien13 closed 4 months ago

dobrien13 commented 4 months ago

Good morning GCAM team,

I'm trying to get the United Kingdom broken out from the EU in GCAM v7. I've used the gcambreakout package, creating a new region called United Kingdom (ID 33) with only the UK pulled out from EU-15 (ID 13). This runs successfully.

Now, when running driver() in GCAM Data System v5.1, which worked before the region breakout, I get the following error. Any advice on this would be really helpful.

. . .
[1] "module_energy_L154.transportation_UCD"
[1] "- make 79.01"
[1] "module_energy_L2011.ff_ALL_R_C_Y"
[1] "- make 0.30"
[1] "module_energy_L202.Ccoef"
[1] "- make 0.02"
[1] "module_energy_L210.resources"
Error in mutate() at ei-gcamdata/R/utils-data.R:32:3:
ℹ In argument: environCost = if_else(resource == "coal" & region %in% L210.low_reg, 0, environCost).
Caused by error in if_else():
! false must be a vector, not NULL.
Run rlang::last_trace() to see where the error occurred.

[Update] This error is still occurring even after I've reverted the changes to the input data. Could it be something to do with the R packages that needed to be updated for gcambreakout?

pkyle commented 4 months ago

This is a bug in driver() in GCAM 7 (I think driver_drake() might work OK) that will be corrected in the next release; until then you can just either merge or cherry-pick a couple of commits from my forked repo. In the commands below, I'm just naming my forked repo github_pkyle but you could name it whatever you want:

git remote add github_pkyle https://github.com/pkyle/gcam-core.git
git fetch github_pkyle
git merge gpk/bugfix/gcam7_limitsfix

Or alternatively:

git remote add github_pkyle https://github.com/pkyle/gcam-core.git
git fetch github_pkyle
git cherry-pick f3377fc
git cherry-pick 9f40004

dobrien13 commented 4 months ago

Thanks a lot -- pulling the modified zenergy_L210.resources.R file from your repo did the trick. Unfortunately, on the UK-broken-out branch, I'm now getting this issue in driver_drake(). Any ideas?

▶ target L162.bio_YieldRate_R_Y_GLU_irr
▶ target module_water_L171.desalination
✖ fail module_water_L171.desalination
Error: target module_water_L171.desalination failed.
diagnose(module_water_L171.desalination)$error$message:
  left_join_no_match: NA values in new data columns
diagnose(module_water_L171.desalination)$error$calls:
  gcamdata:::module_water_L171.desalination("MAKE", c(common.iso_GCAM_regID, aglu.LDS.Land_type_area_ha, water.A71.globaltech_coef, water.AusNWC_desal_techs, water.EFW_mapping, water.aquastat_ctry, water.basin_to_country_mapping, water.DesalData_capacity_basin, water.FAO_desal_AQUASTAT, water.FAO_desal_missing_AQUASTAT, water.nonirrigation_withdrawal, L1011.en_bal_EJ_R_Si_Fi_Yh))
  full_join(L171.out_km3_R_desal_tech_Yh, select(EFW_mapping_desal, sector, fuel, technology), by = "technology") %>% left_join_error_no_match(L171.desal_fuel_shares, by = c("GCAM_region_ID", "fuel", "technology", "year")) %>% mutate(desal_km3 = value * share) %>% select(GCAM_region_ID, sector, fuel, technology, year, "desal_km3")
  select(., GCAMregion

pkyle commented 4 months ago

I've heard of this error happening when you break out a country in the country-to-region mapping files inst/extdata/common/*.csv but don't have the underlying country-level energy data from the IEA Energy Balances. That's a proprietary dataset and as such can't be distributed. Instead gcamdata contains a binary file data/PREBUILT_DATA.rda with all necessary energy (and emissions) data pre-aggregated by GCAM model region. This specific error just happens to be the first of many such errors that would occur as the model tries to create the additional region, but without any underlying data on energy production and consumption. Because of the proprietary data issue, external (i.e. outside of PNNL/UMd) users don't have the capability to modify the default country-to-model-region mapping unless they purchase the IEA World Energy Balances.

dobrien13 commented 4 months ago

That did it -- we do have the IEA data, and I had just forgotten to add it into the extdata/energy folder.

Next update (thanks for sticking with me through these issues...) -- I'm trying two different ways to get this running and both are stuck. The first, using the gcambreakout function, is stuck on the following:

▶ target module_energy_L144.building_det_en
▶ target module_aglu_resbio_input_IRR_MGMT_xml
▶ target module_aglu_L125.LC_tot
✖ fail module_aglu_L125.LC_tot
Error: target module_aglu_L125.LC_tot failed.
diagnose(module_aglu_L125.LC_tot)$error$message:
  ERROR: Interannual fluctuation in global land cover exceeds tolerance threshold of 0.005
diagnose(module_aglu_L125.LC_tot)$error$calls:
  gcamdata:::module_aglu_L125.LC_tot("MAKE", c(L120.LC_bm2_R_UrbanLand_Yh_GLU, L120.LC_bm2_R_Tundra_Yh_GLU, L120.LC_bm2_R_RckIceDsrt_Yh_GLU, L122.LC_bm2_R_HarvCropLand_Yh_GLU, L122.LC_bm2_R_OtherArableLand_Yh_GLU, L123.LC_bm2_R_MgdPast_Yh_GLU, L123.LC_bm2_R_MgdFor_Yh_GLU, L124.LC_bm2_R_Shrub_Yh_GLU_adj, L124.LC_bm2_R_Grass_Yh_GLU_adj, L124.LC_bm2_R_UnMgdPast_Yh_GLU_adj, L124.LC_bm2_R_UnMgdFor_Yh_GLU_adj))
  stop("ERROR: Interannual fluctuation in global land cover exceeds tolerance threshold of ", aglu.LAND_TOLERANCE)

where the tibble of interest looks like:

# A tibble: 2,723 × 6
# Groups:   GCAM_region_ID, GLU [180]
   GCAM_region_ID GLU     year    value change_rate   change
 1              1 GLU007  2007 1264.          0.994 -7.70
 2              1 GLU023  2007  130.          0.994 -0.834
 3              1 GLU027  1975    0.180       0.994 -0.00116
 4              1 GLU027  1976    0.182       1.01   0.00174
 5              1 GLU027  1977    0.184       1.01   0.00233
 6              1 GLU027  1978    0.187       1.02   0.00287
 7              1 GLU027  1979    0.188       1.01   0.00152
 8              1 GLU027  1980    0.189       1.01   0.00127
 9              1 GLU027  1981    0.191       1.01   0.00102
10              1 GLU027  1982    0.192       1.01   0.00159
# ℹ 2,713 more rows
# ℹ Use `print(n = ...)` to see more rows

-----------------------------------------------------------------------

The second method I'm trying is to manually edit the files listed in gcambreakout and work through them as needed. On that branch, I'm stuck at the following:

▶ target module_gcamusa_L101.EIA_SEDS
▶ target module_energy_L1011.ff_GrossTrade
✖ fail module_energy_L1011.ff_GrossTrade
Error: target module_energy_L1011.ff_GrossTrade failed.
diagnose(module_energy_L1011.ff_GrossTrade)$error$message:
  left_join_no_match: NA values in new data columns
diagnose(module_energy_L1011.ff_GrossTrade)$error$calls:
  gcamdata:::module_energy_L1011.ff_GrossTrade("MAKE", c(common.iso_GCAM_regID, common.GCAM_region_names, emissions.A_PrimaryFuelCCoef, energy.fuel_carbon_content, energy.mappings.comtrade_countrycode_ISO, energy.mappings.comtrade_commodity_code, energy.mappings.comtrade_trade_flow, energy.comtrade_ff_trade, energy.GCAM_region_pipeline_bloc_import, energy.GCAM_region_pipeline_bloc_export))
  L1011.comtrade_ff_BiTrade_y_ctry_item_FULL %>% filter(GCAM_Commodity_traded == "gas pipeline") %>% group_by(GCAM_region_ID_exporter = reporter_GCAM_region_ID, GCAM_region_ID_importer = partner_GCAM_region_ID, year, GCAM_Commodity, GCAM_Commodity_traded) %>% summarise(value = sum(value)) %>% ungroup() %>% left_join_error_no_match(GCAM_regio

I've seen a similar issue before and believe it had something to do with dbl vs. int data types in a column, but I'm not positive. Thanks for all the help; appreciate you!

pkyle commented 4 months ago

I don't know what's going on with the interannual fluctuation in land areas; the fact that it isn't happening in your manual breakout method suggests that the automated breakout isn't working correctly. It's not something I'd advise spending much time digging into. The error messages from the second method indicate that perhaps your new model region isn't being added to a gas trade region, or to some other mapping file used by the fossil fuel trade code. The gcambreakout package isn't officially maintained as part of gcam-core, and probably just hasn't been updated since gas trade was added. I'd say just check all of the mapping files in the dependencies of that code, module_energy_L1011.ff_GrossTrade, and make sure your new region is mapped appropriately to a trading bloc. Running that code in debug mode can help to work through it, but note that the issue may also be upstream of that code.
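One rough way to do the mapping check described above is to scan a folder of CSV mapping files and list any file that never mentions the new region. The sketch below is only an illustration, not part of gcamdata: the `files_missing_region` helper, the demo file names, and the `region,pipeline` column layout are all made up here, and the real mapping files may key on region ID rather than region name.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every CSV in a directory that never
# contains the given region as a whole comma-separated field.
files_missing_region() {
  dir=$1
  region=$2
  for f in "$dir"/*.csv; do
    [ -e "$f" ] || continue
    # Skip '#' comment lines; awk exits 0 only if the region was found.
    if ! awk -F',' -v r="$region" '
        /^#/ { next }
        { for (i = 1; i <= NF; i++) if ($i == r) { found = 1; exit } }
        END { exit !found }' "$f"; then
      basename "$f"
    fi
  done
}

# Demo on throwaway files (contents are invented for illustration only).
demo=$(mktemp -d)
printf '# comment line\nregion,pipeline\nEU-15,EUR\nUnited Kingdom,EUR\n' > "$demo/bloc_import.csv"
printf 'region,pipeline\nEU-15,EUR\n' > "$demo/bloc_export.csv"
missing=$(files_missing_region "$demo" "United Kingdom")
echo "$missing"
rm -rf "$demo"
```

A file showing up in this list is not necessarily wrong, since not every mapping file is keyed by region, but it narrows down which inputs to open by hand. Quoted fields or Windows line endings would need extra handling.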

dobrien13 commented 4 months ago

Hi @pkyle -- thank you for the help so far! Moving along steadily here with debugging. I'm on to the CEDS emissions module and have a relatively vague run error (see below). I'm having difficulty diagnosing it as there are so many left_joins in this module. I looked through each of the csv input files required for the module and saw no issues, so it could be due either to a preceding module or to the code. I read online that "Users that want to build using CEDS raw data, for example to build for different regional aggregations, will need to generate CEDS data using the open-source CEDS system." Is that what's going on here? The CEDS system looks out of date relative to v7; is there any issue there?

Thank you, always!

▶ target elec_segments_water_USA.xml
▶ target module_emissions_L112.ceds_ghg_en_R_S_T_Y
✖ fail module_emissions_L112.ceds_ghg_en_R_S_T_Y
Error: target module_emissions_L112.ceds_ghg_en_R_S_T_Y failed.
diagnose(module_emissions_L112.ceds_ghg_en_R_S_T_Y)$error$message:
  left_join_no_match: NA values in new data columns
diagnose(module_emissions_L112.ceds_ghg_en_R_S_T_Y)$error$calls:
  gcamdata:::module_emissions_L112.ceds_ghg_en_R_S_T_Y("MAKE", c(common.GCAM_region_names, common.iso_GCAM_regID, emissions.mappings.CEDS_sector_tech_proc, emissions.mappings.CEDS_sector_tech_proc_revised, energy.mappings.UCD_techs, emissions.mappings.UCD_techs_emissions_revised, energy.calibrated_techs, energy.calibrated_techs_bld_det, emissions.mappings.Trn_subsector, emissions.mappings.Trn_subsector_revised, emissions.CEDS.CEDS_sector_tech_combustion, emissions.CEDS.CEDS_sector_tech_combustion_revised, emissions.EPA_FCCC_IndProc_2005, emissions.mappings.calibrated_outresources, emissions.CEDS.gains_iso_sector_emissions, emissions.CEDS.gains_iso_fuel_emissions, emissions.IP

pkyle commented 4 months ago

My assumption would be that the CEDS emissions files (e.g., extdata/emissions/CEDS/BC_total_CEDS_emissions.csv) need to be added to the workspace. Without these 10 files, which aren't distributable due to their close association with the proprietary IEA World Energy Balances, the emissions data processing will rely on the PREBUILT_DATA (in this case, the outputs of module_emissions_L102.nonco2_ceds_R_S_Y) that are pre-aggregated to region and distributed with the release.
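A quick way to see whether those files are in place is to loop over the expected names and report the ones that are absent. This is a minimal sketch under stated assumptions: only the BC file name appears in this thread; the other nine names are guessed by applying the same pattern to the usual CEDS species list, and `missing_ceds_files` is a hypothetical helper, not a gcamdata function.

```shell
#!/bin/sh
# Hypothetical helper: print the per-species CEDS files absent from a folder.
# Only BC_total_CEDS_emissions.csv is confirmed in the thread; the other
# species names are an assumption and may not match what gcamdata expects.
missing_ceds_files() {
  dir=$1
  for sp in BC CH4 CO CO2 N2O NH3 NMVOC NOx OC SO2; do
    f="${sp}_total_CEDS_emissions.csv"
    [ -f "$dir/$f" ] || echo "$f"
  done
}

# Demo on a throwaway folder holding only two of the ten assumed files.
demo=$(mktemp -d)
touch "$demo/BC_total_CEDS_emissions.csv" "$demo/CO2_total_CEDS_emissions.csv"
n_missing=$(missing_ceds_files "$demo" | wc -l | tr -d ' ')
echo "$n_missing of 10 files missing"
rm -rf "$demo"
```

In a real workspace the function would be pointed at the extdata/emissions/CEDS folder instead of the demo directory.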