pik-piam / remind2

The remind2 package contains the REMIND-specific routines for data and model output manipulation.
0 stars 40 forks source link

Fix issue when subsector variables not available #474

Open gabriel-abrahao opened 12 months ago

gabriel-abrahao commented 12 months ago

This PR is motivated by the implementation of the AR6 climate assessment, and basically just improves reportEmis handling of GDXs without subsector information. This used to be case for compatibility with the fixed_shares of the industry module, that has been stagnant for a while.

But it's also useful because it makes the function work with GDXs written after the loop but before the postsolve, which is necessary for evaluating climate outcomes between iterations before they are needed for the damage step of postsolve. Of course, the subsector variables are not written.

In the case of operating with pre-postsolve GDXs, this can also affect some summations, notably Emi|CO2|Energy|+|Demand. The function wasn't designed for this in mind anyway, but the totals relevant for the climate assessment seem to work. And since there's no CCS in fixed_shares, this shouldn't affect summations in that case.

gabriel-abrahao commented 11 months ago

Yes, I have calls that fail on this file here:

/p/projects/piam/abrahao/scratch/remind_ca/output/SSP2EU-NPi-ar6_2023-09-18_16.30.58/fulldata_prepostsolve.gdx

It happens because it's the first iteration, so postsolve wasn't run yet. In this particular case, we could skip running this step (the climate assessment) in the first iteration, but I thought it would be better to have reportEmi more robust instead, as maybe we have an use case that doesn't conform later. Do you believe skipping it would be a better option?

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 11 months ago

Yes, I have calls that fail on this file here:

/p/projects/piam/abrahao/scratch/remind_ca/output/SSP2EU-NPi-ar6_2023-09-18_16.30.58/fulldata_prepostsolve.gdx

I did not know only variables and sets get carried over from the input.gdx, not parameters. Now I know.

So, there are a couple of options before us.

  1. Expand reportEmi() as suggested here. While that would get your module to run, I do not think it would add to the robustness of the function. At 2500 lines it is more than complex enough already, and it will not be clear to the next person why this subsector information might be not available, and why these safeguards are in place. I am pretty confident in fact that the next extension of the industry emissions reporting would break the new ar6 realisation, unless it gets its own test gdx added to the package tests, further inflating build times.
  2. Run the climate assessment only from the second iteration on. You mentioned that as a possibility, and it is what we do for the aerosols module. If the assessment does not use any data from postsolve steps after the climate module, then there would not even be any lagged data.
  3. Split remindEmi() into separated functions.
    • There is very little interaction between the calculations for subsector and total emissions, so it is possible.
    • We would be able to run specific parts of the emissions reporting when needed, and ignore the ones which have data missing.
    • 2500 lines. That is 12 % of the entire package.

Option three is the best from a pure code perspective. Option two is the best if we are realistic about our workload.