pik-piam / remind2

The remind2 package contains the REMIND-specific routines for data and model output manipulation.
0 stars 40 forks source link

Drop Backwards Compatibility #524

Open 0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q opened 7 months ago

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 7 months ago

tl;dr

Backwards compatibility was introduced into the remind2 package (and before that in the remind package) when we did not have a way to properly manage R package versions on the cluster, and everybody and every project used the common library of R packages which was (is?) updated whenever a new version of a RSE package is merged on Github.

But now projects can easily manage their R package environments and keep it stable with whatever package versions they need and that works for them. Updates to the remind2 package are usually tied to updates in the REMIND code. Either projects track the REMIND develop branch to benefit from the REMIND updates — then they will want (need) to get new remind2 versions, too. Or they do not track the REMIND developments, in which case there is no point for them to update the remind2 package, either.

Currently, we are actively tracking compatibility with five projects

all of which closely tracked REMIND development and did not diverge far from it. So there is no benefit for them from updated remind2 package versions, either.

All the backwards compatibility leads to a Byzantine code structure that nobody can follow anymore, resulting in increasing complexity and errors, as well as excessive test times (around 30 minutes currently).

I suggest to ditch the backwards compatibility, and focus on a correctly working post-processing for current REMIND code instead. Define a set of module realisations and switches that have to be supported, generate up-to-date gdxes for them as part of the AMTs, and test for the existence and plausibility of variables and values in the output.

orichters commented 7 months ago

I agree that we should reduce the number of gdx files, but I am convinced that the check that remind2 generates the variables the project mappings expect is very useful. They only take seconds, once the mif data is there.

My suggestion would be to go from 6 to 2 gdx files:

That should cover everything that these tests are intended for, while reducing runtime by ~2/3.

fbenke-pik commented 7 months ago

I am not sure I fully understand what exactly you want to get rid of, Michaja. But in general, I agree that there is potential.

Just a few quick thoughts that come to my mind:

fbenke-pik commented 7 months ago

tagging @Renato-Rodrigues, as he was one of the driving forces behind introducing the project-specific reporting validation

mellamoSimon commented 7 months ago

thank you for this effort! one more thing that I think could really improve the readability and transparency of the emissions and energy reporting (at least) is to move all variable calculations to remind and let the reporting functions in remind2 actually just report the variables. This is not an alternative to the proposal here but a good complement, in my opinion.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 7 months ago
  • I am just not sure if two gdxes are sufficient to cover project-specific switches, which seem to be a thing in REMIND (but dont know enough about it to judge).

That would fall under what I termed "define a set of module realisations that have to be supported" and we should test for that, if indeed it is the case. I updated the original text to reflect that.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 7 months ago

I personally am not a fan of the conditional reporting as it is currently done for Ariadne in reportEmi. I think it was supposed to be a temporary hotfix, but it has been in the code forever and is causing problems. Can we think of a better way to achieve the same? Or drop it altogether?

Probably we should go and find out what those differences are. In general, I am fine with supporting things that are in the current REMIND code. I just do not see any point in keeping post-processing code around for REMIND code that has long since been removed, especially when people can use renv to run the old reporting with ease.

orichters commented 7 months ago

NAVIGATE and SHAPE are over, right? So why can't we drop the compatibility tests for the projects?

You have to differentiate. The projects are over, so you need not run tests based on a gdx from these projects anymore.

But: The mapping templates produced in these projects (saved in piamInterfaces) are still in use. For example the newly started ELEVATE project uses the NAVIGATE template, which serves also as a basis for the new community template (in preparation for AR7). AR6 will be used for NGFS September 2024 release.

That is why we should continue running these tests on the templates. ELEVATE + NGFS configs are close enough to our AMT H12 runs, so it is fine to just use these gdx files.

bs538 commented 7 months ago

For SHAPE: indeed the project is officially done, but the papers are only in the making, so while they are in review and until the final scenario DB is published (guessing in around 6 months), there is still the chance that we need to run either additional scenarios, or that we need to fix reporting issues in the current scenarios. Guessing it will be similar for other projects. So fully agree on the principal goal to de-clutter, but I think we still need to ensure a mechanism for being able to regenerate corrected reportings for fairly recent REMIND runs (in the case of SHAPE, May 2023) at least for some time after the runs were made. Not sure how this would work in the current R library management setup if backwards compatibility was ditched, but happy to learn about it if there is a good solution

Renato-Rodrigues commented 7 months ago

I would be in favor of a reduction of the gdxs tested, but I am not 100% with the option of removing them entirely.

This feature was NOT created originally to keep backwards compatibility.

The intention was to make sure, whenever somebody commits a reporting change and only test their changes to their personal module and regional configuration, that they don't break the report for other combinations.

Before, it was very common that somebody only worked with for example H12 default REMIND modules and committed changes to the reporting library that break the reporting for EU21 regions, or for other commonly used module realisations, e.g. NDC, NPi, regipol, ...

My suggestion is to, therefore keep at least two gdxs active, both reflecting AMT policy scenarios, one using the H12 regions configuration and one using the EU21.

fschreyer commented 7 months ago

My two cents on that.

1.) I am generally in favor of de-cluttering the if-clauses in the reporting and not keeping backwards compatibility forever. However, I do think it can be useful to use such if-clauses in the development / merge phase precisely because significant REMIND developments often require reporting library changes and it provides more safety and stability after the merge and avoids people having to try to find the right combinations of REMIND and reporting library version that still work together.

2.)

I personally am not a fan of the conditional reporting as it is currently done for Ariadne in reportEmi. I think it was supposed to be a temporary hotfix, but it has been in the code forever and is causing problems. Can we think of a better way to achieve the same? Or drop it altogether?

Well, I introduced that in Ariadne and I would say that is not causing trouble per se but that the trouble we had with it during the recent merges is manily because of the general complexity of the emissions reporting code. I am not in favor of doing such calculations in some repos outside the reporting library as it might always be that others want to use your developments as well. A possible alternative is to do project-specific variables / developments in separate functions like reportSDP. Regarding the complexity of the emissions reporting, which I feel responsible for, I side with Simon that it needs a refactoring at some point. To clean it up a bit even before, I am intending to remove some of the backwards compatibility if-clauses in the next months once the current REMIND version is stable and tested a bit longer.

orichters commented 7 months ago

For NGFS, ELEVATE and SHAPE, @bs538 and I came to the conclusion that is sufficient to check the variable list in piamInterfaces against the gdx file used for testing convGDX2MIF, so we don't need our own. I implemented that in https://github.com/pik-piam/remind2/pull/534. I also adjusted the code slightly such that the most recent AMT gdx will be tested as well. Overall, we have one less gdx to be converted as of now. Whoever feels responsible for NAVIGATE might go the same way.

orichters commented 6 months ago

@fbenke-pik: Do you see any way of automatically and regularly uploading the newest AMT gdx file(s) to the rse server, or finding another way people can easily access it?

fbenke-pik commented 6 months ago

@fbenke-pik: Do you see any way of automatically and regularly uploading the newest AMT gdx file(s) to the rse server, or finding another way people can easily access it?

Should be possible. But we have limited space on the RSE server. Not sure what you mean by "easily access it" and for what purpose.

orichters commented 6 months ago

Should be possible. But we have limited space on the RSE server. Not sure what you mean by "easily access it" and for what purpose.

You could overwrite the old one, so limited space shouldn't be a problem. If this works, no need to search for alternatives, I guess. The aim would be to allow (well, rather force) everyone to test the reporting with the latest AMT gdx using the existing remind2 infrastructure.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 6 months ago

I will recap the discussion (as I see it) so far, in order to identify issues that need clarification and discussion. Most commentators discussed the number of gdxes we test against – which is only a secondary point to the issue I was trying to raise. My main argument was to drop the pretence (we never test this, we have no idea if it is true) to support old REMIND versions ad infinitum.

  1. Reduce the number of gdxes to test against:

    • There is general agreement, as long as ongoing projects are still supported.
    • There is agreement that both twelve- and 21-region gdxes need to be tested, preferably of the latest AMT runs.
    • Specific tests exist for Ariadne, ECEMF and NAVIGATE.
    • [ ] Do they have specific reporting code? Do we know where it is?
    • For other projects (NGFS, ELEVATE, SHAPE) it suffices to check only the existence of required variables. []
  2. Reduce code complexity by dropping backwards compatibility:

    • The conditional Emissions reporting for Ariadne will be reduced/cleaned by Felix at some point not too distant in the future. []
    • Those who commented on backwards compatibility specifically (@bs538, @nicobauer chimed in during the REMIND meeting, but did not deign to comment on the issue) voiced apprehension of ongoing projects being cut off from bug fixes in the reporting

      I think we still need to ensure a mechanism for being able to regenerate corrected reportings for fairly recent REMIND runs (in the case of SHAPE, May 2023) at least for some time after the runs were made. Not sure how this would work in the current R library management setup if backwards compatibility was ditched, but happy to learn about it if there is a good solution

I suggest to separate the issue of gdx tests, which seems to progress on its own (maybe because remind2 build times are their own pressure point).

On backwards compatibility: Do people prefer a meeting to discuss this, or should we keep collecting data and opinions first?

I certainly see some questions that could use some hard answers:

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 6 months ago

So fully agree on the principal goal to de-clutter, but I think we still need to ensure a mechanism for being able to regenerate corrected reportings for fairly recent REMIND runs (in the case of SHAPE, May 2023) at least for some time after the runs were made. Not sure how this would work in the current R library management setup if backwards compatibility was ditched, but happy to learn about it if there is a good solution

There is no good way around this. Projects should use fixed version of R packages that match the REMIND code they are using. Unless you update REMIND code, do not update the R packages.
If there is a bug fix in the R package code, the proper way to get it to the old package version would be by cherry-picking that specific bug fix to a new package version with a patch level. So, for example you are using remind2 v1.111.1 (30 May 2023) and you get some bug fix upstream today (v1.135.3), you would cherry-pick that fix onto the old code, creating v1.111.1-1. The pain of cherry-picking depends on the quality of the code base (abysmal in our case, see below), and the quality of the git commits (e.g. put distinct changes into distinct commit, do not mix it up with reformatting stuff, increasing version numbers and so on). I think that cleaner code structure would also allow for better bug fixes that can be ported to older code branches.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 6 months ago

To motivate this issue a bit further, here are the cyclomatic complexity [] and code lengths of the worst remind2 functions. Every single time somebody wants to add something to reportEmi(), they are stepping into a mine field. And a whole lot of that code is of the format

if (!is.null(the_variable_used_for_three_REMIND_versions)) {
    # 100 lines of current code
} else {
    # 50 lines of unused legacy code
}
function cyclomatic complexity lines of code
reportEmi 108 1719
reportPrices 89 938
reportFE 54 1960
reportTechnology 47 332
reportLCOE 45 950
compareScenConf 41 135
plotNashConvergence 40 416
plotCDR 33 311
reportCosts 27 685
reportEmiAirPol 27 661
reportMOFEX 27 248
reportMacroEconomy 25 292
reportTax 25 242
toolRegionSubsets 25 52
reportSDPVariables 24 139
runEmployment 23 207
reportEmployment 22 216
reportSE 21 483
orichters commented 6 months ago

Specific tests exist for Ariadne, ECEMF and NAVIGATE.

I briefly discussed with Jess and she is fine that the NAVIGATE-specific test can be removed in the middle of the year once the last paper is submitted.

While I would like our code structure and commit discipline as clean as the ideal you describe, I guess that is unreasonable to assume. Therefore, I would strongly opt for keeping and testing backward compatibility at least to a gdx based on the last REMIND release that is still used by any project. If we don't do that, I fear we would need to maintain a new remind2 branch with relevant bug fixes for the release version that is used by many projects.

Maybe we can find a nice way of moving backward compatibility code into own subfunctions, such that the code becomes more like

if (! is.null(the_variable_used_for_three_REMIND_versions)) {
    out <- mbind(out, reportCDR(whateverdata))
} else {
    out <- mbind(out, reportCDR_before_2023_02_12(whateverdata))
}

And reportCDR_before_2023_02_12 could then have defined in the description a moment when it can be kicked out. Overall, I think moving things into smaller functions would be a big boon for the package.

bs538 commented 6 months ago

Thanks for adding structure to the discussion Michaja, @0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q , and sorry for being slow to respond (paper submission + project proposal rush...).

In an ideal world (clean code-base), Michaja's suggestion of applying only specific cherry-picked fixes to correct reporting problems (and otherwise stick with matching REMIND + R library version) seems like the way to go. In practice, I'm also worried that this would be considerable pain.

So from my side, + 1 for the suggestion by @orichters to peg it to REMIND releases. The last major release (v3.2.0 currently) should always be supported in my view, and additionally we can keep track of which project uses which release for the decision when to ditch backwards-compatibility to that version. For SHAPE I used v3.2.0.

(I'll be off to parental leave from next week, will re-connect with the discussion in May if it's still open then.)

orichters commented 3 months ago

With https://github.com/pik-piam/remind2/pull/600, we are now down to 2 gdx files that are tested, covering "regional resolution (H12 and EU21), REMIND version (release and AMT), and climate policy (NPi and PkBudg)."

I have the impression that there is broad agreement to support the latest release, but not longer into the past, so I think we could start cleaning up some stuff in the report functions.