pik-piam / remind2

The remind2 package contains the REMIND-specific routines for data and model output manipulation.
0 stars 40 forks source link

avoid restoring zeros in reporting #557

Closed fbenke-pik closed 5 months ago

fbenke-pik commented 6 months ago

This PR avoids using the memory-intense option restoreZeros in reportEmi and reportFE to read in data, mostly by introducing new options when reading in data from gdxes.

I tested with various gdxes from previous runs that reporting does not break and summation gaps did not widen through this.

mellamoSimon commented 5 months ago

great, thank you for this, Falk! which remind runs did you use to run your tests? we introduced some changes within remind that affect those variables and it would be great to have a comparison that includes those changes in the resulting gdxs and also be able to compare ceteris paribus, basically to make sure that the differences you see in the summation checks you mentioned here are not coming from this changes. Or maybe you checked already and I didn't get it? thank you!

mellamoSimon commented 5 months ago

we introduced some changes within remind that affect those variables and it would be great to have a comparison that includes those changes

this is the latest one https://github.com/remindmodel/remind/pull/1599

also tagging @0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q and @fschreyer so they can stay on top of things :)

fbenke-pik commented 5 months ago

I tested with these runs

  "SDP_MC-NPi-AMT_2024-03-01_22.13.42",
  "SSP2EU_PBS-NDC-AMT_2024-03-02_04.26.14",
  "SSP2EU_PBS-NPi-AMT_2024-03-01_22.14.22",
  "SSP2EU_PBS-PkBudg650-AMT_2024-03-02_04.26.25",
  "SSP2EU-EU21-PkBudg1050-AMT_2024-03-02_17.16.02",
  "SSP2EU-EU21-PkBudg500-AMT_2024-03-02_17.15.46",
  "SSP2EU-EU21-PkBudg650-AMT_2024-03-02_17.15.54",
  "SSP2EU-NDC-AMT_2024-03-02_06.05.42",
  "SSP2EU-NPi-AMT_2024-03-01_22.11.59",
  "SSP2EU-PkBudg1050-AMT_2024-03-02_06.06.08",
  "SSP2EU-PkBudg500-AMT_2024-03-02_06.05.52",
  "SSP2EU-PkBudg650-AMT_2024-03-02_06.06.01",
  "SSP5-NPi-AMT_2024-03-01_22.13.21",
  "SSP5-PkBudg650-AMT_2024-03-02_06.06.32"

I see changes in summations, but all are improvements and they are most likely due to some of your fixes which were also introduced since the original reporting was run.

Let me know if I should test with other runs as well.

mellamoSimon commented 5 months ago

maybe Michaja @0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q has managed to get some recent runs?

fbenke-pik commented 5 months ago

Honestly, I am at a loss as to why this works. Simón said he required restore_zeros = TRUE during development [↑] and there is no material change in the code. This

mostly by introducing new options when reading in data from gdxes.

actually only affects pm_IndstCO2Captured (and does not work there, so it needs to be changed).

I guess readGDX(restore_zeros = FALSE) works for the variables because they have non-zero marginals across ttot, but if that was not the case earlier (see above) it might change again in the future. ¯\_(ツ)_/¯

I introduced the spatial argument whenever sth. crashed after setting restore_zeros = F (usually, when the only value left for emission market in the read in data was "ETS", which makes magclass think, it must be a regional column).

If we cannot reproduce the things that made this crash in the past, I'd say we go ahead and merge. The moment at happens again, I will work on fixing it. It is just pretty difficult to fix sth. I cannot reproduce.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 5 months ago

I introduced the spatial argument whenever sth. crashed when testing with some gdxes (usually, when the only value for emission market in the read in data was "ETS", which makes magclass think, it must be a regional column), so should be more than one problem.

Indeed, my eyes glazed over when looking at the diff.