Integrate scenario target table interpolation into the RECC model

stefanpauliuk commented 1 year ago

@CarrerF suggested generating the interpolations of the scenario target table on the fly by turning @TomerFishman's interpolation script into an ODYM function. This would reduce the parameter count by about 30 and make the database more compact, and it would save much of the time now spent copying the interpolated data into the RECC database.

@stefanpauliuk added that, in addition to this automated target table interpolation, the interpolation of other parameters should also be automated and executed automatically by setting appropriate control flags in the config file and calling a general interpolation function, rather than by hand-coded interpolations in the main model script.

CarrerF commented 1 year ago

stefanpauliuk commented 1 year ago

Thanks @CarrerF, that will be a major improvement over the current status!

For the coding strings: please add the interpolation method, which can be forwarded directly as a string to the numpy interpolation routine, e.g. interpolate_c_1900_2000_linear or interpolate_c_1900_2000_spline.
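A minimal sketch of how such a coding string could be split into its parts, assuming the pattern interpolate_<aspect>_<start>_<end>_<method> shown above. The names `InterpolationSpec` and `parse_interpolation_string` are hypothetical and are not part of ODYM.

```python
from typing import NamedTuple


class InterpolationSpec(NamedTuple):
    """Parts of a coding string (hypothetical container, not an ODYM class)."""
    aspect: str  # index letter of the aspect to interpolate along, e.g. 'c' or 't'
    start: int   # first year of the interpolation window
    end: int     # last year of the interpolation window
    kind: str    # interpolation method, e.g. 'linear' or 'spline'


def parse_interpolation_string(method: str) -> InterpolationSpec:
    """Split 'interpolate_<aspect>_<start>_<end>_<kind>' into its parts."""
    parts = method.split('_')
    if len(parts) != 5 or parts[0] != 'interpolate':
        raise ValueError(f"Not an interpolation coding string: {method!r}")
    _, aspect, start, end, kind = parts
    return InterpolationSpec(aspect, int(start), int(end), kind)


spec = parse_interpolation_string('interpolate_c_1900_2000_linear')
```

Here `spec.kind` holds the method name that would be passed on to whichever interpolation routine is used.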

Regarding your suggestion that the scenario target files should be assembled in one place to allow for comparison, overview, and consistent development: I think there is potential for another ODYM function that works both ways: (1) Scan the current and target values of all scenario-dependent parameters for a given sector (pav/reb/nrb/....) in the individual parameter tables and assemble them on a larger Excel sheet/canvas, where modifications can be made. (2) Transfer the updated scenario target values back from the canvas to the individual parameter files. This canvas differs from the current scenario target table in that it is much more compact and in that the compact target tables for each sector are assembled on one sheet each, whereas all data sources and ancillary calculations are documented in the individual parameter files.

CarrerF commented 1 year ago

I have not fully understood the translation of the target tables into a canvas.

After our last conversation, I had the impression that this was the optimal set-up for target-based parameters:

The target-tables Excel file is discarded and each dataset is stored directly in the database, with only current and target values. If needed, the interpolation method also allows for intermediate values.

Would you consider something different?

chauenstein commented 1 year ago

This looks very nice @CarrerF Thanks! Is it possible to choose the values for 't' flexibly? E.g., give a target value for the year 2040, which remains constant thereafter? Additionally, it would be great to have the possibility of using intermediate steps (e.g., if I want to provide annual values for the start of the period, in order to calibrate the model to the historic development (e.g. 2016-2020)). Is that already possible?

CarrerF commented 1 year ago

Hi @chauenstein ,

Yes to both! I only realized that those features were useful while I was coding the methods, so they are not in the outline above. However, I have now uploaded better documentation in RECC_v2_5_v2_6 > documentation > Transparent data processing in RECC.pptx.

In your case, time interpolation will happen within the boundaries given in the method, i.e. interpolate_t_2020_2040. You could then use the method called 'copy', which replicates a value. So, would be a reasonable input. In the config, you would then set the methods: ['_interpolate_t_2020_2040_spline', 'copy_t_2040to[2041:2060]'] (or linear as an alternative to spline).

If needed, you could also add an intermediate interpolation step, e.g. for 2030. The notation will still be 'interpolate_t_2020_2040_spline', and the code will automatically look for interpolation knots.
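The combination of interpolation and copying described above can be sketched as follows. This is an illustration under assumptions, not the ODYM code: the helper names `interpolate_linear` and `copy_value` are hypothetical, missing years are marked with NaN, only the linear case is shown (via `numpy.interp`), and the numbers are made up.

```python
import numpy as np


def interpolate_linear(years, values, start, end):
    """Fill start..end linearly, using every non-NaN value in that window as a knot."""
    out = np.array(values, dtype=float)  # work on a copy
    window = (years >= start) & (years <= end)
    knots = window & ~np.isnan(out)
    out[window] = np.interp(years[window], years[knots], out[knots])
    return out


def copy_value(years, values, source, first, last):
    """Replicate the value of year `source` into the years first..last."""
    out = np.array(values, dtype=float)
    out[(years >= first) & (years <= last)] = out[years == source][0]
    return out


years = np.arange(2020, 2061)
series = np.full(years.shape, np.nan)
series[years == 2020] = 10.0  # current value (made-up number)
series[years == 2030] = 14.0  # optional intermediate knot
series[years == 2040] = 16.0  # target value

# Rough equivalent of ['interpolate_t_2020_2040_linear', 'copy_t_2040to[2041:2060]']
series = interpolate_linear(years, series, 2020, 2040)
series = copy_value(years, series, 2040, 2041, 2060)
```

Because the knots are detected inside the window, adding or removing the 2030 value changes the result without changing the method string.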

The methods will be available once Stefan merges the requests on odym and odym_recc.

stefanpauliuk commented 1 year ago

Thanks a lot @CarrerF this is very advanced and will be extremely helpful! Also great that you can split into interpolation and copying/replication. Eager to try it out and check after my vacation. For @chauenstein , it comes in very handy, too, as he is about to launch a larger scenario formulation activity for the EU country building sectors.

Thanks also for documenting it in the pptx! A few questions here:

"If the element has an empty space in the classification, so it will when called" --> What do you mean by that? Please rephrase!

"Some of the interpolation_target_tables had benchmark values for 2030,2040,2050,…. Since they were not documented, I only kept the final target. Hence time-series are slightly different. Eventually, benchmark values could be added as interpolation knots in the new templates too." These values are mostly used for fine-tuning the dynamic stock model (esp. for 2_S) to avoid jumps in the flows. @CarrerF could you please transfer these values for the 1_F and 2_S parameters (main drivers for passenger vehicle, residential buildings, and non-residential buildings)?

In general, regarding the table format: for the scenario target tables we have the parameter type 'Table'. The table has X row aspects (X contains at least t, but can also be tr or trG, as in the example that you copied above). No limits here! The interpolation is done for the aspect specified in the string (here: t). The typical table also has one column aspect (S), but in the general case there can be any number of aspects, right?

It's a really cool feature that makes things much clearer; good to get rid of all these hand-wired interpolations...

CarrerF commented 1 year ago

"If the element has an empty space in the classification, so it will when called" This is the case for elements like "concrete aggregates", which contain a space. It does not affect the aspects that we are replicating right now, because the scenario names LED, SSP1 and SSP2 are single words, but I made the code flexible for future use. If, for example, one would like to replicate the values of cement for concrete aggregates in a certain parameter, the notation would be replicate_m_concrete aggregates_with_cement, keeping the space.

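A minimal sketch of how a replicate string could be parsed so that a space inside an element name survives. The function name `parse_replicate_string` is hypothetical, and the pattern replicate_<aspect>_<target>_with_<source> is inferred from the examples in this thread.

```python
def parse_replicate_string(method):
    """Split 'replicate_<aspect>_<target>_with_<source>' into (aspect, target, source).

    Splitting on '_with_' first and limiting the second split to two cuts
    keeps element names that contain spaces (or further underscores) intact.
    """
    prefix, source = method.split('_with_', 1)
    keyword, aspect, target = prefix.split('_', 2)
    if keyword != 'replicate':
        raise ValueError(f"Not a replicate coding string: {method!r}")
    return aspect, target, source


parsed = parse_replicate_string('replicate_m_concrete aggregates_with_cement')
```

In this reading, the values of the source element (cement) are written to the target element (concrete aggregates).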
"could you please transfer these values for the 1_F and 2_S" ok!

"The typical table also has one column aspect (S), but for the general case, there can be any number of aspects, right?" Yes, the number of aspects in the parameter is arbitrary, as is the position of t/c in the Index Structure. The data format (list or table) does not affect the methods. Once ReadParameterXLSX has created the multi-dimensional array of Values, the methods replicate/interpolate/copy act on it and look up the index position in the parameter's Index Structure. This should cover any general case. For example, the interpolation method is currently used on many parameters with structures "GrtS" or variations ("stGr", "VrtS", "BrtS", "NrtS", "srtS", "prtS", "urtS"), "cmpr" (also "cmBr", "cmNr") or simply "c" (for 6_MIP_GWP_Bio). But one could, for example, also use the interpolation method for 3_SHA_EnergyCarrierSplit_Buildings, whose structure is "VRrnt". An example of a table with two row aspects and two column aspects is 2_P_RECC_Population_SSP_32R_V2.2. If for a country you had only historic data and a future projection, then 'interpolate_t_2023_2060_linear' would work. It could also be combined with other methods: ['interpolate_t_2023_2060_linear', 'replicate_S_LED_with_SSP2'].

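The idea that the methods look up the aspect's position in the Index Structure and then act on the multi-dimensional array can be sketched as follows. This is an illustration under assumptions, not the ODYM implementation: the helper names are hypothetical, missing values are marked with NaN, only linear interpolation is shown, and the structure "rtS" and all numbers are made up.

```python
import numpy as np


def interpolate_aspect(values, index_structure, aspect, labels, start, end):
    """Linearly interpolate along the axis that `aspect` occupies in the index structure."""
    axis = index_structure.index(aspect)  # e.g. 't' in 'rtS' -> axis 1
    labels = np.asarray(labels, dtype=float)
    window = (labels >= start) & (labels <= end)

    def fill(series):
        # Each 1-D slice needs at least one non-NaN knot inside the window.
        out = series.copy()
        knots = window & ~np.isnan(series)
        out[window] = np.interp(labels[window], labels[knots], series[knots])
        return out

    return np.apply_along_axis(fill, axis, values)


def replicate_aspect(values, index_structure, aspect, labels, target, source):
    """Overwrite the `target` element of an aspect with the values of `source`."""
    axis = index_structure.index(aspect)
    tgt = [slice(None)] * values.ndim
    src = [slice(None)] * values.ndim
    tgt[axis] = labels.index(target)
    src[axis] = labels.index(source)
    out = values.copy()
    out[tuple(tgt)] = values[tuple(src)]
    return out


years = list(range(2023, 2061))
scenarios = ['LED', 'SSP1', 'SSP2']
values = np.full((2, len(years), len(scenarios)), np.nan)  # structure 'rtS'
values[:, 0, :] = 100.0                # value in 2023 (made-up number)
values[:, -1, :] = [63.0, 100.0, 26.0]  # targets in 2060 per scenario

# Rough equivalent of ['interpolate_t_2023_2060_linear', 'replicate_S_LED_with_SSP2']
filled = interpolate_aspect(values, 'rtS', 't', years, 2023, 2060)
combined = replicate_aspect(filled, 'rtS', 'S', scenarios, 'LED', 'SSP2')
```

Since the axis is derived from the structure string, the same two helpers would work unchanged for a structure such as "GrtS" or "VRrnt".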
stefanpauliuk commented 10 months ago

Interpolation controlled by config file is now routinely working, great advancement!

IndEcol / RECC-ODYM

Integrate scenario target table interpolation into the RECC model #53