SDA Approach - Githubissues

vintented commented 4 years ago

Is your feature request related to a problem? Please describe. Integration of SDA calculation function into r2dii.analysis for @2diiKlaus to review.

Describe the solution you'd like The function should ideally take the project results of portcheck and the analysis inputs to calculate the SDA approach for the cement & steel sectors. This should be done for each portfolio in the project. The column B2DS_path_port is the scenario-consistent emissions factor necessary to achieve the 2040 target. I should add that the function can, in theory, work with any scenario, and the scenario could be added as a function parameter.

Describe alternatives you've considered Adding other sectors with emission factors.

Additional context The plotted output should look something like this.

Can you provide something like this? Yes, it just needs to be confirmed that calculation is correct and code is robust.


calculator <- function(market, port)  {

# this first function finds the emissions factor value for the starting year 
# or a year defined by the user for every port and investor. 
  startender <- function(input, var = Plan.Sec.EmissionsFactor, sectors = c("Cement", "Steel"), date = START.YEAR, scenario = "B2DS") {

    var <- enquo(var)

    output <- input %>% 
      filter(!is.nan(!!var) & Year == date & Scenario %in% scenario & Sector %in% sectors & ScenarioGeography == "Global") %>% 
      rename(CI = !!var) %>% 
      distinct(Investor.Name, Portfolio.Name, CI, Scenario, Sector, Allocation) 

    return(output)
  }

  CI_port <- startender(input = port) #port calculation for starting year 
  CI_market <- startender(input = market) %>% 
    filter(Portfolio.Name == "GlobalMarket") #market calculation for starting year
  SI <- startender(input = market, date = 2040, var = Scen.Sec.EmissionsFactor) %>% 
    filter(Portfolio.Name == "GlobalMarket") %>% 
    rename(SI = CI) #finding the scenario consistent target in 2040 for the market & port

  Distance <- CI_port %>% 
    inner_join(CI_market, by = c("Sector", "Scenario", "Allocation"), suffix = c("_port", "_market")) %>% 
    inner_join(SI, by = c("Sector", "Scenario", "Allocation", "Portfolio.Name_market" = "Portfolio.Name", "Investor.Name_market" = "Investor.Name")) %>% 
    mutate(D_port = CI_port - SI) #find the distance between the port in the start year and the market target

  view <- function(input = port, scenario = "B2DS", sectors = c("Cement", "Steel")) {
    output <- input %>% 
      filter(Scenario == scenario & Sector %in% sectors) %>% 
      distinct(Investor.Name, Portfolio.Name, Allocation, Sector, Year, Scen.Sec.EmissionsFactor)
  } #creating sector specific views for cement and steel 

  market <- view(input = market)
  port <- view(input = port)

  port_to_market <- port %>% 
    inner_join(market, by = c("Sector", "Year", "Allocation"), suffix = c("_port", "_market")) %>% 
    inner_join(Distance, by = c("Sector", "Investor.Name_port", "Portfolio.Name_port",  "Investor.Name_market", "Portfolio.Name_market", "Allocation")) %>% 
    mutate(P_market  = (Scen.Sec.EmissionsFactor_market - SI)/(CI_market - SI)) %>% #calculating the rate of change necessary to achieve the target. 
    mutate(B2DS_path_port = (D_port*1*P_market)+SI) #applying that rate change to find the scenario consistent emissions factor for each year. 

  return(port_to_market) #returning the results

}

port <- readRDS(paste0(PROJ.LOCATION, "40_Results/", Project.Name, "_Equity-PortInput-Port.rda"))
market <- readRDS(paste0(ANALYSIS.INPUTS.PATH, "Equity-Market-Port.rda"))

equity_output <- calculator(market, port)

SDA_sample_output.xlsx
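
For reference, the calculation in calculator() can be written out as below. This is a restatement of the code above, not a separate methodology: CI and SI follow the variable names in the code, while the symbols t_0 (start year), T (target year) and S_market(t) (the market's Scen.Sec.EmissionsFactor in year t) are notation introduced here for illustration.

```latex
\begin{aligned}
D_{\text{port}} &= CI_{\text{port}}(t_0) - SI(T) \\
P_{\text{market}}(t) &= \frac{S_{\text{market}}(t) - SI(T)}{CI_{\text{market}}(t_0) - SI(T)} \\
\text{B2DS\_path\_port}(t) &= D_{\text{port}} \cdot P_{\text{market}}(t) + SI(T)
\end{aligned}
```

At t = T the ratio P_market is 0, so the portfolio path ends at the market target SI(T). At t = t_0 the path equals CI_port only if S_market(t_0) equals CI_market(t_0).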

2diiKlaus commented 4 years ago

Thanks a lot Vincent!! Looks good.

see also: Documentation/methodology_document_chapter_sda_approach.Rmd

It looks like it works. There are only a few minor things that require further checks or adjustments:
1.) @vintented: The target year should be 2050.
2.) @vintented: Do you know why p_market is not always 1 in the start year? It could be a rounding thing, but I am not sure.
3.) @vintented: Can you attach the market and port? I want to check some further things, e.g. the difference between portweight and ownership, as I am surprised that they differ in these sectors.
4.) We might want to run other scenarios. For this, the scenario needs to become part of the by statement in the port_to_market calculation. @vintented, can you adjust for it?
5.) The function should allow for sector selection, e.g. we want to add aviation and potentially shipping at some point, and we might also want to compare the power results with our metric. @vintented, can you add this as an input sector_list and then use that as a filter in the function?
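
On point 2, one possible explanation besides rounding, offered as an assumption that has not been checked against the actual data: in the code above, CI_market comes from Plan.Sec.EmissionsFactor in the start year, while the numerator of P_market uses Scen.Sec.EmissionsFactor. If those two columns differ in the start year, P_market cannot be exactly 1. A minimal sketch with made-up numbers:

```r
# Hypothetical numbers, for illustration only.
ci_market_start   <- 0.70  # market Plan.Sec.EmissionsFactor, start year
scen_market_start <- 0.68  # market Scen.Sec.EmissionsFactor, start year
si_target         <- 0.30  # market Scen.Sec.EmissionsFactor, target year

# Same ratio as P_market in the function above.
p_market <- function(scen_ef, ci_start, si) (scen_ef - si) / (ci_start - si)

# Plan and scenario values differ in the start year: ratio is below 1.
p_mixed <- p_market(scen_market_start, ci_market_start, si_target)

# Plan and scenario values coincide in the start year: ratio is exactly 1.
p_same <- p_market(ci_market_start, ci_market_start, si_target)
```

Comparing the two columns for the market in the start year would show whether this applies here.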

2diiKlaus commented 4 years ago

And as a last step, the output should use "B2DS_path_port" to overwrite "Scen.Sec.EmissionsFactor_port" and remove all columns that were not present at the beginning in the Port.

Basically it is a way to overwrite the trajectory target by the SDA target
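
A minimal base R sketch of that last step, assuming toy data frames with made-up values and a reduced set of key columns (the real inputs have more columns, and the actual function may do this differently):

```r
# Toy portfolio input; values are made up.
port <- data.frame(
  Portfolio.Name = "P1",
  Sector = "Cement",
  Year = c(2019, 2020),
  Scen.Sec.EmissionsFactor = c(0.80, 0.78),
  stringsAsFactors = FALSE
)

# Hypothetical SDA result carrying the extra B2DS_path_port column.
sda <- data.frame(
  Portfolio.Name = "P1",
  Sector = "Cement",
  Year = c(2019, 2020),
  B2DS_path_port = c(0.80, 0.76),
  stringsAsFactors = FALSE
)

keys <- c("Portfolio.Name", "Sector", "Year")
out <- merge(port, sda, by = keys, all.x = TRUE)

# Overwrite where an SDA value exists, otherwise keep the original value.
has_sda <- !is.na(out$B2DS_path_port)
out$Scen.Sec.EmissionsFactor[has_sda] <- out$B2DS_path_port[has_sda]

# Drop every column that was not present in the original portfolio.
out <- out[, names(port)]
```

Selecting by names(port) at the end keeps both the set and the order of the input columns.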

maurolepore commented 4 years ago

Thanks @vintented !

Just for future reference I see:

calculator() is a pure function -- well done!
What might be a more specific name for calculator()? Something that conveys what the function does more specifically?
startender() seems like a convenient helper that might be extracted out of calculator() so you can reuse it elsewhere.
startender() uses tidyeval. This reminds me we should discuss this approach across the organization (https://github.com/2DegreesInvesting/r2dii/issues/5).

vintented commented 4 years ago

@2diiKlaus

I did not put 2050 for the time being because in the market data we do not have scenario data past 2040, but an easy enough thing to fix.
Yes, I would assume it is most likely a rounding error as well.
Here is the market path and the port [path](/Users/vincentjerosch-herold/Dropbox (2° Investing)/PortCheck_v2/10_Projects/FASECOLDA_META/40_Results/FASECOLDA_META_Equity-PortInput-Port.rda). You also find debt in the same folders.
Yes, I will add that flexibility to the function. Also including all of your additional comments.

vintented commented 4 years ago

@2diiKlaus & @maurolepore

Hi, adding the latest version of the function that reflects your inputs. Let me know if I can make any additional adjustments.

sda_calculation <- function(market, port, ref_sectors = c("Cement", "Steel"), ref_scenario = "SDS", start_year = 2019, target_year = 2050)  {

  startender <- function(input, var = Plan.Sec.EmissionsFactor, sectors = ref_sectors, year = start_year, scenario = ref_scenario) {

    var <- enquo(var)

    output <- input %>% 
      filter(!is.nan(!!var) & Year == year & Scenario %in% scenario & Sector %in% sectors & ScenarioGeography == "Global") %>% 
      rename(CI = !!var) %>% 
      distinct(Investor.Name, Portfolio.Name, CI, Scenario, Sector, Allocation) 

    return(output)
  }

  CI_port <- startender(input = port) 
  CI_market <- startender(input = market) %>% 
    filter(Portfolio.Name == "GlobalMarket")
  SI <- startender(input = market, year = target_year, var = Scen.Sec.EmissionsFactor) %>% 
    filter(Portfolio.Name == "GlobalMarket") %>% 
    rename(SI = CI)

  Distance <- CI_port %>% 
    inner_join(CI_market, by = c("Sector", "Scenario", "Allocation"), suffix = c("_port", "_market")) %>% 
    inner_join(SI, by = c("Sector", "Scenario", "Allocation", "Portfolio.Name_market" = "Portfolio.Name", "Investor.Name_market" = "Investor.Name")) %>% 
    mutate(D_port = CI_port - SI) 

  view <- function(input = port, scenario = ref_scenario, sectors = ref_sectors) {
    output <- input %>% 
      filter(Scenario == scenario & Sector %in% sectors) %>% 
      distinct(Investor.Name, Portfolio.Name, Allocation, Sector, Year, Scen.Sec.EmissionsFactor)
  }

  market <- view(input = market, scenario = ref_scenario, sectors = ref_sectors)
  port <- view(input = port)

  port_to_market <- port %>% 
    inner_join(market, by = c("Sector", "Year", "Allocation"), suffix = c("_port", "_market")) %>% 
    inner_join(Distance, by = c("Sector", "Investor.Name_port", "Portfolio.Name_port",  "Investor.Name_market", "Portfolio.Name_market", "Allocation")) %>% 
    mutate(P_market  = (Scen.Sec.EmissionsFactor_market - SI)/(CI_market - SI)) %>% 
    mutate(Scen.Sec.EmissionsFactor_port = (D_port*1*P_market)+SI) 

  return(port_to_market)

}

debt_output <- sda_calculation(market = debt_market, port = debt_port, 
                          ref_sectors = c("Cement", "Steel"), ref_scenario = "B2DS", 
                          start_year = 2019, target_year = 2040)

maurolepore commented 4 years ago

Well done @vintented.

Can you please add a reproducible example (reprex: https://reprex.tidyverse.org/)? I need to not only see the code but also be able to run it. Eventually we'll use such an example in the #' @examples section of your function's help file, and in your function's unit tests (I'll explain and show you at the time).

You may need to create toy data. Ideally, the data you provide for the examples should be public and as small as possible to illustrate the features of your function. For example, if I wanted to demonstrate sum():

# good -- just enough
sum(1 + 1)
#> [1] 2

# bad   -- too complicated
sum(912954 + 19982)
#> [1] 932936

Created on 2019-12-02 by the reprex package (v0.3.0)

After that I can help you submit a pull request from your fork. That way I can add commits to your pull request to show rather than tell you what changes I suggest.
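
For the SDA function, toy data in that spirit could be as small as the sketch below. All values and the column name sda_path_port are made up for illustration, and the calculation is reduced to a single sector and portfolio; it mirrors the P_market and D_port steps of the function above rather than calling it.

```r
# Toy market scenario pathway (hypothetical values).
market <- data.frame(
  Year = c(2019, 2030, 2040),
  Scen.Sec.EmissionsFactor = c(0.70, 0.50, 0.30)
)
ci_market <- 0.70  # market Plan.Sec.EmissionsFactor in the start year
ci_port   <- 0.90  # portfolio Plan.Sec.EmissionsFactor in the start year

# Market scenario value in the target year.
si <- market$Scen.Sec.EmissionsFactor[market$Year == 2040]

# Distance of the portfolio from the target, and the market's rate of change.
d_port   <- ci_port - si
p_market <- (market$Scen.Sec.EmissionsFactor - si) / (ci_market - si)

# Scenario-consistent emissions factor for the portfolio in each year.
market$sda_path_port <- d_port * p_market + si
```

With these numbers the path starts at the portfolio's own 0.90 in 2019 and reaches the market target of 0.30 in 2040.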

vintented commented 4 years ago

SampleSDA_data.xlsx

@maurolepore Here is a sample dataset with only dummy essentials. Let me know if you are looking for something else. I look forward to learning more about the next steps of turning this into a proper packaged function.

2diiKlaus commented 4 years ago

Thanks @vintented. Looks good to me.

There are 2 things where I need to decide what we do:
1) Currently the ScenarioGeography filter is hard-coded to "Global", which might be fine. But there might be cases where people want to look at other geographies. For that we would need to join using "ScenarioGeography" as well and remove the filter in the startender function.
2) Currently the market is also hard-coded to "GlobalMarket" inside the function. I was thinking of testing the "uniqueness" of the market portfolio in terms of Portfolio.Name rather than filtering for "GlobalMarket", which would keep it flexible and less error-prone (in case someone changes the Portfolio.Name in the market for whatever reason).

I need to make a call here, but want to get other people's views on this as I do not want to add flexibility that we do not need.
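
The uniqueness test in option 2 could be sketched roughly as follows. The helper name check_single_market() is hypothetical and not part of r2dii.analysis; it only illustrates the idea of failing early instead of filtering on a fixed name.

```r
# Hypothetical helper: stop unless the market data holds exactly one
# portfolio, whatever that portfolio happens to be called.
check_single_market <- function(market) {
  n <- length(unique(market$Portfolio.Name))
  if (n != 1L) {
    stop("Expected exactly one market portfolio, found ", n, ".", call. = FALSE)
  }
  invisible(market)
}

# Passes silently: a single portfolio name, repeated across rows.
ok <- data.frame(Portfolio.Name = c("GlobalMarket", "GlobalMarket"))
check_single_market(ok)
```

A market file with two different values in Portfolio.Name would raise an error here rather than be filtered silently.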

There are 2 things you can do directly @vintented:
1) Update the sample datasets for Mauro: the market and portfolio files do not have the scenario geography as a column at all, so I suspect that the function would fail to run with your sample datasets. One question for @maurolepore: we probably use the market data for other functions as well, and Vincent already filtered it down to only the relevant columns for this function. Would it be better to have 1 dataset with all columns that can then be used for any function using the market or do you prefer to have a separate sample dataset for each function?
2) Map the output to the input! The output should be exactly like the portfolio input and just overwrite "Scen.Sec.EmissionsFactor" with the newly calculated "Scen.Sec.EmissionsFactor_port". The rest should be the same: same columns, etc.

Feel free to reach out to me on Slack!

maurolepore commented 4 years ago

RE (@2diiKlaus) "the market data":

Would it be better to have 1 dataset with all columns that can then be used for any function using the market or do you prefer to have a separate sample dataset for each function?

I think having a single market dataset would be best (easier to maintain and easier to use). But I think it's okay for @vintented to now provide exclusively the columns that this function needs, and we can add more columns whenever we need them.

If the market dataset is useful beyond the scope of r2dii.analysis I think we should eventually move it to r2dii.dataraw where we describe generic datasets and store similar *_demo datasets for examples and tests.

Relates to PR https://github.com/2DegreesInvesting/r2dii.analysis/pull/8.

vintented commented 4 years ago

@maurolepore @2diiKlaus

I have updated the function below to reflect Klaus's comments and added a new version of the sample data. Now the output is the same as the input data frame with only the Scen.Sec.EmissionsFactor updated.

SampleSDA_data.xlsx

sda_calculation <- function(market_data, port_data, ref_sector = c("Cement", "Steel"), ref_scenario = "B2DS", ref_geography = "Global", start_year = 2019, target_year = 2040)  {

  startender <- function(input_data, var = Plan.Sec.EmissionsFactor, year = start_year) {

    var <- enquo(var)

    output_data <- input_data %>% 
      filter(!is.nan(!!var) & Year == year & Scenario %in% ref_scenario & Sector %in% ref_sector & ScenarioGeography %in% ref_geography) %>% 
      rename(CI = !!var) %>% 
      distinct(Investor.Name, Portfolio.Name, CI, Scenario, ScenarioGeography, Sector, Allocation) 

    return(output_data)
  }

  CI_port <- startender(input_data = port_data) 
  CI_market <- startender(input_data = market_data)
  SI <- startender(input_data = market_data, year = target_year, var = Scen.Sec.EmissionsFactor) %>% 
    rename(SI = CI)

  Distance <- CI_market %>% 
    inner_join(SI, by = c("Sector", "Scenario", "Allocation", "Portfolio.Name",  "Investor.Name", "ScenarioGeography")) %>% 
    inner_join(CI_port, by = c("Sector", "Scenario", "Allocation", "ScenarioGeography"), suffix = c("_market", "_port")) %>% 
    mutate(D_port = CI_port - SI) 

  view <- function(input_data = port_data) {

    output_data <- input_data %>% 
      filter(Scenario %in% ref_scenario & Sector %in% ref_sector & ScenarioGeography %in% ref_geography) %>% 
      distinct(Investor.Name, Portfolio.Name, Allocation, Sector, Scenario, ScenarioGeography, Year, Scen.Sec.EmissionsFactor)

    return(output_data)
  }

  market_view <- view(input_data = market_data)
  port_view <- view(input_data = port_data)

  port_to_market <- market_view %>% 
    select(-c(Investor.Name, Portfolio.Name)) %>% 
    inner_join(port_view, by = c("Sector", "Year", "Allocation", "ScenarioGeography", "Scenario"), suffix = c("_market", "_port")) # market_view is the left table, so it takes the first suffix

  port_to_distance <- port_to_market %>% 
    inner_join(Distance, by = c("Scenario", "Sector", "Investor.Name" = "Investor.Name_port", "Portfolio.Name" = "Portfolio.Name_port", "Allocation", "ScenarioGeography")) 

  port_calculation <- port_to_distance %>% 
    mutate(P_market  = (Scen.Sec.EmissionsFactor_market - SI)/(CI_market - SI),
           Scen.Sec.EmissionsFactor = (D_port*1*P_market)+SI)

  port_calculation <- port_calculation %>% 
    select(Investor.Name, Portfolio.Name, Allocation, Sector, Scenario, ScenarioGeography, Year, Scen.Sec.EmissionsFactor)

  # port_calculation is the left table, so its column is the one that gets the "_sda" suffix
  port_data <- port_calculation %>% 
    right_join(port_data, by = c("Investor.Name", "Portfolio.Name", "Allocation", "Scenario", "Sector", "ScenarioGeography", "Year"), suffix = c("_sda", ""))

  port_data <- port_data %>% 
    mutate(Scen.Sec.EmissionsFactor = if_else(!is.na(Scen.Sec.EmissionsFactor_sda), Scen.Sec.EmissionsFactor_sda, Scen.Sec.EmissionsFactor)) %>% 
    select(-Scen.Sec.EmissionsFactor_sda)

  return(port_data)

}

2diiKlaus commented 4 years ago

The latest code looks good, @vintented. Can you adjust #8 accordingly? To me it looks like the last changes were not implemented yet. Maybe I am missing something, in which case I might need help from @maurolepore. I checked those files: https://github.com/2DegreesInvesting/r2dii.analysis/pull/8/files

Once this is done I will test it with portfolios.

Thanks @maurolepore and @vintented

2diiKlaus commented 4 years ago

I guess once it is implemented in the PR, we can close this issue and just continue the discussion in one thread. Do you agree @maurolepore?

maurolepore commented 4 years ago

Yup, agree -- the discussion follows at #10

RMI-PACTA / r2dii.analysis

SDA Approach #6