Closed vintented closed 4 years ago
Thanks a lot Vincent!! Looks good.
see also: Documentation/methodology_document_chapter_sda_approach.Rmd
Looks like it works. There are only a few minor things that require further checks or adjustments:
1.) @vintented : The target year should be 2050.
2.) @vintented : Do you know why p_market is not always 1 in the start year? It could be a rounding thing, but I am not sure.
3.) @vintented : Can you attach the market and port? I want to check some further things, e.g. the difference between portweight and ownership, as I am surprised that they differ in these sectors.
4.) We might want to run other scenarios. For this, the scenario needs to become part of the by statement in the port_to_market calculation. @vintented can you adjust for it?
5.) The function should allow for sector selection, e.g. we want to add aviation and potentially shipping at some point, and we also might want to compare the power results with our metric. @vintented can you add this as an input sector_list and then use that as a filter in the function?
As a last step, the output should use "B2DS_path_port" to overwrite "Scen.Sec.EmissionsFactor_port" and remove all columns that were not present in the Port at the beginning.
Basically, it is a way to overwrite the trajectory target with the SDA target.
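For reference, the calculation in the `sda_calculation()` code posted later in this thread can be written out as below. The symbols $EF^{scen}_{market}(t)$ and $EF^{SDA}_{port}(t)$ are notation introduced here for illustration; $CI_{port}$, $CI_{market}$, $SI$, $D_{port}$ and $P_{market}$ follow the variable names in the code, with $t_0$ the start year and $T$ the target year.

$$
D_{port} = CI_{port} - SI, \qquad
P_{market}(t) = \frac{EF^{scen}_{market}(t) - SI}{CI_{market} - SI}, \qquad
EF^{SDA}_{port}(t) = D_{port} \cdot P_{market}(t) + SI
$$

Here $SI = EF^{scen}_{market}(T)$. Note that $P_{market}(t_0) = 1$ only if $EF^{scen}_{market}(t_0) = CI_{market}$. In the posted code, $CI_{market}$ is read from Plan.Sec.EmissionsFactor while the numerator uses Scen.Sec.EmissionsFactor, so this may be one reason why p_market is not always 1 in the start year.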
Thanks @vintented !
Just for future reference I see:
- `calculator()` is a pure function -- well done!
- `calculator()`? Something that conveys what the function does more specifically?
- `startender()` seems like a convenient helper that might be extracted out of `calculator()` so you can reuse it elsewhere.
- `startender()` uses tidyeval. This reminds me we should discuss this approach across the organization (https://github.com/2DegreesInvesting/r2dii/issues/5). @2diiKlaus
@2diiKlaus & @maurolepore
Hi, adding the latest version of the function that reflects your inputs. Let me know if I can make any additional adjustments.
```r
library(dplyr)

sda_calculation <- function(market, port,
                            ref_sectors = c("Cement", "Steel"),
                            ref_scenario = "SDS",
                            start_year = 2019,
                            target_year = 2050) {
  # Helper: one emission-factor value (CI) per portfolio, sector and allocation.
  startender <- function(input, var = Plan.Sec.EmissionsFactor, sectors = ref_sectors,
                         year = start_year, scenario = ref_scenario) {
    var <- enquo(var)
    output <- input %>%
      filter(
        !is.nan(!!var) & Year == year & Scenario %in% scenario &
          Sector %in% sectors & ScenarioGeography == "Global"
      ) %>%
      rename(CI = !!var) %>%
      distinct(Investor.Name, Portfolio.Name, CI, Scenario, Sector, Allocation)
    return(output)
  }

  # Start-year intensity of the portfolio and of the market.
  CI_port <- startender(input = port)
  CI_market <- startender(input = market) %>%
    filter(Portfolio.Name == "GlobalMarket")

  # Target-year scenario intensity of the market.
  SI <- startender(input = market, year = target_year, var = Scen.Sec.EmissionsFactor) %>%
    filter(Portfolio.Name == "GlobalMarket") %>%
    rename(SI = CI)

  # Distance between the portfolio's start intensity and the market target.
  Distance <- CI_port %>%
    inner_join(CI_market, by = c("Sector", "Scenario", "Allocation"),
               suffix = c("_port", "_market")) %>%
    inner_join(SI, by = c("Sector", "Scenario", "Allocation",
                          "Portfolio.Name_market" = "Portfolio.Name",
                          "Investor.Name_market" = "Investor.Name")) %>%
    mutate(D_port = CI_port - SI)

  # Helper: yearly scenario emission factors for the chosen scenario and sectors.
  view <- function(input = port, scenario = ref_scenario, sectors = ref_sectors) {
    output <- input %>%
      # Fixed: filter on the `sectors` argument, which was previously unused.
      filter(Scenario == scenario & Sector %in% sectors) %>%
      distinct(Investor.Name, Portfolio.Name, Allocation, Sector, Year,
               Scen.Sec.EmissionsFactor)
    return(output)
  }

  market <- view(input = market, scenario = ref_scenario, sectors = ref_sectors)
  port <- view(input = port)

  port_to_market <- port %>%
    inner_join(market, by = c("Sector", "Year", "Allocation"),
               suffix = c("_port", "_market")) %>%
    inner_join(Distance, by = c("Sector", "Investor.Name_port", "Portfolio.Name_port",
                                "Investor.Name_market", "Portfolio.Name_market",
                                "Allocation")) %>%
    mutate(P_market = (Scen.Sec.EmissionsFactor_market - SI) / (CI_market - SI)) %>%
    mutate(Scen.Sec.EmissionsFactor_port = (D_port * 1 * P_market) + SI)

  return(port_to_market)
}

debt_output <- sda_calculation(market = debt_market, port = debt_port,
                               ref_sectors = c("Cement", "Steel"), ref_scenario = "B2DS",
                               start_year = 2019, target_year = 2040)
```
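To make the arithmetic easier to check by hand, here is a minimal base R sketch of the same pathway calculation for a single sector. All numbers are made up for illustration, and the lowercase variable names are hypothetical stand-ins for the columns used in the function above.

```r
# Hypothetical numbers, for illustration only.
years <- c(2019, 2030, 2040)
scen_market <- c(0.60, 0.45, 0.30)  # market scenario emission factor per year
ci_market <- 0.60                   # market intensity in the start year
ci_port <- 0.80                     # portfolio intensity in the start year
si <- scen_market[years == 2040]    # market scenario intensity in the target year

d_port <- ci_port - si                             # D_port
p_market <- (scen_market - si) / (ci_market - si)  # P_market, one value per year
target_port <- d_port * p_market + si              # SDA target path for the portfolio

data.frame(Year = years, P_market = p_market, target_port = target_port)
```

With these inputs the target path starts at the portfolio's own start-year intensity and ends at the market's target-year intensity, which is the behaviour the function is meant to reproduce.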
Well done @vintented.
Can you please add a reproducible example (reprex: https://reprex.tidyverse.org/)? I need not only to see the code but also to be able to run it. Eventually we'll use such an example in the `#' @examples` section of your function's help file, and in your function's unit tests (I'll explain and show you at the time).
You may need to create toy data. Ideally, the data you provide for the examples should be public and as small as possible to illustrate the features of your function. For example, if I wanted to demonstrate `sum()`:
```r
# good -- just enough
sum(1 + 1)
#> [1] 2
# bad -- too complicated
sum(912954 + 19982)
#> [1] 932936
```
Created on 2019-12-02 by the reprex package (v0.3.0)
After that I can help you submit a pull request from your fork. That way I can add commits to your pull request to show rather than tell you what changes I suggest.
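As a rough idea of what such toy data could look like for this function, here is a small base R sketch. The column names follow the code posted in this thread. Every value is invented, including the investor and portfolio names and the "PortfolioWeight" allocation label.

```r
# Toy portfolio: one sector, two years. All values are invented.
toy_port <- data.frame(
  Investor.Name = "Investor A",
  Portfolio.Name = "Portfolio A",
  Allocation = "PortfolioWeight",
  Sector = "Cement",
  Scenario = "B2DS",
  ScenarioGeography = "Global",
  Year = c(2019, 2040),
  Plan.Sec.EmissionsFactor = c(0.8, NaN),
  Scen.Sec.EmissionsFactor = c(0.8, 0.5),
  stringsAsFactors = FALSE
)

# Toy market: same layout, with a single market portfolio.
toy_market <- transform(
  toy_port,
  Investor.Name = "Market",
  Portfolio.Name = "GlobalMarket",
  Plan.Sec.EmissionsFactor = c(0.6, NaN),
  Scen.Sec.EmissionsFactor = c(0.6, 0.3)
)

str(toy_port)
```

Two rows per table are enough to cover a start year and a target year. More sectors or years can be added once the basic case runs.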
@maurolepore Here is a sample dataset with only dummy essentials. Let me know if you are looking for something else. I look forward to learning more about the next steps of turning this into a proper packaged function.
Thanks @vintented. Looks good to me.
There are 2 things where I need to decide what we do:
1) Currently the ScenarioGeography filter is hard-coded to "Global", which might be fine. But there might be cases where people want to look at other geographies. In that case we would need to join on "ScenarioGeography" as well and remove the filter in the startender function.
2) Currently the market is also hard-coded to "GlobalMarket" inside the function. I was thinking of testing the "uniqueness" of the market portfolio in terms of Portfolio.Name rather than filtering for GlobalMarket, which would keep it flexible and less error-prone (in case someone changes the Portfolio.Name in the market for whatever reason).
I need to make a call here, but I want to get other people's views on this, as I do not want to add flexibility that we do not need.
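For the second point, a uniqueness check could look roughly like the sketch below. The helper name `check_single_market()` is hypothetical and not part of the posted function. It uses only base R and assumes the market data has a Portfolio.Name column.

```r
# Hypothetical helper: stop unless the market data holds exactly one portfolio.
check_single_market <- function(market_data) {
  portfolios <- unique(market_data$Portfolio.Name)
  if (length(portfolios) != 1L) {
    stop(
      "Expected exactly one market portfolio, found ", length(portfolios), ": ",
      paste(portfolios, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(market_data)
}

ok <- data.frame(Portfolio.Name = c("GlobalMarket", "GlobalMarket"))
bad <- data.frame(Portfolio.Name = c("GlobalMarket", "OtherMarket"))

check_single_market(ok)  # passes silently
result <- tryCatch(check_single_market(bad), error = function(e) conditionMessage(e))
```

This keeps the function independent of the exact market name while still failing loudly if more than one market portfolio slips into the input.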
There are 2 things you can do directly @vintented :
1) Update the sample datasets for Mauro: The market and portfolio files do not have the scenario geography as a column at all, so I suspect that the function would fail to run with your sample datasets. One question @maurolepore: We probably use the market data also for other functions. Vincent already filtered it down to only the relevant columns for this function. Would it be better to have 1 dataset with all columns that can then be used for any function using the market or do you prefer to have a separate sample dataset for each function?
2) Map the output to the input! The output should be exactly like the portfolio input and just overwrite the "Scen.Sec.EmissionsFactor" with the newly calculated "Scen.Sec.EmissionsFactor_port". The rest should be the same: same columns, etc.
Feel free to reach out to me on Slack!
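One way to check the "map the output to the input" requirement is sketched below in base R. The helper `has_same_shape()` is hypothetical, and the data frames are toy stand-ins for the real portfolio input and function output. It compares the set of columns and the row count but not the column order, so a stricter version could use `identical(names(input), names(output))` instead.

```r
# Hypothetical check: the output keeps the input's columns and number of rows.
has_same_shape <- function(input, output) {
  setequal(names(input), names(output)) && nrow(input) == nrow(output)
}

port_in <- data.frame(
  Sector = c("Cement", "Steel"),
  Year = 2019,
  Scen.Sec.EmissionsFactor = c(0.8, 1.9)
)

# Same columns, only the emission factor overwritten: passes.
port_out <- port_in
port_out$Scen.Sec.EmissionsFactor <- c(0.7, 1.8)

# A leftover helper column such as P_market: fails.
port_extra <- cbind(port_out, P_market = 1)

has_same_shape(port_in, port_out)
has_same_shape(port_in, port_extra)
```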
RE (@2diiKlaus) "the market data":
> Would it be better to have 1 dataset with all columns that can then be used for any function using the market or do you prefer to have a separate sample dataset for each function?

I think having a single `market` dataset would be best (easier to maintain and easier to use).
But I think it's okay for @vintented to now provide exclusively the columns that this function needs, and we can add more columns whenever we need them.
If the `market` dataset is useful beyond the scope of r2dii.analysis I think we should eventually move it to r2dii.dataraw where we describe generic datasets and store similar `*_demo` datasets for examples and tests.
Relates to PR https://github.com/2DegreesInvesting/r2dii.analysis/pull/8.
@maurolepore @2diiKlaus
I have updated the function below to reflect Klaus's comments and added a new version of the sample data. Now the output is the same as the input data frame with only the Scen.Sec.EmissionsFactor updated.
```r
library(dplyr)

sda_calculation <- function(market_data, port_data,
                            ref_sector = c("Cement", "Steel"),
                            ref_scenario = "B2DS",
                            ref_geography = "Global",
                            start_year = 2019,
                            target_year = 2040) {
  # Helper: one emission-factor value (CI) per portfolio, sector and allocation.
  startender <- function(input_data, var = Plan.Sec.EmissionsFactor, year = start_year) {
    var <- enquo(var)
    output_data <- input_data %>%
      filter(
        !is.nan(!!var) & Year == year & Scenario %in% ref_scenario &
          Sector %in% ref_sector & ScenarioGeography %in% ref_geography
      ) %>%
      rename(CI = !!var) %>%
      distinct(Investor.Name, Portfolio.Name, CI, Scenario, ScenarioGeography,
               Sector, Allocation)
    return(output_data)
  }

  CI_port <- startender(input_data = port_data)
  CI_market <- startender(input_data = market_data)
  SI <- startender(input_data = market_data, year = target_year,
                   var = Scen.Sec.EmissionsFactor) %>%
    rename(SI = CI)

  # Distance between the portfolio's start intensity and the market target.
  Distance <- CI_market %>%
    inner_join(SI, by = c("Sector", "Scenario", "Allocation", "Portfolio.Name",
                          "Investor.Name", "ScenarioGeography")) %>%
    inner_join(CI_port, by = c("Sector", "Scenario", "Allocation", "ScenarioGeography"),
               suffix = c("_market", "_port")) %>%
    mutate(D_port = CI_port - SI)

  # Helper: yearly scenario emission factors for the chosen scenario, sector and geography.
  view <- function(input_data = port_data) {
    output_data <- input_data %>%
      filter(Scenario %in% ref_scenario & Sector %in% ref_sector &
               ScenarioGeography %in% ref_geography) %>%
      distinct(Investor.Name, Portfolio.Name, Allocation, Sector, Scenario,
               ScenarioGeography, Year, Scen.Sec.EmissionsFactor)
    return(output_data)
  }

  market_view <- view(input_data = market_data)
  port_view <- view(input_data = port_data)

  # Fixed: market_view is the left table, so it takes the "_market" suffix.
  port_to_market <- market_view %>%
    select(-c(Investor.Name, Portfolio.Name)) %>%
    inner_join(port_view,
               by = c("Sector", "Year", "Allocation", "ScenarioGeography", "Scenario"),
               suffix = c("_market", "_port"))

  port_to_distance <- port_to_market %>%
    inner_join(Distance, by = c("Scenario", "Sector",
                                "Investor.Name" = "Investor.Name_port",
                                "Portfolio.Name" = "Portfolio.Name_port",
                                "Allocation", "ScenarioGeography"))

  port_calculation <- port_to_distance %>%
    mutate(P_market = (Scen.Sec.EmissionsFactor_market - SI) / (CI_market - SI),
           Scen.Sec.EmissionsFactor = (D_port * 1 * P_market) + SI) %>%
    select(Investor.Name, Portfolio.Name, Allocation, Sector, Scenario,
           ScenarioGeography, Year, Scen.Sec.EmissionsFactor)

  # Fixed: the calculated values (left table) take the "_sda" suffix, so the
  # original column is overwritten only where an SDA value exists.
  port_data <- port_calculation %>%
    right_join(port_data,
               by = c("Investor.Name", "Portfolio.Name", "Allocation", "Scenario",
                      "Sector", "ScenarioGeography", "Year"),
               suffix = c("_sda", "")) %>%
    mutate(Scen.Sec.EmissionsFactor = if_else(!is.na(Scen.Sec.EmissionsFactor_sda),
                                              Scen.Sec.EmissionsFactor_sda,
                                              Scen.Sec.EmissionsFactor)) %>%
    select(-Scen.Sec.EmissionsFactor_sda)

  return(port_data)
}
```
The latest code looks good @vintented. Can you adjust #8 accordingly? To me it looks like the last changes were not implemented yet. Maybe I am missing something, in which case I might need help from @maurolepore. I checked these files: https://github.com/2DegreesInvesting/r2dii.analysis/pull/8/files
Once this is done I will test it with portfolios.
Thanks @maurolepore and @vintented
I guess once it is implemented in the PR, we can close this issue and just continue the discussion in one thread. Do you agree @maurolepore?
Yup, agree -- the discussion follows at #10
Is your feature request related to a problem? Please describe.
Integration of SDA calculation function into r2dii.analysis for @2diiKlaus to review.
Describe the solution you'd like
The function should ideally take the project results of portcheck and the analysis inputs to calculate the SDA approach for the cement & steel sectors. This should be done for each portfolio in the project. The column B2DS_path_port is the scenario consistent emissions factor necessary to achieve the 2040 target. I should add, the function can work with any scenario theory and that could be added as a function parameter.
Describe alternatives you've considered
Adding other sectors with emission factors.
Additional context
The plotted output should look something like this.
Can you provide something like this?
Yes, it just needs to be confirmed that calculation is correct and code is robust.
SDA_sample_output.xlsx