RMI-PACTA / r2dii.analysis

Tools to Calculate Climate Targets for Financial Portfolios
https://rmi-pacta.github.io/r2dii.analysis

Potential Bug: Handling NAs in the ABCD #423

Closed jacobvjk closed 5 months ago

jacobvjk commented 1 year ago

This is an example of one company, present in both the loan book and the ABCD, that has no capacity in the power sector at the beginning of the analysis timeframe but enters the sector later (2024). The current implementation removes NA rows from the ABCD, which effectively moves the start year of the analysis into the future and changes how scenario targets are calculated. My intuition favors replacing the NAs with 0s: this keeps the start year stable (at the year where the scenario has tmsr == 1 or smsp == 0), and it avoids requesting buildouts from companies that were not active in a sector at the beginning of the analysis.
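
The start-year shift can be seen in isolation, without the r2dii packages. This is a minimal sketch on a hypothetical toy table (`toy_abcd` and the `start_year_*` names are invented for illustration). It does not reproduce the internals of `target_market_share()`; it only compares the earliest year that survives when NA rows are dropped with the earliest year when NAs are replaced by 0.

```r
library(dplyr)

# Hypothetical toy data mirroring the case described above:
# no capacity until 2024, then 200 MW.
toy_abcd <- tibble::tibble(
  year = 2020:2026,
  production = c(NA, NA, NA, NA, 200, 200, 200)
)

# Dropping NA rows: the earliest remaining year moves to the first
# year with a non-missing value.
start_year_dropped <- toy_abcd %>%
  filter(!is.na(production)) %>%
  summarise(start_year = min(year)) %>%
  pull(start_year)

# Replacing NA with 0: every year is kept, so the earliest year is unchanged.
start_year_zero <- toy_abcd %>%
  mutate(production = coalesce(production, 0)) %>%
  summarise(start_year = min(year)) %>%
  pull(start_year)

print(c(dropped = start_year_dropped, zero = start_year_zero))
```

Under these assumptions the two approaches disagree about the first year of data, which is the difference that feeds into the target calculation.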

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(r2dii.data)
library(r2dii.match)
library(r2dii.analysis)

loanbook <- loanbook_demo %>% 
  filter(id_loan == "L1") %>% 
  mutate(name_direct_loantaker = "test_company",
         name_ultimate_parent = "test_company")

abcd <- tibble::tribble(
  ~company_id,  ~name_company,           ~lei,  ~is_ultimate_owner, ~sector,     ~technology,   ~plant_location, ~year, ~production,    ~production_unit,   ~emission_factor,   ~emission_factor_unit,  ~ald_timestamp,
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2020L,    NA_real_,              "MW",         NA_real_,         NA_character_,       "2021 Q4",
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2021L,    NA_real_,              "MW",         NA_real_,         NA_character_,       "2021 Q4",
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2022L,    NA_real_,              "MW",         NA_real_,         NA_character_,       "2021 Q4",
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2023L,    NA_real_,              "MW",         NA_real_,         NA_character_,       "2021 Q4",
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2024L,         200,              "MW",         NA_real_,         NA_character_,       "2021 Q4",
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2025L,         200,              "MW",         NA_real_,         NA_character_,       "2021 Q4",
  1L, "test_company", NA_character_,                TRUE, "power", "renewablescap",            "PL", 2026L,         200,              "MW",         NA_real_,         NA_character_,       "2021 Q4"
)

matched <- match_name(loanbook, abcd) %>%
  prioritize()

scenario_test <- scenario_demo_2020 %>% 
  # to simulate start year 2021
  mutate(year = year + 1) %>% 
  filter(
    sector == "power",
    year %in% c(2021:2026),
    scenario == "sds",
    region == "global"
  )

# using abcd with NAs in the beginning of the analysis
abcd_na <- abcd
# the analysis automatically moves the start year up until the first year with a
# positive value. This implies that even a company that is not active in a given
# sector at all at the start of the analysis will get a positive target and is
# "punished" for being misaligned. This seems counterintuitive.
matched %>%
  target_market_share(
    abcd = abcd_na,
    scenario = scenario_test,
    region_isos = r2dii.data::region_isos_demo,
    weight_production = FALSE,
    by_company = TRUE
  ) %>% 
  filter(
    technology == "renewablescap",
    metric %in% c("projected", "target_sds")
  )
#> Warning: Removing rows in abcd where `production` is NA
#> # A tibble: 6 × 11
#>   sector technology     year region scena…¹ name_…² metric produ…³ techn…⁴ scope
#>   <chr>  <chr>         <dbl> <chr>  <chr>   <chr>   <chr>    <dbl>   <dbl> <chr>
#> 1 power  renewablescap  2024 global demo_2… test_c… proje…    200    1     sect…
#> 2 power  renewablescap  2024 global demo_2… test_c… targe…    212.   0.996 sect…
#> 3 power  renewablescap  2025 global demo_2… test_c… proje…    200    1     sect…
#> 4 power  renewablescap  2025 global demo_2… test_c… targe…    216.   0.995 sect…
#> 5 power  renewablescap  2026 global demo_2… test_c… proje…    200    1     sect…
#> 6 power  renewablescap  2026 global demo_2… test_c… targe…    220.   0.994 sect…
#> # … with 1 more variable: percentage_of_initial_production_by_scope <dbl>, and
#> #   abbreviated variable names ¹​scenario_source, ²​name_abcd, ³​production,
#> #   ⁴​technology_share

# using abcd with NAs replaced with 0
abcd["production"][is.na(abcd["production"])] <- 0
# the analysis recognizes that the start year is indeed 2021. Since the
# company is not active in the power sector at all in the start year, the smsp
# does not require it to build out capacity. If the company later enters the
# sector by building out green technologies, these will be aligned, as they are
# "unexpected"; high-carbon technologies will remain misaligned for the same
# reason
abcd_zero <- abcd
matched %>%
  target_market_share(
    abcd = abcd_zero,
    scenario = scenario_test,
    region_isos = r2dii.data::region_isos_demo,
    weight_production = FALSE,
    by_company = TRUE
  ) %>% 
  filter(
    technology == "renewablescap",
    metric %in% c("projected", "target_sds")
  )
#> # A tibble: 12 × 11
#>    sector technology    year region scena…¹ name_…² metric produ…³ techn…⁴ scope
#>    <chr>  <chr>        <dbl> <chr>  <chr>   <chr>   <chr>    <dbl>   <dbl> <chr>
#>  1 power  renewablesc…  2021 global demo_2… test_c… proje…       0      NA sect…
#>  2 power  renewablesc…  2021 global demo_2… test_c… targe…       0      NA sect…
#>  3 power  renewablesc…  2022 global demo_2… test_c… proje…       0      NA sect…
#>  4 power  renewablesc…  2022 global demo_2… test_c… targe…       0      NA sect…
#>  5 power  renewablesc…  2023 global demo_2… test_c… proje…       0      NA sect…
#>  6 power  renewablesc…  2023 global demo_2… test_c… targe…       0      NA sect…
#>  7 power  renewablesc…  2024 global demo_2… test_c… proje…     200       1 sect…
#>  8 power  renewablesc…  2024 global demo_2… test_c… targe…       0      NA sect…
#>  9 power  renewablesc…  2025 global demo_2… test_c… proje…     200       1 sect…
#> 10 power  renewablesc…  2025 global demo_2… test_c… targe…       0      NA sect…
#> 11 power  renewablesc…  2026 global demo_2… test_c… proje…     200       1 sect…
#> 12 power  renewablesc…  2026 global demo_2… test_c… targe…       0      NA sect…
#> # … with 1 more variable: percentage_of_initial_production_by_scope <dbl>, and
#> #   abbreviated variable names ¹​scenario_source, ²​name_abcd, ³​production,
#> #   ⁴​technology_share
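
Until the handling of NAs changes in the package, the replacement step could be wrapped in a small pre-processing function. The sketch below is an assumption, not part of r2dii.analysis: `fill_missing_production()` and `abcd_example` are hypothetical names, and the function treats every missing `production` value as "no activity", which may be wrong where NA actually means "unknown".

```r
library(dplyr)

# Hypothetical helper (not part of r2dii.analysis): treat missing production
# as zero so that no year is dropped from the ABCD.
fill_missing_production <- function(abcd) {
  abcd %>%
    mutate(production = coalesce(production, 0))
}

# Small illustrative input with the same pattern as the reprex above.
abcd_example <- tibble::tibble(
  name_company = "test_company",
  year = 2022:2025,
  production = c(NA, NA, 200, 200)
)

abcd_filled <- fill_missing_production(abcd_example)
print(abcd_filled)
```

The filled table could then be passed as the `abcd` argument of `target_market_share()`, in the same way `abcd_zero` is used above.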

jdhoffa commented 1 year ago

@jacobvjk sorry it took me ages to look at this. I totally agree, this is a bug. Thanks for flagging.