RMI-PACTA / pacta.scenario.data.preparation

The goal of {pacta.scenario.data.preparation} is to prepare and format all scenario input datasets required to run the {pacta.portfolio.allocate} tool.
https://rmi-pacta.github.io/pacta.scenario.data.preparation/

bug: align on `aviation` values of `emission_factor_unit` #23

Closed jdhoffa closed 5 months ago

jdhoffa commented 5 months ago
@cjyetman can you explain specifically what this issue is?

e.g.: For the sector aviation we don't know what emissions intensity unit to use, options: gCO2/pkm or ...?

I will try to get an answer :-)

(Also in general, I would say since the waldo regression test passes, this isn't a blocking point for THIS PR, but rather we should open a new issue for it)

Originally posted by @jdhoffa in https://github.com/RMI-PACTA/pacta.scenario.data.preparation/issues/22#issuecomment-2024910508

jdhoffa commented 5 months ago

It seems like the 2023-06-18 version of 2022Q4 PAMS data uses: tCO2/pkm

jdhoffa commented 5 months ago

This also seems to be what pacta.data.validation expects: https://github.com/RMI-PACTA/pacta.data.validation/blob/6b2ce8eabc5e42eff03743918e115a8423d25c68/R/assert_valid_units.R#L7

cjyetman commented 5 months ago

For reference, @jacobvjk and @Antoine-Lalechere helped confirm while creating GECO 2023 that the GECO 2022 preparation was wrong and that it should be tCO2/pkm, not gCO2/pkm: https://github.com/RMI-PACTA/pacta.scenario.preparation/pull/133#issuecomment-1967261799

jdhoffa commented 5 months ago

We also need to assess whether the actual value and the unit describe the same thing. E.g., does the unit say gCO2/pkm while the actual value is in tCO2/pkm (so only the label is wrong), or do we also need to convert the value itself?
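To make the second case concrete, here is a minimal base-R sketch of converting both the value and the label together, so the two cannot drift apart. The helper name `convert_g_to_t_per_pkm`, the column names `value` and `units`, and the toy data are assumptions for illustration only; this is not code from any PACTA package.

```r
# Hypothetical helper: convert rows labelled gCO2/pkm to tCO2/pkm.
# Assumes a data frame with a numeric `value` column and a character `units` column.
convert_g_to_t_per_pkm <- function(data) {
  # %in% is used instead of == so that NA units are treated as "not gCO2/pkm"
  is_g <- data$units %in% "gCO2/pkm"
  data$value[is_g] <- data$value[is_g] / 1e6  # 1 tCO2 = 1e6 gCO2
  data$units[is_g] <- "tCO2/pkm"
  data
}

# Toy data (made up): one row genuinely in gCO2/pkm, one already in tCO2/pkm
toy <- data.frame(
  value = c(120, 0.00012),
  units = c("gCO2/pkm", "tCO2/pkm"),
  stringsAsFactors = FALSE
)

converted <- convert_g_to_t_per_pkm(toy)
converted
```

If only the label is wrong (the value is already in tCO2/pkm), dividing by 1e6 would be incorrect and only the `units` column should be changed.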

cjyetman commented 5 months ago

For reference, this is how we ended up converting the value in GECO 2023 https://github.com/RMI-PACTA/pacta.scenario.data.preparation/blob/1d55a73a1bff4fb3bac9feffda36c59e74d30990/R/prepare_geco_2023_scenario.R#L181-L187

jdhoffa commented 5 months ago

Documenting that in pacta.scenario.preparation there already seem to be unit issues. For the GECO scenarios for aviation, I get the following:

# 2021
pacta.scenario.preparation::geco_2021 |> 
  dplyr::filter(sector == "Aviation") |> 
  dplyr::pull(value) |> 
  range()
#> [1] 6.093293e-05 1.623860e-04

pacta.scenario.preparation::geco_2021 |> 
  dplyr::filter(sector == "Aviation") |> 
  dplyr::distinct(units)
#> # A tibble: 1 × 1
#>   units    
#>   <chr>    
#> 1 gCO2/ pkm

# 2022
pacta.scenario.preparation::geco_2022 |> 
  dplyr::filter(sector == "Aviation") |> 
  dplyr::pull(value) |> 
  range()
#> [1] 1.03305e-05 1.45711e-04

pacta.scenario.preparation::geco_2022 |> 
  dplyr::filter(sector == "Aviation") |> 
  dplyr::distinct(units)
#> # A tibble: 1 × 1
#>   units   
#>   <chr>   
#> 1 gCO2/pkm

# 2023
pacta.scenario.preparation::geco_2023 |> 
  dplyr::filter(sector == "Aviation") |> 
  dplyr::pull(value) |> 
  range()
#> [1] 2.206271e-05 1.253751e-04

pacta.scenario.preparation::geco_2023 |> 
  dplyr::filter(sector == "Aviation") |> 
  dplyr::distinct(units)
#> # A tibble: 1 × 1
#>   units   
#>   <chr>   
#> 1 tCO2/pkm

Created on 2024-04-12 with reprex v2.1.0
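Note that the 2021 label contains a space (`gCO2/ pkm`), so it would not match `gCO2/pkm` in an exact string comparison. A minimal base-R sketch of stripping whitespace from unit labels before comparing them; the vector below simply repeats the three labels printed above:

```r
# Unit labels as printed in the reprex output above
units_seen <- c("gCO2/ pkm", "gCO2/pkm", "tCO2/pkm")

# Remove all whitespace so that "gCO2/ pkm" and "gCO2/pkm" compare equal
units_clean <- gsub("[[:space:]]+", "", units_seen)

unique(units_clean)
#> [1] "gCO2/pkm" "tCO2/pkm"
```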

jdhoffa commented 5 months ago

I would expect the dataset with units of tCO2/pkm to be 6 orders of magnitude smaller than similar data in gCO2/pkm, since 1 t = 1e6 g.
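Since 1 tCO2 = 1e6 gCO2, a dataset whose values were really in gCO2/pkm should stand out by its order of magnitude. A small base-R sketch comparing the ranges printed in the reprex above; the data frame is just a transcription of those outputs, and its column names are made up for illustration:

```r
# Ranges and unit labels transcribed from the reprex output above
ranges <- data.frame(
  dataset = c("geco_2021", "geco_2022", "geco_2023"),
  units   = c("gCO2/ pkm", "gCO2/pkm", "tCO2/pkm"),
  min     = c(6.093293e-05, 1.03305e-05, 2.206271e-05),
  max     = c(1.623860e-04, 1.45711e-04, 1.253751e-04),
  stringsAsFactors = FALSE
)

# Order of magnitude (power of ten) of each bound
ranges$min_magnitude <- floor(log10(ranges$min))
ranges$max_magnitude <- floor(log10(ranges$max))

ranges[, c("dataset", "units", "min_magnitude", "max_magnitude")]
```

All three datasets span the same magnitudes (1e-05 to 1e-04) despite carrying different unit labels, which suggests the values are on the same scale and it is the 2021 and 2022 labels that differ.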

jdhoffa commented 5 months ago

In PAMS, it appears as (pseudo reprex):

pams |> 
  dplyr::filter(ald_sector == "Aviation") |> 
  dplyr::filter(ald_emissions_factor_unit == "tCO2/pkm") |> 
  dplyr::pull(ald_emissions_factor) |> 
  range()

range: (5.58836e-05, 5.628335e-04) in tCO2/pkm

So, based on that, I would guess that for aviation we expect/want units of tCO2/pkm, with values on the order of 1e-05 to 1e-04.

This seems to agree with: https://github.com/RMI-PACTA/pacta.scenario.preparation/pull/133#issuecomment-1967261799

jdhoffa commented 5 months ago

Based on offline discussions with @Antoine-Lalechere and @cjyetman, the decision points are:

cjyetman commented 5 months ago

pacta.data.validation checks for maximum values; the maximum is currently 0.0002 for Aviation in tCO2/pkm:

https://github.com/RMI-PACTA/pacta.data.validation/blob/a40e0538eab0d4c77c6cbe2390377ffb6a18e7c5/R/assert_valid_value_range_for_sector_unit_scenario_prep.R#L10

"Aviation",         "tCO2/pkm",     0.0002,

The lower bound is currently 0 for all values: https://github.com/RMI-PACTA/pacta.data.validation/blob/a40e0538eab0d4c77c6cbe2390377ffb6a18e7c5/R/assert_valid_value_range_for_sector_unit_scenario_prep.R#L29

checkmate::assert_numeric(values[sectors_units_idx], lower = 0, upper = max_value_i, any.missing = any.missing, add = add, .var.name = .var.name)
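Taken together, the check amounts to requiring every Aviation value in tCO2/pkm to lie in [0, 0.0002]. A simplified base-R sketch of that logic follows. It is not the pacta.data.validation implementation; the function name `check_aviation_range` is hypothetical, and the input values are the GECO range endpoints reported earlier in this thread.

```r
# Hypothetical stand-in for the range check: TRUE where a value is within bounds.
# Defaults mirror the bounds quoted above (lower = 0, upper = 0.0002 for Aviation, tCO2/pkm).
check_aviation_range <- function(values, upper = 0.0002, lower = 0) {
  stopifnot(is.numeric(values))
  !is.na(values) & values >= lower & values <= upper
}

# Min and max of the GECO 2021, 2022 and 2023 aviation values from the reprex in this thread
geco_extremes <- c(
  6.093293e-05, 1.623860e-04,
  1.03305e-05,  1.45711e-04,
  2.206271e-05, 1.253751e-04
)

all(check_aviation_range(geco_extremes))
#> [1] TRUE

# A made-up value on a gCO2/pkm scale would fail the same check
check_aviation_range(120)
#> [1] FALSE
```

The upper bound only catches values that are too large, so a value that is genuinely in gCO2/pkm but labelled tCO2/pkm would fail, while data that is too small by a constant factor would still pass.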