Closed jdhoffa closed 5 months ago
It seems like the 2023-06-18 version of 2022Q4 PAMS data uses tCO2/pkm.
This also seems to be what pacta.data.validation expects:
https://github.com/RMI-PACTA/pacta.data.validation/blob/6b2ce8eabc5e42eff03743918e115a8423d25c68/R/assert_valid_units.R#L7
For reference, @jacobvjk and @Antoine-Lalechere helped confirm while creating GECO 2023 that the GECO 2022 preparation was wrong and that the unit should be tCO2/pkm, not gCO2/pkm.
...
https://github.com/RMI-PACTA/pacta.scenario.preparation/pull/133#issuecomment-1967261799
We also need to assess whether the actual value and unit are describing the same thing,
e.g. does the unit say gCO2/pkm while the actual value is in tCO2/pkm, or do we also need to convert the value itself?
For reference, this is how we ended up converting the value in GECO 2023 https://github.com/RMI-PACTA/pacta.scenario.data.preparation/blob/1d55a73a1bff4fb3bac9feffda36c59e74d30990/R/prepare_geco_2023_scenario.R#L181-L187
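To make the "relabel only" versus "convert the value too" distinction concrete, here is a minimal base R sketch. It is not the code in the linked file; the helper name convert_gco2_to_tco2 and the example data frame are hypothetical, and the only fact relied on is that 1 tCO2 = 1e6 gCO2.

```r
# Hypothetical helper: convert an emissions intensity from gCO2/pkm to tCO2/pkm.
# 1 tCO2 = 1e6 gCO2, so the numeric value shrinks by a factor of 1e6.
convert_gco2_to_tco2 <- function(value_g_per_pkm) {
  stopifnot(is.numeric(value_g_per_pkm))
  value_g_per_pkm / 1e6
}

# Illustrative data only (not taken from any GECO release).
scenario <- data.frame(
  sector = c("Aviation", "Aviation"),
  units = c("gCO2/pkm", "gCO2/pkm"),
  value = c(100, 150)
)

# Convert the value and the label together so they cannot drift apart.
is_g <- scenario$units == "gCO2/pkm"
scenario$value[is_g] <- convert_gco2_to_tco2(scenario$value[is_g])
scenario$units[is_g] <- "tCO2/pkm"
```

If the values are already in tCO2/pkm and only the label is wrong, the value line would be skipped and only the units column updated.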
Documenting that there already seem to be unit issues in pacta.data.preparation. For the GECO scenarios for aviation, I get the following:
# 2021
pacta.scenario.preparation::geco_2021 |>
  dplyr::filter(sector == "Aviation") |>
  dplyr::pull(value) |>
  range()
#> [1] 6.093293e-05 1.623860e-04
pacta.scenario.preparation::geco_2021 |>
  dplyr::filter(sector == "Aviation") |>
  dplyr::distinct(units)
#> # A tibble: 1 × 1
#>   units
#>   <chr>
#> 1 gCO2/ pkm
# 2022
pacta.scenario.preparation::geco_2022 |>
  dplyr::filter(sector == "Aviation") |>
  dplyr::pull(value) |>
  range()
#> [1] 1.03305e-05 1.45711e-04
pacta.scenario.preparation::geco_2022 |>
  dplyr::filter(sector == "Aviation") |>
  dplyr::distinct(units)
#> # A tibble: 1 × 1
#>   units
#>   <chr>
#> 1 gCO2/pkm
# 2023
pacta.scenario.preparation::geco_2023 |>
  dplyr::filter(sector == "Aviation") |>
  dplyr::pull(value) |>
  range()
#> [1] 2.206271e-05 1.253751e-04
pacta.scenario.preparation::geco_2023 |>
  dplyr::filter(sector == "Aviation") |>
  dplyr::distinct(units)
#> # A tibble: 1 × 1
#>   units
#>   <chr>
#> 1 tCO2/pkm
Created on 2024-04-12 with reprex v2.1.0
I would expect the dataset with units of tCO2/pkm to be 6 orders of magnitude smaller than similar data in gCO2/pkm (1 t = 1e6 g), yet all three ranges above sit at the same order of magnitude.
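A quick way to see the mismatch is to compare the order of magnitude of the reported ranges. The sketch below uses base R only and simply copies the three ranges from the reprex output; the object names are illustrative.

```r
# Value ranges copied from the reprex output in this thread.
ranges <- list(
  geco_2021 = c(6.093293e-05, 1.623860e-04), # labelled gCO2/ pkm
  geco_2022 = c(1.03305e-05, 1.45711e-04),   # labelled gCO2/pkm
  geco_2023 = c(2.206271e-05, 1.253751e-04)  # labelled tCO2/pkm
)

# Order of magnitude (floor of log10) of the upper end of each range.
magnitude <- vapply(ranges, function(r) floor(log10(max(r))), numeric(1))
magnitude
```

All three datasets come out at the same order of magnitude (1e-04 at the upper end), whereas a genuinely gram-based series would be expected to sit roughly a factor of 1e6 above a tonne-based one.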
In PAMS, it appears as (pseudo reprex):
pams |>
  dplyr::filter(ald_sector == "Aviation") |>
  dplyr::filter(ald_emissions_factor_unit == "tCO2/pkm") |>
  dplyr::pull(ald_emissions_factor) |>
  range()
range: (5.58836e-05, 5.628335e-04) in tCO2/pkm
So based on that, I would guess that for aviation we expect/want units of tCO2/pkm, with values on the order of magnitude of 1e-05 to 1e-04,
which seems to agree with: https://github.com/RMI-PACTA/pacta.scenario.preparation/pull/133#issuecomment-1967261799
Based on offline discussions with @Antoine-Lalechere and @cjyetman, the decision points are:
- tCO2/pkm without changing the actual values, since the units and values seemed to be misaligned
- pacta.data.validation currently checks for max values; the maximum is currently 0.0002 for Aviation in tCO2/pkm:
"Aviation", "tCO2/pkm", 0.0002,
- the lower bound is currently 0 for all values: https://github.com/RMI-PACTA/pacta.data.validation/blob/a40e0538eab0d4c77c6cbe2390377ffb6a18e7c5/R/assert_valid_value_range_for_sector_unit_scenario_prep.R#L29
checkmate::assert_numeric(values[sectors_units_idx], lower = 0, upper = max_value_i, any.missing = any.missing, add = add, .var.name = .var.name)
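As a rough illustration of what those bounds imply, here is a base R sketch that mirrors the lower = 0 / upper = 0.0002 check without depending on checkmate. The helper check_value_range is hypothetical and is not part of pacta.data.validation; the input range is the geco_2023 range reported in this thread.

```r
# Hypothetical helper mirroring the bounds described above, base R only.
check_value_range <- function(values, lower = 0, upper) {
  stopifnot(is.numeric(values), !anyNA(values))
  all(values >= lower & values <= upper)
}

# Upper bound quoted above for Aviation in tCO2/pkm.
max_aviation_tco2_pkm <- 0.0002

# Range reported for geco_2023 in this thread.
geco_2023_range <- c(2.206271e-05, 1.253751e-04)

# Values as they are today.
in_bounds_as_is <- check_value_range(geco_2023_range, upper = max_aviation_tco2_pkm)

# The same values scaled by 1e6, as they would look if truly expressed in gCO2/pkm.
in_bounds_scaled <- check_value_range(geco_2023_range * 1e6, upper = max_aviation_tco2_pkm)
```

Under these assumptions the current values pass the check and the scaled ones do not, which is consistent with the values already being in tCO2/pkm.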
e.g.: For the sector aviation we don't know what emissions intensity unit to use, options: gCO2/pkm or ...?
I will try to get an answer :-)
(Also in general, I would say since the waldo regression test passes, this isn't a blocking point for THIS PR, but rather we should open a new issue for it)
Originally posted by @jdhoffa in https://github.com/RMI-PACTA/pacta.scenario.data.preparation/issues/22#issuecomment-2024910508