RMI-PACTA / pacta.data.validation

https://rmi-pacta.github.io/pacta.data.validation/

include recent spellings of production units in AI data #66

Closed cjyetman closed 4 months ago

cjyetman commented 4 months ago

Recent versions of AI data include new spellings of production units.

library(tidyverse)
pams_path <- "~/data/pactarawdata/asset-impact/2024-02-15_AI_RMI_2023Q4/2024-02-14_AI_2023Q4_RMI-Company-Indicators.xlsx"
pams <- pacta.data.preparation::import_ar_advanced_company_indicators(pams_path)
pams %>%
  filter(value_type == "production") %>%
  select(`Asset Sector`, `Activity Unit`) %>%
  distinct() %>%
  arrange(`Asset Sector`, `Activity Unit`)
#> # A tibble: 11 × 2
#>    `Asset Sector` `Activity Unit`
#>    <fct>          <fct>          
#>  1 Aviation       pkm            
#>  2 Aviation       tkm            
#>  3 Cement         t cement       
#>  4 Coal           t coal         
#>  5 HDV            # vehicles     
#>  6 LDV            # vehicles     
#>  7 Oil&Gas        GJ             
#>  8 Power          MW             
#>  9 Power          MWh            
#> 10 Shipping       dwt km         
#> 11 Steel          t steel
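
As a rough illustration of the kind of check this issue is about, here is a minimal base R sketch that flags sector/unit pairs missing from an allowed list. The function name `find_unknown_units()`, the column names `sector` and `unit`, and the `observed` input are hypothetical and are not the pacta.data.validation API; the allowed pairs are simply copied from the output above.

```r
# Hypothetical helper (not part of pacta.data.validation): return the
# sector/unit pairs in `data` that do not appear in `allowed`.
find_unknown_units <- function(data, allowed) {
  observed <- unique(data[, c("sector", "unit")])
  # Build one key per pair so that whole pairs are compared, not single columns.
  key_observed <- paste(observed$sector, observed$unit, sep = "\t")
  key_allowed <- paste(allowed$sector, allowed$unit, sep = "\t")
  unknown <- observed[!key_observed %in% key_allowed, , drop = FALSE]
  rownames(unknown) <- NULL
  unknown
}

# Allowed pairs, copied from the tibble printed above.
allowed <- data.frame(
  sector = c("Aviation", "Aviation", "Cement", "Coal", "HDV", "LDV",
             "Oil&Gas", "Power", "Power", "Shipping", "Steel"),
  unit = c("pkm", "tkm", "t cement", "t coal", "# vehicles", "# vehicles",
           "GJ", "MW", "MWh", "dwt km", "t steel"),
  stringsAsFactors = FALSE
)

# Made-up input: the second row uses a spelling that is not in `allowed`.
observed <- data.frame(
  sector = c("Power", "Cement"),
  unit = c("MW", "tonnes of cement"),
  stringsAsFactors = FALSE
)

find_unknown_units(observed, allowed)
```

With this input only the Cement row is returned, because "tonnes of cement" is not among the allowed spellings for that sector.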

github-actions[bot] commented 4 months ago

Coverage Report

|file|head|main|diff|
| :-- | --: | --: | --: |
|Overall|95%|95%|:arrow_up: 0.027%|
|R/assert_columns_exists.R|100%|100%|0%|
|R/assert_regex.R|100%|100%|0%|
|R/assert_subset.R| 92%| 92%|0%|
|R/assert_valid_ai_company_id.R| 0%| 0%|0%|
|R/assert_valid_asset_type.R|100%|100%|0%|
|R/assert_valid_bics_sector_code.R| 0%| 0%|0%|
|R/assert_valid_bics_subgroup_code.R| 0%| 0%|0%|
|R/assert_valid_emissions_factor_unit.R|100%|100%|0%|
|R/assert_valid_equity_market.R| 0%| 0%|0%|
|R/assert_valid_factset_entity_id.R|100%|100%|0%|
|R/assert_valid_factset_fund_id.R| 0%| 0%|0%|
|R/assert_valid_factset_sym_id.R| 0%| 0%|0%|
|R/assert_valid_indicator_for_sector.R|100%|100%|0%|
|R/assert_valid_indicator.R|100%|100%|0%|
|R/assert_valid_isin.R|100%|100%|0%|
|R/assert_valid_iso2c.R|100%|100%|0%|
|R/assert_valid_iso4217c.R|100%|100%|0%|
|R/assert_valid_production_unit.R|100%|100%|0%|
|R/assert_valid_scenario_geography.R| 0%| 0%|0%|
|R/assert_valid_sector.R| 98%| 98%|0%|
|R/assert_valid_sectors_with_assets.R|100%|100%|0%|
|R/assert_valid_technology_for_sector.R|100%|100%|0%|
|R/assert_valid_technology.R|100%|100%|0%|
|R/assert_valid_units.R|100%|100%|0%|
|R/assert_valid_value_range_for_sector_unit_scenario_prep.R|100%|100%|0%|
|R/fake_abcd_flags_bonds.R|100%|100%|0%|
|R/fake_abcd_flags_equity.R|100%|100%|0%|
|R/fake_currencies.R|100%|100%|0%|
|R/fake_financial_data.R|100%|100%|0%|
|R/fake_intermediate_scenario_data.R|100%|100%|0%|
|R/fake_masterdata_debt_datastore.R|100%|100%|0%|
|R/fake_masterdata_ownership_datastore.R|100%|100%|0%|
|R/is_valid_isin.R|100%|100%|0%|
|R/matches_regex.R|100%|100%|0%|
|R/set_collapse.R|100%|100%|0%|
|R/simplify_if_one_col_df.R|100%|100%|0%|
|R/validate_abcd_flags_bonds.R|100%|100%|0%|
|R/validate_abcd_flags_equity.R|100%|100%|0%|
|R/validate_currencies.R|100%|100%|0%|
|R/validate_financial_data.R|100%|100%|0%|
|R/validate_intermediate_scenario_output.R|100%|100%|0%|
|R/validate_masterdata_debt_datastore.R|100%|100%|0%|
|R/validate_masterdata_ownership_datastore.R|100%|100%|0%|

cjyetman commented 4 months ago

NB: does/will/should this affect anything downstream?

From what I've seen so far, this does not affect anything other than allowing the data validation functions to pass, and those functions are not used in production anywhere (yet).

NB: do we need to adjust production units in scenarios?

This might suggest that, but I honestly don't know yet. If so, it probably implies that we need a standardize_units() type function in pacta.data.preparation that would standardize units from our various source data (AI, scenarios, ISS?) to some PACTA standard unit names.
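
To make the idea concrete, here is a minimal base R sketch of what such a helper could look like. Everything in it is an assumption: no `standardize_units()` function is confirmed to exist in pacta.data.preparation, the argument names are invented, and the lookup entries (including which spelling counts as the "standard" one) are made up for illustration only.

```r
# Hypothetical sketch of a standardize_units()-style helper. `lookup` is a
# named character vector: names are source spellings, values are the
# standard spellings they should be mapped to.
standardize_units <- function(units, lookup) {
  units <- as.character(units)
  standardized <- unname(lookup[units])
  # Fail loudly on spellings that the lookup does not cover.
  unmatched <- unique(units[is.na(standardized) & !is.na(units)])
  if (length(unmatched) > 0) {
    stop("Unknown unit(s): ", paste(unmatched, collapse = ", "), call. = FALSE)
  }
  standardized
}

# Made-up mapping; real PACTA standard unit names would need to be agreed on.
unit_lookup <- c(
  "t cement" = "t cement",
  "tonnes of cement" = "t cement",     # assumed alternative spelling
  "MW" = "MW",
  "# vehicles" = "# vehicles",
  "number of vehicles" = "# vehicles"  # assumed alternative spelling
)

standardize_units(c("tonnes of cement", "MW", "number of vehicles"), unit_lookup)
```

Raising an error on unmapped spellings, rather than passing them through silently, is one possible design choice; it would surface new spellings from a data provider as soon as a new dataset is loaded.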

jacobvjk commented 4 months ago

does it imply changes in PAMS only or also in the banks data set? I am using the activity units based on r2dii.data::abcd_demo[["production_unit"]] in P4S

cjyetman commented 4 months ago

> does it imply changes in PAMS only or also in the banks data set? I am using the activity units based on r2dii.data::abcd_demo[["production_unit"]] in P4S

I have too little experience with Banks data to know. I'm tempted to say "yes", since I somewhat doubt they would bother to modify custom units specifically for the Banks data, but I really have no context. These new units are in the PAMS dataset that we received in February for 2023Q4, and I assume we also received a set of Banks data at the same time (?), so it should be easy to check.
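
One way such a check could be done is sketched below in base R. `compare_units()` is a hypothetical helper, and the two input vectors are made-up stand-ins; in practice they would come from the unit columns of the PAMS and Banks datasets (for example `pams[["Activity Unit"]]`), which are not reproduced here.

```r
# Hypothetical helper: compare the unit spellings used in two datasets.
compare_units <- function(pams_units, banks_units) {
  pams_units <- unique(as.character(pams_units))
  banks_units <- unique(as.character(banks_units))
  list(
    only_in_pams = setdiff(pams_units, banks_units),
    only_in_banks = setdiff(banks_units, pams_units),
    shared = intersect(pams_units, banks_units)
  )
}

# Made-up vectors standing in for the real unit columns.
compare_units(
  pams_units = c("MW", "MWh", "t cement"),
  banks_units = c("MW", "tonnes of cement")
)
```

Any entries in `only_in_pams` or `only_in_banks` would indicate that the two datasets use different spellings and may need separate handling.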