Develop an undefined, dictionary-template

maurolepore commented 4 months ago

Relates to https://github.com/2DegreesInvesting/tiltDevTools/pull/9

@Tilmon here I'll post data-dictionary templates. You may move them to a Google Sheet and define each column, maybe with the help of @kalashsinghal.

I start with emissions and sector profile because those are the most complete. Maybe the outputs will change but that should impact relatively few columns.

reprex

## Emissions

``` r
sector <- readr::read_csv("tiltIndicatorAfter-v0.0.0.9040-emissions.csv")
#> Rows: 42 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (4): dataset, level, name, type
#> lgl (1): definition
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
```

``` r
head(sector)
#> # A tibble: 6 × 5
#>   dataset   level   name             type      definition
#>   <chr>     <chr>   <chr>            <chr>     <lgl>
#> 1 emissions product companies_id     character NA
#> 2 emissions product company_name     character NA
#> 3 emissions product country          character NA
#> 4 emissions product emission_profile character NA
#> 5 emissions product benchmark        character NA
#> 6 emissions product ep_product       character NA
```

``` r
tail(sector)
#> # A tibble: 6 × 5
#>   dataset   level   name                type      definition
#>   <chr>     <chr>   <chr>               <chr>     <lgl>
#> 1 emissions company company_city        character NA
#> 2 emissions company postcode            character NA
#> 3 emissions company address             character NA
#> 4 emissions company main_activity       character NA
#> 5 emissions company profile_ranking_avg double    NA
#> 6 emissions company co2_avg             double    NA
```

## Sector

``` r
sector <- readr::read_csv("tiltIndicatorAfter-v0.0.0.9040-sector.csv")
#> Rows: 39 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (4): dataset, level, name, type
#> lgl (1): definition
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
```

``` r
head(sector)
#> # A tibble: 6 × 5
#>   dataset level   name              type      definition
#>   <chr>   <chr>   <chr>             <chr>     <lgl>
#> 1 sector  product companies_id      character NA
#> 2 sector  product company_name      character NA
#> 3 sector  product country           character NA
#> 4 sector  product sector_profile    character NA
#> 5 sector  product reduction_targets double    NA
#> 6 sector  product scenario          character NA
```

``` r
tail(sector)
#> # A tibble: 6 × 5
#>   dataset level   name                               type      definition
#>   <chr>   <chr>   <chr>                              <chr>     <lgl>
#> 1 sector  company matching_certainty_company_average character NA
#> 2 sector  company company_city                       character NA
#> 3 sector  company postcode                           character NA
#> 4 sector  company address                            character NA
#> 5 sector  company main_activity                      character NA
#> 6 sector  company reduction_targets_avg              double    NA
```

Here is how I did it. The trick is to use [this helper](https://2degreesinvesting.github.io/tiltDevTools/reference/extensions.html).

## Emissions

```r
library(readr, warn.conflicts = FALSE)
library(tiltToyData)
library(tiltIndicatorAfter)
library(tiltDevTools)

companies <- read_csv(toy_emissions_profile_any_companies())
products <- read_csv(toy_emissions_profile_products_ecoinvent())
europages_companies <- read_csv(toy_europages_companies())
ecoinvent_activities <- read_csv(toy_ecoinvent_activities())
ecoinvent_europages <- read_csv(toy_ecoinvent_europages())
isic_name <- read_csv(toy_isic_name())

emissions <- profile_emissions(
  companies,
  products,
  europages_companies = europages_companies,
  ecoinvent_activities = ecoinvent_activities,
  ecoinvent_europages = ecoinvent_europages,
  isic = isic_name
)

version <- packageVersion("tiltIndicatorAfter")
emissions |>
  use_dictionary() |>
  write_csv(glue::glue("tiltIndicatorAfter-v{version}-emissions.csv"))
```

## Sector

```r
library(readr, warn.conflicts = FALSE)
library(tiltToyData)
library(tiltIndicatorAfter)
library(tiltDevTools)

companies <- read_csv(toy_sector_profile_companies())
scenarios <- read_csv(toy_sector_profile_any_scenarios())
europages_companies <- read_csv(toy_europages_companies()) |> head(3)
ecoinvent_activities <- read_csv(toy_ecoinvent_activities()) |> head(3)
ecoinvent_europages <- read_csv(toy_ecoinvent_europages()) |> head(3)
isic_name <- read_csv(toy_isic_name()) |> head(3)

sector <- profile_sector(
  companies,
  scenarios,
  europages_companies = europages_companies,
  ecoinvent_activities = ecoinvent_activities,
  ecoinvent_europages = ecoinvent_europages,
  isic = isic_name
)

version <- packageVersion("tiltIndicatorAfter")
sector |>
  use_dictionary() |>
  write_csv(glue::glue("tiltIndicatorAfter-v{version}-sector.csv"))
```
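For anyone who wants to see the shape of the template without installing tiltDevTools, here is a rough base-R sketch. It is not the implementation of `use_dictionary()`. It only assumes what the reprex output shows: one row per column, with `dataset`, `level`, `name`, `type`, and an empty `definition`. The function `dictionary_template()` and the toy values are hypothetical. The column names are borrowed from the output above.

```r
# Hypothetical helper, NOT the tiltDevTools implementation: build an empty
# data-dictionary template from a named list of data frames, one per level.
dictionary_template <- function(tables, dataset) {
  rows <- lapply(names(tables), function(level) {
    tbl <- tables[[level]]
    data.frame(
      dataset = dataset,
      level = level,
      name = names(tbl),
      # Storage type of each column, e.g. "character" or "double"
      type = vapply(tbl, typeof, character(1), USE.NAMES = FALSE),
      # Left empty on purpose: this is the column someone fills in later
      definition = NA,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Toy stand-in for a profile result with a product and a company level
toy <- list(
  product = data.frame(
    companies_id = "a",
    emission_profile = "high",
    stringsAsFactors = FALSE
  ),
  company = data.frame(
    companies_id = "a",
    co2_avg = 2.5,
    stringsAsFactors = FALSE
  )
)

template <- dictionary_template(toy, dataset = "emissions")
print(template)
```

Because `definition` starts as `NA` everywhere, the result can be written to CSV and handed over as-is for the definitions to be filled in.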

Tilmon commented 4 months ago

Thanks, @maurolepore. Is this for the documentation of the indicators? As you, @kalashsinghal, are more up-to-date about the latest column names etc., I'd suggest that you start and I can add / review where you are unsure? Would that be OK?

maurolepore commented 4 months ago

@Tilmon it relates to the app prototype that @AnneSchoenauer shared. I also thought Kalash (and Anne) would be the best fit for the job, but Anne assigned you:

> I am assigning @Tilmon as he is responsible for the data dictionary. -- Anne via https://github.com/2DegreesInvesting/tiltWebTool/issues/9#issuecomment-2126334413

--

Note that this specific issue focuses on building the template. I already did that, so I closed the issue. Let's discuss your part at:

https://github.com/2DegreesInvesting/tiltWebTool/issues/12

AnneSchoenauer commented 4 months ago

Hi @Tilmon - I thought you were doing the data dictionary, as you are responsible for uploading the data ;-). If you want to give this to Kalash, it may be good to specify what a data dictionary is. Happy to help if needed.

kalashsinghal commented 4 months ago

> As you, @kalashsinghal, are more up-to-date about the latest column names etc., I'd suggest that you start and I can add / review where you are unsure? Would that be OK?

@Tilmon The latest column names will be the same as the final column names we get from the `profile_emissions()` and `profile_sector()` functions of tiltIndicatorAfter (as can be seen in the reprex above). If you would like to assign this task to me, please create a separate ticket for me and adjust the priority of this task on my board! Thanks! :)

maurolepore commented 4 months ago

@kalashsinghal that ticket is already open at https://github.com/2DegreesInvesting/tiltWebTool/issues/12

2DegreesInvesting / tiltWebTool

Develop an undefined, dictionary-template #11