Closed by maurolepore 5 months ago
@kalashsinghal, when you review this PR, note that the new datasets have columns that the old datasets lack. Are they all necessary?
devtools::load_all()
#> ℹ Loading tiltToyData
library(readr, warn.conflicts = FALSE)
options(readr.show_col_types = FALSE)
# companies
new <- read_csv(toy_emissions_profile_any_companies())
old <- read_csv(deprecated_path("emissions_profile_any_companies.csv.gz"))
setdiff(names(old), names(new))
#> character(0)
setdiff(names(new), names(old))
#> [1] "country" "ei_activity_name" "main_activity"
# products
old <- read_csv(toy_emissions_profile_products())
new <- read_csv(toy_emissions_profile_products_ecoinvent())
setdiff(names(old), names(new))
#> character(0)
setdiff(names(new), names(old))
#> [1] "ei_geography"
# upstream_products
old <- read_csv(toy_emissions_profile_upstream_products())
new <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
setdiff(names(old), names(new))
#> character(0)
setdiff(names(new), names(old))
#> [1] "ei_geography" "input_reference_product_name"
Created on 2024-01-05 with reprex v2.0.2
Thanks @Kalash,
The raw data I used seems to lack the columns you want. And the original link you shared seems to no longer be valid. But the updated file here has the columns. So good to go :-)
https://drive.google.com/drive/folders/1AbSGCGFVcRM3zLfPg5FdwScTRRaCbIws
Maybe we can post this link to the files in the README file of tiltIndicatorBefore?
I updated the datasets with those from https://drive.google.com/drive/folders/1AbSGCGFVcRM3zLfPg5FdwScTRRaCbIws. Now emissions_profile_upstream_products_ecoinvent.csv has the *activity_name columns.
devtools::load_all()
#> ℹ Loading tiltToyData
library(readr, warn.conflicts = FALSE)
options(readr.show_col_types = FALSE)
# companies
new <- read_csv(toy_emissions_profile_any_companies())
old <- read_csv(deprecated_path("emissions_profile_any_companies.csv.gz"))
setdiff(names(old), names(new))
#> character(0)
setdiff(names(new), names(old))
#> [1] "country" "ei_activity_name" "main_activity"
# products
old <- read_csv(toy_emissions_profile_products())
new <- read_csv(toy_emissions_profile_products_ecoinvent())
setdiff(names(old), names(new))
#> character(0)
setdiff(names(new), names(old))
#> [1] "ei_geography"
# upstream_products
old <- read_csv(toy_emissions_profile_upstream_products())
new <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
setdiff(names(old), names(new))
#> character(0)
setdiff(names(new), names(old))
#> [1] "ei_activity_name" "ei_geography"
#> [3] "input_ei_activity_name" "input_reference_product_name"
library(readr, warn.conflicts = FALSE)
library(tiltIndicator)
devtools::load_all()
#> ℹ Loading tiltToyData
options(readr.show_col_types = FALSE, width = 500)
companies <- read_csv(toy_emissions_profile_any_companies())
companies
#> # A tibble: 76 × 7
#> activity_uuid_product_uuid clustered companies_id country ei_activity_name main_activity unit
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7 tent soot_asianpiedstarling germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2
#> 2 76269c17-78d6-420b-991a-aa38c51b45b7 table hire for parties frightening_chrysomelid spain market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> 3 76269c17-78d6-420b-991a-aa38c51b45b7 surface finishing, galvanic hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 surface engineering hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg
#> 5 76269c17-78d6-420b-991a-aa38c51b45b7 tent flexible_dolphin austria market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> 6 76269c17-78d6-420b-991a-aa38c51b45b7 tent paramilitary_racerunner germany market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> 7 76269c17-78d6-420b-991a-aa38c51b45b7 open space amenities level_meadowhawk france market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> 8 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb tent heartrending_attwatersprairiechicken germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2
#> 9 76269c17-78d6-420b-991a-aa38c51b45b7 tent traumatophobic_hanumanmonkey germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2
#> 10 76269c17-78d6-420b-991a-aa38c51b45b7 tent preliterary_toucan germany market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> # ℹ 66 more rows
products <- read_csv(toy_emissions_profile_products_ecoinvent())
products
#> # A tibble: 18 × 8
#> activity_uuid_product_uuid co2_footprint ei_activity_name ei_geography isic_4digit tilt_sector tilt_subsector unit
#> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 833caa78-30df-4374-900f-7f88ab44075b 14.1 iron-nickel-chromium alloy production RER ''2410'' metals iron & steel kg
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.419 market for deep drawing, steel, 10000 kN press, automode GLO ''2591'' metals other metals kg
#> 3 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 481. market for shed, large, wood, non-insulated, fire-unprotected GLO ''4100'' construction construction residential m2
#> 4 833caa78-30df-4374-900f-7f88ab44075b 9.47 iron-nickel-chromium alloy production RER ''2410'' metals iron & steel kg
#> 5 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.648 market for deep drawing, steel, 10000 kN press, automode GLO ''2591'' metals other metals kg
#> 6 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 276. market for shed, large, wood, non-insulated, fire-unprotected GLO ''4100'' construction construction residential m2
#> 7 833caa78-30df-4374-900f-7f88ab44075b 13.6 iron-nickel-chromium alloy production RER ''2410'' metals iron & steel kg
#> 8 76269c17-78d6-420b-991a-aa38c51b45b7 0.405 market for deep drawing, steel, 10000 kN press, automode GLO ''2591'' metals other metals kg
#> 9 76269c17-78d6-420b-991a-aa38c51b45b7 447. market for shed, large, wood, non-insulated, fire-unprotected GLO ''4100'' construction construction residential m2
#> 10 833caa78-30df-4374-900f-7f88ab44075b 14.7 iron-nickel-chromium alloy production RER ''2410'' metals iron & steel kg
#> 11 833caa78-30df-4374-900f-7f88ab44075b 0.390 market for deep drawing, steel, 10000 kN press, automode GLO ''2591'' metals other metals kg
#> 12 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 442. market for shed, large, wood, non-insulated, fire-unprotected GLO ''4100'' construction construction residential m2
#> 13 76269c17-78d6-420b-991a-aa38c51b45b7 14.1 iron-nickel-chromium alloy production RER ''2410'' metals iron & steel kg
#> 14 76269c17-78d6-420b-991a-aa38c51b45b7 0.884 market for deep drawing, steel, 10000 kN press, automode GLO ''2591'' metals other metals kg
#> 15 76269c17-78d6-420b-991a-aa38c51b45b7 321. market for shed, large, wood, non-insulated, fire-unprotected GLO ''4100'' construction construction residential m2
#> 16 833caa78-30df-4374-900f-7f88ab44075b 12.7 iron-nickel-chromium alloy production RER ''2410'' metals iron & steel kg
#> 17 76269c17-78d6-420b-991a-aa38c51b45b7 0.675 market for deep drawing, steel, 10000 kN press, automode GLO ''2591'' metals other metals kg
#> 18 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 435. market for shed, large, wood, non-insulated, fire-unprotected GLO ''4100'' construction construction residential m2
upstream_products <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
upstream_products
#> # A tibble: 96 × 11
#> activity_uuid_product_uuid ei_activity_name ei_geography input_activity_uuid_product_uuid input_co2_footprint input_ei_activity_name input_isic_4digit input_reference_product_name input_tilt_sector input_tilt_subsector input_unit
#> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for deep drawing, steel, 10000 kN press, automode RoW 55a5ac05-ab15-5a27-9d0e-6ecf840039f1_f10b8722-4be1-43d5-b17d-c51ad0e29d29 4.56e-1 deep drawing, steel, 10000 kN press, automode ''2591'' deep drawing, steel, 10000 kN press, automode metals other metals kg
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for shed, large, wood, non-insulated, fire-unprotected RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 4.63e+2 shed construction, large, wood, non-insulated, fire-unprotected ''4100'' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 3 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO bdc93cd8-00b4-5b3e-993e-b7fef7059e52_4e584f6f-2e71-4796-931e-bb9a273c161c 1.67e+0 market for anode, for metal electrolysis ''2790'' anode, for metal electrolysis industry machinery & equipment kg
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 iron-nickel-chromium alloy production RER fdb1f848-173f-5fe1-96a2-588171e87e30_c2c93af2-47cb-4ec7-a1bd-d3d572bca039 1.45e+8 electric arc furnace converter construction ''2815'' electric arc furnace converter industry other industry unit
#> 5 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 95fcd1bb-4dc6-516a-a3b2-30a4f0530639_3b1d249a-c924-4d6c-8e1f-647f562daa54 5.30e-1 market for electric arc furnace dust ''3821'' electric arc furnace dust industry other industry kg
#> 6 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER daef2f9a-4108-52ae-90a7-fe64abad51bc_6e74937e-b691-4c49-9b8f-5ba44d7c081d 5.89e-1 market for electric arc furnace slag ''3821'' electric arc furnace slag industry other industry kg
#> 7 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 3b190359-a32e-5294-af63-983f38ce6525_759b89bd-3aa6-42ad-b767-5bb9ef5d331d 6.02e-1 market group for electricity, medium voltage ''3510'' electricity, medium voltage power total power kWh
#> 8 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO 2c92cdcd-29df-53ba-a209-77c7de201d14_6e316c64-0481-4832-b097-296e14c0b02f 7.32e+0 market for ferrochromium, high-carbon, 68% Cr ''2410'' ferrochromium, high-carbon, 68% Cr metals iron & steel kg
#> 9 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production Europe, without Russia and Turkey 9392c694-12a6-5cd7-a421-d4866359df2c_0d3eda5a-4601-4573-9549-0701c459ab88 7.10e-1 market for hard coal ''0510'' hard coal energy coal energy kg
#> 10 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production CH c18c6cc9-4a26-5c47-9ea9-8635ff2c158e_240c1a3c-1aba-4528-afc3-3f27f56583be 1.06e-2 market for inert waste, for final disposal ''3821'' inert waste, for final disposal industry other industry kg
#> # ℹ 86 more rows
result <- emissions_profile(companies, products)
result |> unnest_company()
#> # A tibble: 1,296 × 4
#> companies_id grouped_by risk_category value
#> <chr> <chr> <chr> <dbl>
#> 1 soot_asianpiedstarling all high 0.333
#> 2 soot_asianpiedstarling all medium 0.167
#> 3 soot_asianpiedstarling all low 0.5
#> 4 soot_asianpiedstarling isic_4digit high 0.5
#> 5 soot_asianpiedstarling isic_4digit medium 0.167
#> 6 soot_asianpiedstarling isic_4digit low 0.333
#> 7 soot_asianpiedstarling tilt_sector high 0.333
#> 8 soot_asianpiedstarling tilt_sector medium 0.333
#> 9 soot_asianpiedstarling tilt_sector low 0.333
#> 10 soot_asianpiedstarling unit high 0.333
#> # ℹ 1,286 more rows
result |> unnest_product()
#> # A tibble: 2,736 × 7
#> companies_id grouped_by risk_category profile_ranking clustered activity_uuid_product_uuid co2_footprint
#> <chr> <chr> <chr> <dbl> <chr> <chr> <dbl>
#> 1 soot_asianpiedstarling all low 0.111 tent 76269c17-78d6-420b-991a-aa38c51b45b7 0.405
#> 2 soot_asianpiedstarling all high 0.944 tent 76269c17-78d6-420b-991a-aa38c51b45b7 447.
#> 3 soot_asianpiedstarling all medium 0.556 tent 76269c17-78d6-420b-991a-aa38c51b45b7 14.1
#> 4 soot_asianpiedstarling all low 0.333 tent 76269c17-78d6-420b-991a-aa38c51b45b7 0.884
#> 5 soot_asianpiedstarling all high 0.778 tent 76269c17-78d6-420b-991a-aa38c51b45b7 321.
#> 6 soot_asianpiedstarling all low 0.278 tent 76269c17-78d6-420b-991a-aa38c51b45b7 0.675
#> 7 soot_asianpiedstarling isic_4digit low 0.333 tent 76269c17-78d6-420b-991a-aa38c51b45b7 0.405
#> 8 soot_asianpiedstarling isic_4digit high 0.833 tent 76269c17-78d6-420b-991a-aa38c51b45b7 447.
#> 9 soot_asianpiedstarling isic_4digit medium 0.667 tent 76269c17-78d6-420b-991a-aa38c51b45b7 14.1
#> 10 soot_asianpiedstarling isic_4digit high 1 tent 76269c17-78d6-420b-991a-aa38c51b45b7 0.884
#> # ℹ 2,726 more rows
result <- emissions_profile_upstream(companies, upstream_products)
result |> unnest_company()
#> # A tibble: 1,296 × 4
#> companies_id grouped_by risk_category value
#> <chr> <chr> <chr> <dbl>
#> 1 soot_asianpiedstarling all high 0.5
#> 2 soot_asianpiedstarling all medium 0.333
#> 3 soot_asianpiedstarling all low 0.167
#> 4 soot_asianpiedstarling input_isic_4digit high 0
#> 5 soot_asianpiedstarling input_isic_4digit medium 0.167
#> 6 soot_asianpiedstarling input_isic_4digit low 0.833
#> 7 soot_asianpiedstarling input_tilt_sector high 0.167
#> 8 soot_asianpiedstarling input_tilt_sector medium 0.667
#> 9 soot_asianpiedstarling input_tilt_sector low 0.167
#> 10 soot_asianpiedstarling input_unit high 0.167
#> # ℹ 1,286 more rows
result |> unnest_product()
#> # A tibble: 4,140 × 8
#> companies_id grouped_by risk_category profile_ranking clustered activity_uuid_product_uuid input_activity_uuid_product_uuid input_co2_footprint
#> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <dbl>
#> 1 soot_asianpiedstarling all high 0.958 tent 76269c17-78d6-420b-991a-aa38c51b45b7 fdb1f848-173f-5fe1-96a2-588171e87e30_c2c93af2-47cb-4ec7-a1bd-d3d572bca039 144872157.
#> 2 soot_asianpiedstarling all high 0.740 tent 76269c17-78d6-420b-991a-aa38c51b45b7 2c92cdcd-29df-53ba-a209-77c7de201d14_6e316c64-0481-4832-b097-296e14c0b02f 6.08
#> 3 soot_asianpiedstarling all low 0.219 tent 76269c17-78d6-420b-991a-aa38c51b45b7 daef2f9a-4108-52ae-90a7-fe64abad51bc_6e74937e-b691-4c49-9b8f-5ba44d7c081d 0.461
#> 4 soot_asianpiedstarling all medium 0.458 tent 76269c17-78d6-420b-991a-aa38c51b45b7 7361f7fb-5cf2-598c-823a-a4b7e50c3d28_a9007f10-7e39-4d50-8f4a-d6d03ce3d673 0.808
#> 5 soot_asianpiedstarling all high 0.885 tent 76269c17-78d6-420b-991a-aa38c51b45b7 bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 240.
#> 6 soot_asianpiedstarling all medium 0.469 tent 76269c17-78d6-420b-991a-aa38c51b45b7 7361f7fb-5cf2-598c-823a-a4b7e50c3d28_a9007f10-7e39-4d50-8f4a-d6d03ce3d673 0.834
#> 7 soot_asianpiedstarling input_isic_4digit low 0.333 tent 76269c17-78d6-420b-991a-aa38c51b45b7 fdb1f848-173f-5fe1-96a2-588171e87e30_c2c93af2-47cb-4ec7-a1bd-d3d572bca039 144872157.
#> 8 soot_asianpiedstarling input_isic_4digit low 0.333 tent 76269c17-78d6-420b-991a-aa38c51b45b7 2c92cdcd-29df-53ba-a209-77c7de201d14_6e316c64-0481-4832-b097-296e14c0b02f 6.08
#> 9 soot_asianpiedstarling input_isic_4digit medium 0.667 tent 76269c17-78d6-420b-991a-aa38c51b45b7 daef2f9a-4108-52ae-90a7-fe64abad51bc_6e74937e-b691-4c49-9b8f-5ba44d7c081d 0.461
#> 10 soot_asianpiedstarling input_isic_4digit low 0.167 tent 76269c17-78d6-420b-991a-aa38c51b45b7 7361f7fb-5cf2-598c-823a-a4b7e50c3d28_a9007f10-7e39-4d50-8f4a-d6d03ce3d673 0.808
#> # ℹ 4,130 more rows
Created on 2024-01-08 with reprex v2.0.2
@kalashsinghal
Thanks!
RE: "Please add these two columns as well with any random values because they will not be used in tiltIndicator."
I didn't randomize the values because I believe those columns are not private data in themselves. Do you think they should indeed be random? Do we need to ask Anne?
What IS random and fake in these datasets is this:
- companies_id.
- activity_uuid_product_uuid.
- activity_uuid_product_uuid and other columns.
- *co2_footprint, jittered to the right by 50%-100% on average.
FIXME
The *isic* column has the wrong quoting.
library(readr, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
devtools::load_all()
#> ℹ Loading tiltToyData
options(readr.show_col_types = FALSE, width = 500)
products <- read_csv(toy_emissions_profile_products_ecoinvent())
products |>
  relocate(matches("isic"))
#> # A tibble: 18 × 8
#> isic_4digit activity_uuid_product_uuid co2_footprint ei_activity_name ei_geography tilt_sector tilt_subsector unit
#> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
#> 1 ''2410'' 833caa78-30df-4374-900f-7f88ab44075b 14.1 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 2 ''2591'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.419 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 3 ''4100'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 481. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 4 ''2410'' 833caa78-30df-4374-900f-7f88ab44075b 9.47 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 5 ''2591'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.648 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 6 ''4100'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 276. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 7 ''2410'' 833caa78-30df-4374-900f-7f88ab44075b 13.6 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 8 ''2591'' 76269c17-78d6-420b-991a-aa38c51b45b7 0.405 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 9 ''4100'' 76269c17-78d6-420b-991a-aa38c51b45b7 447. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 10 ''2410'' 833caa78-30df-4374-900f-7f88ab44075b 14.7 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 11 ''2591'' 833caa78-30df-4374-900f-7f88ab44075b 0.390 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 12 ''4100'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 442. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 13 ''2410'' 76269c17-78d6-420b-991a-aa38c51b45b7 14.1 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 14 ''2591'' 76269c17-78d6-420b-991a-aa38c51b45b7 0.884 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 15 ''4100'' 76269c17-78d6-420b-991a-aa38c51b45b7 321. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 16 ''2410'' 833caa78-30df-4374-900f-7f88ab44075b 12.7 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 17 ''2591'' 76269c17-78d6-420b-991a-aa38c51b45b7 0.675 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 18 ''4100'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 435. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
upstream_products <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
upstream_products |>
  relocate(matches("isic"))
#> # A tibble: 96 × 11
#> input_isic_4digit activity_uuid_product_uuid ei_activity_name ei_geography input_activity_uuid_product_uuid input_co2_footprint input_ei_activity_name input_reference_product_name input_tilt_sector input_tilt_subsector input_unit
#> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
#> 1 ''2591'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for deep drawing, steel, 10000 kN press, automode RoW 55a5ac05-ab15-5a27-9d0e-6ecf840039f1_f10b8722-4be1-43d5-b17d-c51ad0e29d29 4.56e-1 deep drawing, steel, 10000 kN press, automode deep drawing, steel, 10000 kN press, automode metals other metals kg
#> 2 ''4100'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for shed, large, wood, non-insulated, fire-unprotected RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 4.63e+2 shed construction, large, wood, non-insulated, fire-unprotected shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 3 ''2790'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO bdc93cd8-00b4-5b3e-993e-b7fef7059e52_4e584f6f-2e71-4796-931e-bb9a273c161c 1.67e+0 market for anode, for metal electrolysis anode, for metal electrolysis industry machinery & equipment kg
#> 4 ''2815'' 76269c17-78d6-420b-991a-aa38c51b45b7 iron-nickel-chromium alloy production RER fdb1f848-173f-5fe1-96a2-588171e87e30_c2c93af2-47cb-4ec7-a1bd-d3d572bca039 1.45e+8 electric arc furnace converter construction electric arc furnace converter industry other industry unit
#> 5 ''3821'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 95fcd1bb-4dc6-516a-a3b2-30a4f0530639_3b1d249a-c924-4d6c-8e1f-647f562daa54 5.30e-1 market for electric arc furnace dust electric arc furnace dust industry other industry kg
#> 6 ''3821'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER daef2f9a-4108-52ae-90a7-fe64abad51bc_6e74937e-b691-4c49-9b8f-5ba44d7c081d 5.89e-1 market for electric arc furnace slag electric arc furnace slag industry other industry kg
#> 7 ''3510'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 3b190359-a32e-5294-af63-983f38ce6525_759b89bd-3aa6-42ad-b767-5bb9ef5d331d 6.02e-1 market group for electricity, medium voltage electricity, medium voltage power total power kWh
#> 8 ''2410'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO 2c92cdcd-29df-53ba-a209-77c7de201d14_6e316c64-0481-4832-b097-296e14c0b02f 7.32e+0 market for ferrochromium, high-carbon, 68% Cr ferrochromium, high-carbon, 68% Cr metals iron & steel kg
#> 9 ''0510'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production Europe, without Russia and Turkey 9392c694-12a6-5cd7-a421-d4866359df2c_0d3eda5a-4601-4573-9549-0701c459ab88 7.10e-1 market for hard coal hard coal energy coal energy kg
#> 10 ''3821'' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production CH c18c6cc9-4a26-5c47-9ea9-8635ff2c158e_240c1a3c-1aba-4528-afc3-3f27f56583be 1.06e-2 market for inert waste, for final disposal inert waste, for final disposal industry other industry kg
#> # ℹ 86 more rows
Created on 2024-01-09 with reprex v2.0.2
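For reference, a minimal sketch of one way the doubled quotes shown above could be collapsed into single quotes. It works on a hypothetical character vector rather than the real datasets, and it is not necessarily how the fix was made in this PR.

```r
# Hypothetical values mimicking the doubled quotes in the *isic* columns.
isic <- c("''2410''", "''2591''", "'4100'")

# Collapse any run of two or more single quotes into one single quote.
# Values that are already correctly quoted are left unchanged.
fixed <- gsub("'{2,}", "'", isic)
fixed
#> [1] "'2410'" "'2591'" "'4100'"
```

The same call could be applied to a column inside dplyr::mutate(); whether the quoting is better fixed in the raw data or at read time is a separate decision.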
FIXED
library(readr, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
devtools::load_all()
#> ℹ Loading tiltToyData
options(readr.show_col_types = FALSE, width = 1000)
products <- read_csv(toy_emissions_profile_products_ecoinvent())
products |> relocate(matches("isic"))
#> # A tibble: 18 × 8
#> isic_4digit activity_uuid_product_uuid co2_footprint ei_activity_name ei_geography tilt_sector tilt_subsector unit
#> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
#> 1 '2410' 833caa78-30df-4374-900f-7f88ab44075b 14.1 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 2 '2591' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.419 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 3 '4100' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 481. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 4 '2410' 833caa78-30df-4374-900f-7f88ab44075b 9.47 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 5 '2591' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.648 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 6 '4100' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 276. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 7 '2410' 833caa78-30df-4374-900f-7f88ab44075b 13.6 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 8 '2591' 76269c17-78d6-420b-991a-aa38c51b45b7 0.405 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 9 '4100' 76269c17-78d6-420b-991a-aa38c51b45b7 447. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 10 '2410' 833caa78-30df-4374-900f-7f88ab44075b 14.7 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 11 '2591' 833caa78-30df-4374-900f-7f88ab44075b 0.390 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 12 '4100' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 442. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 13 '2410' 76269c17-78d6-420b-991a-aa38c51b45b7 14.1 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 14 '2591' 76269c17-78d6-420b-991a-aa38c51b45b7 0.884 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 15 '4100' 76269c17-78d6-420b-991a-aa38c51b45b7 321. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
#> 16 '2410' 833caa78-30df-4374-900f-7f88ab44075b 12.7 iron-nickel-chromium alloy production RER metals iron & steel kg
#> 17 '2591' 76269c17-78d6-420b-991a-aa38c51b45b7 0.675 market for deep drawing, steel, 10000 kN press, automode GLO metals other metals kg
#> 18 '4100' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 435. market for shed, large, wood, non-insulated, fire-unprotected GLO construction construction residential m2
inputs <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
inputs |> relocate(matches("isic"))
#> # A tibble: 96 × 11
#> input_isic_4digit activity_uuid_product_uuid ei_activity_name ei_geography input_activity_uuid_product_uuid input_co2_footprint input_ei_activity_name input_reference_product_name input_tilt_sector input_tilt_subsector input_unit
#> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
#> 1 '2591' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for deep drawing, steel, 10000 kN press, automode RoW 55a5ac05-ab15-5a27-9d0e-6ecf840039f1_f10b8722-4be1-43d5-b17d-c51ad0e29d29 4.56e-1 deep drawing, steel, 10000 kN press, automode deep drawing, steel, 10000 kN press, automode metals other metals kg
#> 2 '4100' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for shed, large, wood, non-insulated, fire-unprotected RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 4.63e+2 shed construction, large, wood, non-insulated, fire-unprotected shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 3 '2790' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO bdc93cd8-00b4-5b3e-993e-b7fef7059e52_4e584f6f-2e71-4796-931e-bb9a273c161c 1.67e+0 market for anode, for metal electrolysis anode, for metal electrolysis industry machinery & equipment kg
#> 4 '2815' 76269c17-78d6-420b-991a-aa38c51b45b7 iron-nickel-chromium alloy production RER fdb1f848-173f-5fe1-96a2-588171e87e30_c2c93af2-47cb-4ec7-a1bd-d3d572bca039 1.45e+8 electric arc furnace converter construction electric arc furnace converter industry other industry unit
#> 5 '3821' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 95fcd1bb-4dc6-516a-a3b2-30a4f0530639_3b1d249a-c924-4d6c-8e1f-647f562daa54 5.30e-1 market for electric arc furnace dust electric arc furnace dust industry other industry kg
#> 6 '3821' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER daef2f9a-4108-52ae-90a7-fe64abad51bc_6e74937e-b691-4c49-9b8f-5ba44d7c081d 5.89e-1 market for electric arc furnace slag electric arc furnace slag industry other industry kg
#> 7 '3510' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 3b190359-a32e-5294-af63-983f38ce6525_759b89bd-3aa6-42ad-b767-5bb9ef5d331d 6.02e-1 market group for electricity, medium voltage electricity, medium voltage power total power kWh
#> 8 '2410' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO 2c92cdcd-29df-53ba-a209-77c7de201d14_6e316c64-0481-4832-b097-296e14c0b02f 7.32e+0 market for ferrochromium, high-carbon, 68% Cr ferrochromium, high-carbon, 68% Cr metals iron & steel kg
#> 9 '0510' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production Europe, without Russia and Turkey 9392c694-12a6-5cd7-a421-d4866359df2c_0d3eda5a-4601-4573-9549-0701c459ab88 7.10e-1 market for hard coal hard coal energy coal energy kg
#> 10 '3821' bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production CH c18c6cc9-4a26-5c47-9ea9-8635ff2c158e_240c1a3c-1aba-4528-afc3-3f27f56583be 1.06e-2 market for inert waste, for final disposal inert waste, for final disposal industry other industry kg
#> # ℹ 86 more rows
Created on 2024-01-09 with reprex v2.0.2
@AnneSchoenauer, today you asked if the new *products_ecoinvent datasets have *uuid that don't match companies, and the other way around.
This reprex shows that the answer is yes. For completeness I also show *uuid that do match.
@kalashsinghal please see if this is what you expect.
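As a side note on the joins used in the reprex below, here is a minimal sketch of how dplyr's filtering joins separate matching from non-matching keys. The data frames are made up (they are not the tiltToyData datasets) and the column names id, company, and footprint are hypothetical. Note that left_join() keeps every row of its first argument, so semi_join() is the stricter way to list only the rows that have a match.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data, not the tiltToyData datasets.
companies <- data.frame(id = c("a", "b", "c"), company = c("x", "y", "z"))
products <- data.frame(id = c("a", "b", "d"), footprint = c(1, 2, 3))

# Rows of `companies` whose `id` has a match in `products`.
matched <- semi_join(companies, products, by = "id")

# Rows of `companies` whose `id` has no match in `products`.
unmatched <- anti_join(companies, products, by = "id")

# left_join() keeps all rows of `companies`; unmatched rows get NA.
joined <- left_join(companies, products, by = "id")
```

In this sketch `matched` holds ids "a" and "b", `unmatched` holds id "c", and `joined` still has three rows, with a missing footprint for "c".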
library(readr, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
devtools::load_all()
#> ℹ Loading tiltToyData
options(readr.show_col_types = FALSE, width = 1000)
companies <- read_csv(toy_emissions_profile_any_companies())
products <- read_csv(toy_emissions_profile_products_ecoinvent())
# *uuid in companies that match *uuid in products
left_join(companies, products, relationship = "many-to-many") |>
  print() |>
  distinct(activity_uuid_product_uuid)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name, unit)`
#> # A tibble: 155 × 12
#> activity_uuid_product_uuid clustered companies_id country ei_activity_name main_activity unit co2_footprint ei_geography isic_4digit tilt_sector tilt_subsector
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7 tent soot_asianpiedstarling germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2 447. GLO '4100' construction construction residential
#> 2 76269c17-78d6-420b-991a-aa38c51b45b7 tent soot_asianpiedstarling germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2 321. GLO '4100' construction construction residential
#> 3 76269c17-78d6-420b-991a-aa38c51b45b7 table hire for parties frightening_chrysomelid spain market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2 447. GLO '4100' construction construction residential
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 table hire for parties frightening_chrysomelid spain market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2 321. GLO '4100' construction construction residential
#> 5 76269c17-78d6-420b-991a-aa38c51b45b7 surface finishing, galvanic hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg 0.405 GLO '2591' metals other metals
#> 6 76269c17-78d6-420b-991a-aa38c51b45b7 surface finishing, galvanic hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg 0.884 GLO '2591' metals other metals
#> 7 76269c17-78d6-420b-991a-aa38c51b45b7 surface finishing, galvanic hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg 0.675 GLO '2591' metals other metals
#> 8 76269c17-78d6-420b-991a-aa38c51b45b7 surface engineering hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg 0.405 GLO '2591' metals other metals
#> 9 76269c17-78d6-420b-991a-aa38c51b45b7 surface engineering hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg 0.884 GLO '2591' metals other metals
#> 10 76269c17-78d6-420b-991a-aa38c51b45b7 surface engineering hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg 0.675 GLO '2591' metals other metals
#> # ℹ 145 more rows
#> # A tibble: 3 × 1
#> activity_uuid_product_uuid
#> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb
#> 3 833caa78-30df-4374-900f-7f88ab44075b
# *uuid in companies that do NOT match *uuid in products
anti_join(companies, products) |>
  print() |>
  distinct(activity_uuid_product_uuid)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name, unit)`
#> # A tibble: 4 × 7
#> activity_uuid_product_uuid clustered companies_id country ei_activity_name main_activity unit
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 833caa78-30df-4374-900f-7f88ab44075b garden fittings weak_meadowlark netherlands market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> 2 833caa78-30df-4374-900f-7f88ab44075b garden fittings arrogant_ewe netherlands market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2
#> 3 833caa78-30df-4374-900f-7f88ab44075b tent pseudoeconomical_easternglasslizard germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2
#> 4 833caa78-30df-4374-900f-7f88ab44075b tent charterable_wren germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2
#> # A tibble: 1 × 1
#> activity_uuid_product_uuid
#> <chr>
#> 1 833caa78-30df-4374-900f-7f88ab44075b
# *uuid in products that do NOT match *uuid in companies
anti_join(products, companies) |>
print() |>
distinct(activity_uuid_product_uuid)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name, unit)`
#> # A tibble: 8 × 8
#> activity_uuid_product_uuid co2_footprint ei_activity_name ei_geography isic_4digit tilt_sector tilt_subsector unit
#> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 833caa78-30df-4374-900f-7f88ab44075b 14.1 iron-nickel-chromium alloy production RER '2410' metals iron & steel kg
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.419 market for deep drawing, steel, 10000 kN press, automode GLO '2591' metals other metals kg
#> 3 833caa78-30df-4374-900f-7f88ab44075b 9.47 iron-nickel-chromium alloy production RER '2410' metals iron & steel kg
#> 4 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb 0.648 market for deep drawing, steel, 10000 kN press, automode GLO '2591' metals other metals kg
#> 5 833caa78-30df-4374-900f-7f88ab44075b 13.6 iron-nickel-chromium alloy production RER '2410' metals iron & steel kg
#> 6 833caa78-30df-4374-900f-7f88ab44075b 14.7 iron-nickel-chromium alloy production RER '2410' metals iron & steel kg
#> 7 833caa78-30df-4374-900f-7f88ab44075b 0.390 market for deep drawing, steel, 10000 kN press, automode GLO '2591' metals other metals kg
#> 8 833caa78-30df-4374-900f-7f88ab44075b 12.7 iron-nickel-chromium alloy production RER '2410' metals iron & steel kg
#> # A tibble: 2 × 1
#> activity_uuid_product_uuid
#> <chr>
#> 1 833caa78-30df-4374-900f-7f88ab44075b
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb
companies <- read_csv(toy_emissions_profile_any_companies())
inputs <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
# *uuid in companies that match *uuid in inputs
left_join(companies, inputs, relationship = "many-to-many") |>
print() |>
distinct(activity_uuid_product_uuid)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name)`
#> # A tibble: 97 × 16
#> activity_uuid_product_uuid clustered companies_id country ei_activity_name main_activity unit ei_geography input_activity_uuid_product_uuid input_co2_footprint input_ei_activity_name input_isic_4digit input_reference_product_name input_tilt_sector input_tilt_subsector input_unit
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7 tent soot_asianpiedstarling germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 240. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 2 76269c17-78d6-420b-991a-aa38c51b45b7 table hire for parties frightening_chrysomelid spain market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 240. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 3 76269c17-78d6-420b-991a-aa38c51b45b7 surface finishing, galvanic hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg <NA> <NA> NA <NA> <NA> <NA> <NA> <NA> <NA>
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 surface engineering hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg <NA> <NA> NA <NA> <NA> <NA> <NA> <NA> <NA>
#> 5 76269c17-78d6-420b-991a-aa38c51b45b7 tent flexible_dolphin austria market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 240. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 6 76269c17-78d6-420b-991a-aa38c51b45b7 tent paramilitary_racerunner germany market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 240. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 7 76269c17-78d6-420b-991a-aa38c51b45b7 open space amenities level_meadowhawk france market for shed, large, wood, non-insulated, fire-unprotected wholesaler m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 240. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 8 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb tent heartrending_attwatersprairiechicken germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 463. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 9 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb tent heartrending_attwatersprairiechicken germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 451. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> 10 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb tent heartrending_attwatersprairiechicken germany market for shed, large, wood, non-insulated, fire-unprotected distributor m2 RoW bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df 447. shed construction, large, wood, non-insulated, fire-unprotected '4100' shed, large, wood, non-insulated, fire-unprotected construction construction residential m2
#> # ℹ 87 more rows
#> # A tibble: 3 × 1
#> activity_uuid_product_uuid
#> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb
#> 3 833caa78-30df-4374-900f-7f88ab44075b
# *uuid in companies that do NOT match *uuid in inputs
anti_join(companies, inputs) |>
print() |>
distinct(activity_uuid_product_uuid)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name)`
#> # A tibble: 4 × 7
#> activity_uuid_product_uuid clustered companies_id country ei_activity_name main_activity unit
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7 surface finishing, galvanic hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg
#> 2 76269c17-78d6-420b-991a-aa38c51b45b7 surface engineering hyperbrutal_flea germany market for deep drawing, steel, 10000 kN press, automode distributor kg
#> 3 76269c17-78d6-420b-991a-aa38c51b45b7 deep-drawn metal part humanoid_elkhound germany market for deep drawing, steel, 10000 kN press, automode agent/ representative kg
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 drawn parts humanoid_elkhound germany market for deep drawing, steel, 10000 kN press, automode agent/ representative kg
#> # A tibble: 1 × 1
#> activity_uuid_product_uuid
#> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7
# *uuid in inputs that do NOT match *uuid in companies
anti_join(inputs, companies) |>
print() |>
distinct(activity_uuid_product_uuid)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name)`
#> # A tibble: 85 × 11
#> activity_uuid_product_uuid ei_activity_name ei_geography input_activity_uuid_product_uuid input_co2_footprint input_ei_activity_name input_isic_4digit input_reference_product_name input_tilt_sector input_tilt_subsector input_unit
#> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb market for deep drawing, steel, 10000 kN press, automode RoW 55a5ac05-ab15-5a27-9d0e-6ecf840039f1_f10b8722-4be1-43d5-b17d-c51ad0e29d29 0.456 deep drawing, steel, 10000 kN press, automode '2591' deep drawing, steel, 10000 kN press, automode metals other metals kg
#> 2 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO bdc93cd8-00b4-5b3e-993e-b7fef7059e52_4e584f6f-2e71-4796-931e-bb9a273c161c 1.67 market for anode, for metal electrolysis '2790' anode, for metal electrolysis industry machinery & equipment kg
#> 3 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 95fcd1bb-4dc6-516a-a3b2-30a4f0530639_3b1d249a-c924-4d6c-8e1f-647f562daa54 0.530 market for electric arc furnace dust '3821' electric arc furnace dust industry other industry kg
#> 4 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER daef2f9a-4108-52ae-90a7-fe64abad51bc_6e74937e-b691-4c49-9b8f-5ba44d7c081d 0.589 market for electric arc furnace slag '3821' electric arc furnace slag industry other industry kg
#> 5 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER 3b190359-a32e-5294-af63-983f38ce6525_759b89bd-3aa6-42ad-b767-5bb9ef5d331d 0.602 market group for electricity, medium voltage '3510' electricity, medium voltage power total power kWh
#> 6 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production GLO 2c92cdcd-29df-53ba-a209-77c7de201d14_6e316c64-0481-4832-b097-296e14c0b02f 7.32 market for ferrochromium, high-carbon, 68% Cr '2410' ferrochromium, high-carbon, 68% Cr metals iron & steel kg
#> 7 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production Europe, without Russia and Turkey 9392c694-12a6-5cd7-a421-d4866359df2c_0d3eda5a-4601-4573-9549-0701c459ab88 0.710 market for hard coal '0510' hard coal energy coal energy kg
#> 8 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production CH c18c6cc9-4a26-5c47-9ea9-8635ff2c158e_240c1a3c-1aba-4528-afc3-3f27f56583be 0.0106 market for inert waste, for final disposal '3821' inert waste, for final disposal industry other industry kg
#> 9 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production RER c4ec0b1e-2a3b-5700-871c-2adbbb29bc1d_4f312355-ac65-4635-8fb2-006dba64ce60 0.0581 market for iron scrap, sorted, pressed '3830' iron scrap, sorted, pressed industry other industry kg
#> 10 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb iron-nickel-chromium alloy production CH 7361f7fb-5cf2-598c-823a-a4b7e50c3d28_a9007f10-7e39-4d50-8f4a-d6d03ce3d673 1.22 market for natural gas, high pressure '3520' natural gas, high pressure energy gas energy m3
#> # ℹ 75 more rows
#> # A tibble: 2 × 1
#> activity_uuid_product_uuid
#> <chr>
#> 1 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb
#> 2 833caa78-30df-4374-900f-7f88ab44075b
Created on 2024-01-09 with reprex v2.0.2
@Tilmon,
Resuming today's conversation about making sure that the toy data is public.
@kalashsinghal added a few columns that didn't exist in the old datasets (comparison between the columns in old versus new datasets). This means we'll publish some columns whose privacy we didn't discuss before. In particular, Kalash was worried about the *activity_name columns.
I look forward to documenting the details of licensed columns. But from what you said today I believe the toy datasets in this PR are OK (but see CAVEAT below) because among other features (see them all in the top comment) these datasets have:
- Fake activity_uuid_product_uuid.
- Random mapping between fake activity_uuid_product_uuid and other columns.
These features are implemented in the package tiltToyDataPrivate, via the function randimize_uuid() (source, application). If you explore its source code you'll see that it first replaces *uuid with fake values and then shuffles them (via sample()) to break the link between activity_uuid_product_uuid and other columns.
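As a rough illustration of the two steps described above (replace, then shuffle), here is a minimal base R sketch. It is not the source of randimize_uuid(); the names fake_uuid() and randomize_uuid_sketch() and the toy values are hypothetical and exist only for this example.

```r
# Hypothetical helper: build n random strings shaped like a UUID.
fake_uuid <- function(n) {
  hex <- c(0:9, letters[1:6])
  chunk <- function(size) paste(sample(hex, size, replace = TRUE), collapse = "")
  vapply(
    seq_len(n),
    function(i) paste(chunk(8), chunk(4), chunk(4), chunk(4), chunk(12), sep = "-"),
    character(1)
  )
}

# Hypothetical sketch of the approach, not the real implementation.
randomize_uuid_sketch <- function(data, col = "activity_uuid_product_uuid") {
  values <- as.character(data[[col]])
  real <- unique(values)
  # Step 1: replace each real value with a fake one.
  lookup <- setNames(fake_uuid(length(real)), real)
  faked <- unname(lookup[values])
  # Step 2: shuffle the fake values across rows to break the link
  # between this column and the other columns.
  data[[col]] <- sample(faked)
  data
}

set.seed(1)
toy <- data.frame(
  activity_uuid_product_uuid = c("real-a", "real-a", "real-b", "real-c"),
  ei_activity_name = c("activity a", "activity a", "activity b", "activity c")
)
out <- randomize_uuid_sketch(toy)
out
```

The other columns keep their row order; only the *uuid column is replaced and permuted, so the number of rows per fake value is preserved while its pairing with ei_activity_name is not.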
The following reprex shows the result. Focus on the relationship between the columns clustered and ei_activity_name. That relationship doesn't always make sense, which is evidence that the link is broken.
library(readr, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
devtools::load_all()
#> ℹ Loading tiltToyData
options(readr.show_col_types = FALSE, width = 1000)
companies <- read_csv(toy_emissions_profile_any_companies())
products <- read_csv(toy_emissions_profile_products_ecoinvent())
left_join(companies, products, relationship = "many-to-many") |>
select(matches(c("uuid", "activity_name")), clustered)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name, unit)`
#> # A tibble: 155 × 3
#> activity_uuid_product_uuid ei_activity_name clustered
#> <chr> <chr> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7 market for shed, large, wood, non-insulated, fire-unprotected tent
#> 2 76269c17-78d6-420b-991a-aa38c51b45b7 market for shed, large, wood, non-insulated, fire-unprotected tent
#> 3 76269c17-78d6-420b-991a-aa38c51b45b7 market for shed, large, wood, non-insulated, fire-unprotected table hire for parties
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 market for shed, large, wood, non-insulated, fire-unprotected table hire for parties
#> 5 76269c17-78d6-420b-991a-aa38c51b45b7 market for deep drawing, steel, 10000 kN press, automode surface finishing, galvanic
#> 6 76269c17-78d6-420b-991a-aa38c51b45b7 market for deep drawing, steel, 10000 kN press, automode surface finishing, galvanic
#> 7 76269c17-78d6-420b-991a-aa38c51b45b7 market for deep drawing, steel, 10000 kN press, automode surface finishing, galvanic
#> 8 76269c17-78d6-420b-991a-aa38c51b45b7 market for deep drawing, steel, 10000 kN press, automode surface engineering
#> 9 76269c17-78d6-420b-991a-aa38c51b45b7 market for deep drawing, steel, 10000 kN press, automode surface engineering
#> 10 76269c17-78d6-420b-991a-aa38c51b45b7 market for deep drawing, steel, 10000 kN press, automode surface engineering
#> # ℹ 145 more rows
inputs <- read_csv(toy_emissions_profile_upstream_products_ecoinvent())
left_join(companies, inputs, relationship = "many-to-many") |>
select(matches(c("uuid", "activity_name")), clustered)
#> Joining with `by = join_by(activity_uuid_product_uuid, ei_activity_name)`
#> # A tibble: 97 × 5
#> activity_uuid_product_uuid input_activity_uuid_product_uuid ei_activity_name input_ei_activity_name clustered
#> <chr> <chr> <chr> <chr> <chr>
#> 1 76269c17-78d6-420b-991a-aa38c51b45b7 bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected tent
#> 2 76269c17-78d6-420b-991a-aa38c51b45b7 bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected table hire for parties
#> 3 76269c17-78d6-420b-991a-aa38c51b45b7 <NA> market for deep drawing, steel, 10000 kN press, automode <NA> surface finishing, galvanic
#> 4 76269c17-78d6-420b-991a-aa38c51b45b7 <NA> market for deep drawing, steel, 10000 kN press, automode <NA> surface engineering
#> 5 76269c17-78d6-420b-991a-aa38c51b45b7 bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected tent
#> 6 76269c17-78d6-420b-991a-aa38c51b45b7 bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected tent
#> 7 76269c17-78d6-420b-991a-aa38c51b45b7 bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected open space amenities
#> 8 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected tent
#> 9 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected tent
#> 10 bf94b5a7-b7a2-46d1-bb95-84bc560b12fb bc548877-9cc6-590d-ba72-1d1d2daeb5b9_e2ccc500-255f-448c-8c88-ed25177993df market for shed, large, wood, non-insulated, fire-unprotected shed construction, large, wood, non-insulated, fire-unprotected tent
#> # ℹ 87 more rows
The caveat is that the toy data is very small, so the pool of *uuid that we shuffled is small too. This means it's not hard to rebuild the broken link using just common sense.
If this worries you, we can discuss how to continue before we merge this PR. A quick alternative for now might be to remove the additional columns if @kalashsinghal thinks we can live without them for now (I don't know how exactly they will be used).
@maurolepore Anne needed these additional columns in the past for her own analysis on the tiltIndicatorBefore outputs. It's not feasible to use any untraceable code to add these columns and then provide the output to Anne's analysis in the future. Hence, I highly recommend having these columns in the final output of tiltIndicatorBefore.
Also, @maurolepore, you can use any fake data to add these columns in the toy data, irrespective of how these additional columns are linked to the licensed data. I am asking this because these additional columns are not used in tiltIndicator and tiltIndicatorAfter in any way, and an external user doesn't need to see the real values of such columns. Right, @Tilmon? Please confirm this! Thanks! :)
@maurolepore - thanks a lot for already taking the initiative to check whether some uuids are in one dataset but not in the other. Before I saw your comment here, I created this ticket with a reprex to show you why this requirement is so important. Maybe it also helps you. You can close it if it is all clear.
For your questions to @Tilmon - wasn't this the reason why we now have the jitter function as well for the co2 data? However, yes, it most likely does not cover all licensed data. Good to make a proper check here before publishing.
@maurolepore Anne needed these additional columns in the past for her own analysis on the tiltIndicatorBefore outputs. ... these additional columns are not used in tiltIndicator and tiltIndicatorAfter in any way
Thanks @kalashsinghal. The usage of those columns seems internal -- equivalent to a developer-oriented internal function that users don't need to know about.
@AnneSchoenauer to refresh your memory, this PR proposes to introduce the following new columns in public toy datasets (for details see also Comparing columns between old and new datasets above):
# emissions_profile_any_companies
#> [1] "country" "ei_activity_name" "main_activity"
# emissions_profile_products_ecoinvent
#> [1] "ei_geography"
# emissions_profile_upstream_products_ecoinvent
#> [1] "ei_activity_name" "ei_geography"
#> [3] "input_ei_activity_name" "input_reference_product_name"
Do you need these columns to appear in the public toy datasets that we use in all our websites to show examples to our users?
From what Kalash says they are not useful for most users of our packages. We can keep them and ensure they don't expose private data, but it's always best to release as little as possible. Adding things later is easy. Removing them later is hard: we need to go through an expensive deprecation process to ensure backward compatibility.
BTW, thanks for that ticket. Sorry I missed it. I'll continue that conversation there.
@Tilmon what do you think? These data points are not needed for the tiltIndicator but are needed for the output files. So they are needed to produce the outputs from tiltIndicatorAfter. I hear @maurolepore though that it is expensive to have them in the toy dataset. @Tilmon what do you think is better from a transparency and usability perspective?
@AnneSchoenauer
These data points ... are needed for the output files
Okay, then this suggests a more public usage than what I understand from Kalash's comments. If users expect those columns in the output then they seem to deserve a place in the toy datasets. And tiltIndicatorAfter should have a test to ensure these columns exist in the output (cc' @kalashsinghal).
Assuming the columns stay, then the last question we need to resolve is this:
See https://github.com/2DegreesInvesting/tiltToyData/pull/19#issuecomment-1883788387
I look forward to your answers so we can merge this PR ASAP and close the many related issues.
Great!
@Tilmon could you please answer this here? I think you are the one most involved with the license issues in ecoinvent.
Thanks!!
These data points are not needed for the tiltIndicator but are needed for the output files. So they are needed to produce the outputs from tiltIndicatorAfter.
@AnneSchoenauer Based on your comment here, should I ensure that all these extra columns from tiltIndicatorBefore are also present in the final output from tiltIndicatorAfter? If yes, then I will create a separate ticket for it in the tiltIndicatorAfter package. FYI: At the moment not all of those extra columns are present in the final output.
@kalashsinghal yes sounds good!
Hi @AnneSchoenauer @kalashsinghal @maurolepore sorry for the late response and thanks for already clarifying almost everything.
Re
I look forward to documenting the details of licensed columns. But from what you said today I believe the toy datasets in this PR are OK (but see CAVEAT below) because among other features (see them all in the top comment) these datasets have:
- Fake activity_uuid_product_uuid.
- Random mapping between fake activity_uuid_product_uuid and other columns.
That's fine! It's important that we don't share the real co2 data, which we don't, because it's jittered. Also it's important to not share which input_activity_uuid_product_uuid belong to which activity_uuid_product_uuid, which is not the case because:
Note for future reference (will also send this in an email as discussed to the whole team): the activity_uuid_product_uuid is defined by the combination activity_name x reference_product x geography x main_activity. Meaning that if you fake the activity_uuid_product_uuid but show all other four columns users can create the activity_uuid_product_uuid by themselves.
In the toy data that you referenced, not all 4 columns aside from the activity_uuid_product_uuid are shown, so all is fine.
@maurolepore you pointed out that the toy data is small and one could try to re-shuffle it to get licensed information but they won't be able to define the exact activity_uuid_product_uuid as too many of the other 4 variables are missing.
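The rule above (a dataset is a concern only when it shows all four defining columns together) could be checked mechanically before publishing. The following base R sketch is only an illustration: the helper names and the exact column names in key_columns are assumptions, since the thread names the four concepts but not how each dataset spells them.

```r
# Columns that jointly define an activity_uuid_product_uuid.
# The exact spellings are an assumption made for this example.
key_columns <- c(
  "ei_activity_name", "reference_product_name", "ei_geography", "main_activity"
)

# Hypothetical helper: which defining columns does the dataset NOT show?
missing_key_columns <- function(data, key = key_columns) {
  setdiff(key, names(data))
}

# Hypothetical helper: TRUE only if all defining columns are shown.
is_uuid_recoverable <- function(data, key = key_columns) {
  length(missing_key_columns(data, key)) == 0
}

companies_like <- data.frame(
  activity_uuid_product_uuid = "fake-1",
  country = "germany",
  ei_activity_name = "market for shed",
  main_activity = "distributor"
)

is_uuid_recoverable(companies_like)
#> [1] FALSE
missing_key_columns(companies_like)
#> [1] "reference_product_name" "ei_geography"
```

Running the same check on the final output of a workflow, not only on each toy dataset, would catch the case where a join brings all four columns back together.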
Thanks for your efforts to create the toydata in a compliant way with our license agreement @maurolepore and @kalashsinghal !
@Tilmon thanks a lot. Very clear :) You are right, the information is not shown in the toy dataset. But in our output (so in the very end) people will get results with the toy dataset (using tiltWorkflows) which will contain all four data points that you mentioned, right?
@maurolepore do you see this as a problem?
@AnneSchoenauer, I think the best person to answer is @Tilmon. But if the final output will have all the columns and the shuffling doesn't break licensed links, then it seems like a problem. If so let me know which columns we could replace with fake values.
Yes I agree - @Tilmon we can also discuss later but I think it will be a problem indeed.
@AnneSchoenauer good point. I checked the tiltIndicator and tiltIndicatorAfter outputs and what I see is that in
activity_uuid_product_uuid and input_activity_uuid_product_uuid. This information should not be available to users without a license. Aside from the input_activity_uuid_product_uuid, there is no input-related variable to identify the unique input. How can we avoid showing both activity_uuid_product_uuid and input_activity_uuid_product_uuid here? @maurolepore could we create an internal ID mapping, where we create our own ID for each input_activity_uuid_product_uuid and just show our own ID instead of ecoinvent's input_activity_uuid_product_uuid? For users with a license, we could share the mapper of our own IDs with input_activity_uuid_product_uuid.
input_activity_uuid_product_uuid, but only the input-related variables input_name, input_unit, input_tilt_sector, input_tilt_subsector, input_isic_4digit, input_isic_4digit_name, which is not enough info to define a unique input_activity_uuid_product_uuid.
What do you think @AnneSchoenauer @maurolepore? Does it make sense to create fake IDs for the users and keep the real IDs for ourselves?
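To make the proposal concrete, here is a minimal base R sketch of such an internal ID mapping. The function names, the input_id column, and the tilt_input_ prefix are hypothetical; the thread does not specify any of them.

```r
# Hypothetical helper: one internal ID per distinct real uuid.
# The resulting mapper is the private table shared only with licensed users.
make_id_mapper <- function(uuid, prefix = "tilt_input_") {
  real <- unique(as.character(uuid))
  data.frame(
    input_activity_uuid_product_uuid = real,
    input_id = sprintf("%s%04d", prefix, seq_along(real)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: swap the real uuid column for the internal ID.
mask_input_uuid <- function(data, mapper) {
  idx <- match(
    as.character(data$input_activity_uuid_product_uuid),
    mapper$input_activity_uuid_product_uuid
  )
  data$input_id <- mapper$input_id[idx]
  data$input_activity_uuid_product_uuid <- NULL
  data
}

inputs <- data.frame(
  input_activity_uuid_product_uuid = c("u1", "u2", "u1"),
  input_unit = c("kg", "m2", "kg")
)
mapper <- make_id_mapper(inputs$input_activity_uuid_product_uuid)
public <- mask_input_uuid(inputs, mapper)
public
```

Rows that shared a real uuid still share an internal ID, so inputs stay distinguishable in the public output, while the link back to ecoinvent lives only in mapper.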
Yes it does! @maurolepore do you think you could do this?
We are getting there!!!
@Tilmon I would like to add that the columns ei_activity_name and ei_input_geography will also be added to the final output of the emission profile upstream indicator at product level. Can they cause any issue?
Dear all,
Although it seems like this conversation still needs to continue to make our toy datasets perfect, I go with the saying "perfect is the enemy of the good". This PR already represents a significant improvement and if anything private is exposed, it is less than before.
The condition to merge a PR is that (1) it doesn't make things worse and (2) it does make things better. So I went ahead and merged this PR. This allows us to move on with the very many related issues that depend on this one.
In short, as the top comment says, the main features of this PR are these:
- Fake activity_uuid_product_uuid.
- Random mapping between fake activity_uuid_product_uuid and other columns.
- *co2_footprint, jittered to the right by 50%-100% on average.
We can now start from good and move to perfect here: #24
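The jitter function itself is not shown in this thread. As one possible reading of "jittered to the right by 50%-100%", the sketch below inflates each value by a random factor between 1.5 and 2; the function name jitter_right() and its parameters are assumptions for illustration only.

```r
# Hypothetical sketch: increase each value by a random 50%-100%,
# so the published value is always larger than the real one.
jitter_right <- function(x, min_increase = 0.5, max_increase = 1) {
  x * (1 + stats::runif(length(x), min = min_increase, max = max_increase))
}

set.seed(1)
real <- c(0.405, 14.1, 321)
fake <- jitter_right(real)

# Each fake value lies between 1.5 and 2 times the real value.
fake / real
```

Because every value is inflated by an independent factor, the exact real footprint cannot be read off the toy data, although the order of magnitude is preserved.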
Closes #7 Closes #20 Closes 2DegreesInvesting/tiltIndicator#566 Closes #22 Relates to 2DegreesInvesting/tiltToyDataPrivate#1
emissions*products_ecoinvent
emissions_profile_companies_any_companies
profile_ranking
given that we currently calculate it internally in tiltIndicator. See https://github.com/2DegreesInvesting/tiltIndicator/pull/644/files#r1442397853
Features
- companies_id.
- Fake activity_uuid_product_uuid.
- Random mapping between fake activity_uuid_product_uuid and other columns.
- *co2_footprint, jittered to the right by 50%-100% on average.
reprex
Access deprecated data. It may still be necessary for a while. For example, tiltIndicatorAfter may need to update other toy datasets before they can match the new datasets.
New datasets
Created on 2024-01-05 with reprex v2.0.2
TODO
EXCEPTIONS