2DegreesInvesting / tiltIndicator

Implement the core business logic of the tilt indicators
https://2degreesinvesting.github.io/tiltIndicator/
GNU General Public License v3.0

Create script in TiltIndicator package that does quick validation on input formats #519

Closed: bobresink closed this issue 1 year ago

maurolepore commented 1 year ago

Here are some examples of how we currently handle some kinds of errors:

library(tibble)
library(dplyr, warn.conflicts = FALSE)
library(readr, warn.conflicts = FALSE)
library(tiltIndicator)
library(tiltToyData)
packageVersion("tiltIndicator")
#> [1] '0.0.0.9094'

options(readr.show_col_types = FALSE)

companies <- read_csv(toy_emissions_profile_any_companies())
products <- read_csv(toy_emissions_profile_products())

If any dataset lacks crucial columns, you get an error:

companies_lacks_column <- companies |> select(-company_id)
try(
  companies_lacks_column |> emissions_profile(products)
)
#> Error in map(.x, .f, ..., .progress = .progress) : ℹ In index: 1.
#> Caused by error in `check_matches_name()`:
#> ! The data lacks a column matching the pattern 'company_id'.
#> ℹ Are you using the correct data?

products_lacks_column <- products |> select(-ends_with("uuid"))
try(
  companies |> emissions_profile(products_lacks_column)
)
#> Error in left_join(companies, data, by = aka("uid"), relationship = "many-to-many") : 
#>   Join columns in `y` must be present in the data.
#> ✖ Problem with `activity_uuid_product_uuid`.

If any dataset uses the reserved column name `rowid`, you get an error:

# "anything_rowid" is valid but "rowid" is not
# Good
companies_with_rowid <- companies |> tibble::rowid_to_column("companies_rowid")
companies_with_rowid |> emissions_profile(products)
#> # A tibble: 8 × 3
#>   companies_id                             product           company          
#>   <chr>                                    <list>            <list>           
#> 1 fleischerei-stiefsohn_00000005219477-001 <tibble [12 × 6]> <tibble [18 × 3]>
#> 2 pecheries-basques_fra316541-00101        <tibble [6 × 6]>  <tibble [18 × 3]>
#> 3 hoche-butter-gmbh_deu422723-693847001    <tibble [6 × 6]>  <tibble [18 × 3]>
#> 4 vicquelin-espaces-verts_fra697272-00101  <tibble [6 × 6]>  <tibble [18 × 3]>
#> 5 bst-procontrol-gmbh_00000005104947-001   <tibble [6 × 6]>  <tibble [18 × 3]>
#> 6 leider-gmbh_00000005064318-001           <tibble [6 × 6]>  <tibble [18 × 3]>
#> 7 cheries-baqu_neu316541-00101             <tibble [6 × 6]>  <tibble [18 × 3]>
#> 8 ca-coity-trg-aua-gmbh_00000384-001       <tibble [1 × 6]>  <tibble [3 × 3]>
# Bad
companies_has_reserved_rowid <- companies |> tibble::rowid_to_column("rowid")
try(
  companies_has_reserved_rowid |> emissions_profile(products)
)
#> Error in abort_reserved_name(reserved, hint = hint_specify_rowid()) : 
#>   The name `rowid` is reserved.
#> ℹ Do you need to create a table-specific name, e.g. `companies_rowid`?

Created on 2023-09-12 with reprex v2.0.2
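
For a quick check of input formats before calling `emissions_profile()`, something like the sketch below might work. It is only an illustration in base R: `check_input_format()` and its arguments are hypothetical and not part of tiltIndicator, and the list of required columns passed in the example is an assumption based on the two column names that appear in the errors above.

```r
# Hypothetical helper, NOT part of tiltIndicator: collects format problems
# instead of stopping at the first one.
check_input_format <- function(data, required, reserved = "rowid") {
  stopifnot(is.data.frame(data), is.character(required))

  problems <- character()

  # Report every required column that is absent.
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste0("Missing column: `", missing_cols, "`."))
  }

  # Flag column names that are reserved, such as `rowid` in the reprex above.
  reserved_cols <- intersect(reserved, names(data))
  if (length(reserved_cols) > 0) {
    problems <- c(problems, paste0("Reserved column name: `", reserved_cols, "`."))
  }

  problems
}

# Toy data frame standing in for a real `companies` dataset (assumed shape).
toy_companies <- data.frame(company_id = "a", rowid = 1L)

# The required columns here are illustrative, not the package's actual contract.
found <- check_input_format(
  toy_companies,
  required = c("company_id", "activity_uuid_product_uuid")
)
print(found)
```

An empty character vector would mean no problems were found. Returning the messages rather than throwing lets a script report all problems in one pass, but that is a design choice for this sketch, not how the package itself behaves.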

maurolepore commented 1 year ago

Oops, one of the errors is not the one I expected (https://github.com/2DegreesInvesting/tiltIndicator/issues/521)

maurolepore commented 1 year ago

I believe we've done this already. But feel free to reopen if the checks you need are not implemented.