cynkra / dm

Working with relational data models in R
https://dm.cynkra.com
Other
501 stars 49 forks source link

decompose_table() with user-specified keys #73

Open krlmlr opened 5 years ago

krlmlr commented 5 years ago

Should work when the user provides a custom key that is unique. Currently:

library(dplyr)
library(dm)

nycflights13::planes %>% 
  add_count(model, engines, name = "n_eng") %>% 
  add_count(model, seats, name = "n_seats") %>% 
  add_count(model, manufacturer, name = "n_manufacturer") %>% 
  mutate(model_id = paste0(
    model,
    if_else(n_eng > 1, paste0("-", engines), ""),
    if_else(n_seats > 1, paste0("-", seats), ""),
    if_else(n_manufacturer > 1, paste0("-", abbreviate(manufacturer)), "")
  )) %>% 
  decompose_table(model_id, model, manufacturer, type, engines, seats, manufacturer)
#> `new_id_column` can not have an identical name as one of the columns of
#> `.`.

Created on 2019-09-24 by the reprex package (v0.3.0)

TSchiefer commented 5 years ago

problem is that you already have a column called model_id (new_id_column is 2nd argument to decompose_table())

TSchiefer commented 2 years ago

I still don't understand this issue. It works, if you just choose a different name for the argument new_id_column:

suppressPackageStartupMessages({
  library(dplyr)
  library(dm)
})

nycflights13::planes %>% 
  add_count(model, engines, name = "n_eng") %>% 
  add_count(model, seats, name = "n_seats") %>% 
  add_count(model, manufacturer, name = "n_manufacturer") %>% 
  mutate(model_id = paste0(
    model,
    if_else(n_eng > 1, paste0("-", engines), ""),
    if_else(n_seats > 1, paste0("-", seats), ""),
    if_else(n_manufacturer > 1, paste0("-", abbreviate(manufacturer)), "")
  )) %>% 
  decompose_table(model_id_new, model, manufacturer, type, engines, seats, manufacturer)
#> $child_table
#> # A tibble: 3,322 × 9
#>    tailnum  year speed engine n_eng n_seats n_manufacturer model_id model_id_new
#>    <chr>   <int> <int> <chr>  <int>   <int>          <int> <chr>           <int>
#>  1 N10156   2004    NA Turbo…   104     104            104 EMB-145…          120
#>  2 N102UW   1998    NA Turbo…    82      82             21 A320-21…           93
#>  3 N103US   1999    NA Turbo…    82      82             21 A320-21…           93
#>  4 N104UW   1999    NA Turbo…    82      82             21 A320-21…           93
#>  5 N10575   2002    NA Turbo…   114     114            114 EMB-145…          119
#>  6 N105UW   1999    NA Turbo…    82      82             21 A320-21…           93
#>  7 N107US   1999    NA Turbo…    82      82             21 A320-21…           93
#>  8 N108UW   1999    NA Turbo…    82      82             21 A320-21…           93
#>  9 N109UW   1999    NA Turbo…    82      82             21 A320-21…           93
#> 10 N110UW   1999    NA Turbo…    82      82             21 A320-21…           93
#> # … with 3,312 more rows
#> 
#> $parent_table
#> # A tibble: 147 × 6
#>    model_id_new model       manufacturer     type                  engines seats
#>           <int> <chr>       <chr>            <chr>                   <int> <int>
#>  1          120 EMB-145XR   EMBRAER          Fixed wing multi eng…       2    55
#>  2           93 A320-214    AIRBUS INDUSTRIE Fixed wing multi eng…       2   182
#>  3          119 EMB-145LR   EMBRAER          Fixed wing multi eng…       2    55
#>  4           39 737-824     BOEING           Fixed wing multi eng…       2   149
#>  5           68 767-332     BOEING           Fixed wing multi eng…       2   330
#>  6           52 757-224     BOEING           Fixed wing multi eng…       2   178
#>  7           94 A320-214    AIRBUS           Fixed wing multi eng…       2   182
#>  8          112 CL-600-2D24 BOMBARDIER INC   Fixed wing multi eng…       2    95
#>  9           30 737-724     BOEING           Fixed wing multi eng…       2   149
#> 10           27 737-524     BOEING           Fixed wing multi eng…       2   149
#> # … with 137 more rows

Created on 2022-04-05 by the reprex package (v2.0.1.9000)

krlmlr commented 2 years ago

This is about reusing an existing natural key, currently decompose_table() always creates a surrogate key. Can we decompose (sorry...)?

TSchiefer commented 2 years ago

Of course, that makes sense, thanks!