Open krlmlr opened 3 years ago
We also need dm_update_fk()
when we allow foreign keys to non-primary keys.
We might have something like this to start with :
dm <-
dm(players =
data.frame(player = c("John", "Tom", "Max")),
matches =
data.frame(
player1 = c("John", "Tom"),
player2 = c("Tom", "Max")
),
events =
data.frame(
event = c("Brussels", "Brussels", "Zurich"),
event_player = c("John", "Tom", "John")
)) %>%
dm_add_pk(players, player) %>%
dm_add_fk(matches, player1, players) %>%
dm_add_fk(matches, player2, players) %>%
dm_add_fk(events, event_player, players)
dm_draw(dm, view_type = "all")
Created on 2021-09-22 by the reprex package (v2.0.1)
And say we wish to end up with the following :
dm_out <-
dm(players =
data.frame(
id = 1:3,
player = c("John", "Tom", "Max")),
matches =
data.frame(
player1 = c("John", "Tom"),
player2 = c("Tom", "Max"),
player1_id = 1:2,
player2_id = 2:3
),
events =
data.frame(
event = c("Brussels", "Brussels", "Zurich"),
event_player = c("John", "Tom", "John"),
player_id = c(1,2,3)
)) %>%
dm_add_pk(players, id) %>%
dm_add_fk(matches, player1_id, players) %>%
dm_add_fk(matches, player2_id, players) %>%
dm_add_fk(events, player_id, players)
dm_draw(dm_out, view_type = "all")
Created on 2021-09-22 by the reprex package (v2.0.1)
Manual renaming might work like this :
# dm_update_pk(dm, tbl, ..., col_name = "id", rm_old_cols = c("none", "all", "fk_only"))
dm_update_pk(
dm, players,
players = c(player1_id = "player1", player2_id = "player2"),
events = c(event_player = "player_id")
It might be semi manual, so if there is only one fk in the table the new fk in foreign tables will be named based on the name of the primary table.
# dm_update_pk(dm, tbl, ..., col_name = "id", default_pk_name = NULL, rm_old_cols = c("none", "all", "fk_only"))
# by default `default_pk_name` might be be `PrimaryTableName_id`
dm_update_pk(
dm, players,
players = c(player1_id = "player1", player2_id = "player2"))
It might be simpler to provide a suffix to be applied on existing table names though, we'd check that it works everywhere, fail if it doesn't. The user would be free to rename whatever doesn't suit (either before applying dm_update_pk
, so it doesn't fail, or after applying it, to tweak the given names).
It might work like this :
# dm_update_pk(dm, tbl, col_name = "id", suffix = "_id", rm_old_cols = c("none", "all", "fk_only"))
dm_update_pk(dm, players)
In this case, events.event_player
will be renamed to event_player_id
, so we might want to rename it manually to player_id
but all the rest worked automatically.
For the column names, can we use a glue pattern similar to e.g. the .name_spec
argument to vctrs::vec_c()
? We'd fail if the pattern leads to duplicate column names. We could also provide placeholders that expand to the child column name only if there's more than one foreign key in a table.
How do we deal with compound keys?
The macro should not be responsible for creating the new id
column in the parent table, it must already exist.
Logic: define a new primary key column that trickles down to the foreign tables, optionally remove existing (primary and foreign) key columns.
Name:
dm_update_pk()
?Building block of #502.