bergant / datamodelr

Data model diagrams in R
Other
285 stars 28 forks source link

rendering more complex foreign keys #5

Open jessekps opened 6 years ago

jessekps commented 6 years ago

Hi, thanks for the package, it has already been quite helpfull.

I have an sqlite database which I'm trying to model ( see code below ) but I'm having two small issues with more complex foreign keys.

  1. If one column in a table is part of two or more foreign keys, it gets multiplied in the graph (at least in my code). I solve that rather hackishly below but I wonder if there is a better way or if I'm doing something wrong in the first place.
  2. For composite foreign keys, I get multiple parallel arrows. I'd rather have just one arrow. I have not been able to solve that one so any tips are appreciated.
library(dplyr)
library(tibble)
library(RSQLite)

db = dbConnect(SQLite(), dbpath)

inf = lapply(dbListTables(db), function(tb)
{
  fk = dbGetQuery(db, paste0("pragma foreign_key_list('", tb, "')")) %>%
    mutate(from = as.character(from))

  dbGetQuery(db, paste0("pragma table_info('", tb, "')")) %>%
    left_join(fk, by = c(name = 'from')) %>%
    mutate(key = as.integer(pk>0)) %>%
    select(column = 'name', key, ref = 'table', ref_col = 'to', mandatory = 'notnull', 
              type, column_order='cid') %>%
    add_column(table=tb, .before=1)
}) %>%
  bind_rows() 

dm = as.data_model(inf)

# same column referencing different tables multiplies the columns
# correct by changing the dm object
dm$columns = dm$columns %>%
  distinct(table, column, .keep_all=TRUE)

graph <- dm_create_graph(dm, rankdir = "BT", col_attr = c("column", "type"))
dm_render_graph(graph)

kind regards, Jesse

bergant commented 6 years ago

I have to check if there is a better way about not displaying duplicating columns in display (first issue). For the second issue (composite foreign keys), it should work as expected if you are using sequence numbers in the key column. For example:

library(datamodelr)

meta = read.table(header = TRUE, stringsAsFactors = FALSE, text = "
table        column          key   ref
invoice_line invoice_id      1     NA
invoice_line line_id         2     NA
invoice_line line            NA    NA
invoice_line amount          NA    NA
material     id              1     NA
material     invoice_id      NA    invoice_line
material     line_id         NA    invoice_line
")

dm = as.data_model(meta)
dm_create_graph(dm)

image

You can check also implied references (ref_id should be same for both columns):

dm$references

       table     column          ref    ref_col ref_id ref_col_num
> 6 material invoice_id invoice_line invoice_id      1           1
> 7 material    line_id invoice_line invoice_id      1           2
jessekps commented 6 years ago

Thanks, that put me on the right track. It also turned out that some of my arrows were doubled because they were pointing to the wrong table. I've tried to make a minimal example of that. The following code solves both the composite key arrow problem in my sqlite case (which is a little hard to get into the minimal example) and the arrows pointing to the wrong table.

meta = read.table(header = TRUE, stringsAsFactors = FALSE, text = "
table           column          key   ref
administrations person_id       1     NA
administrations booklet_id      1     NA
booklet_design  booklet_id      1     NA
booklet_design  item_id         1     NA
booklet_design  item_position   0     NA
responses       person_id       1     administrations
responses       booklet_id      1     booklet_design
responses       booklet_id      1     administrations
responses       item_id         1     scoring_rules
responses       item_id         1     booklet_design
responses       response        0     scoring_rules
scoring_rules   item_id         1     NA
scoring_rules   response        1     NA
scoring_rules   item_score      0     NA
")

dm = as.data_model(meta)
dm$columns = dm$columns %>% 
  distinct(table, column, .keep_all=TRUE)

# two arrows point to administrations and none to scoring rules
dm_create_graph(dm)

# this solves all problems
dm$references = dm$references %>%
  group_by(table) %>%
  mutate(ref_id = dense_rank(ref)) %>%
  ungroup() %>%
  group_by(table,ref_id) %>%
  mutate(ref_col_num = row_number()) %>%
  ungroup()

dm_create_graph(dm)

Interestingly, rearranging the input data solves the pointing to the wrong table problem as well, though not the general composite foreign key issue I had.

meta2 = meta %>% arrange(table, ref)

dm2 = as.data_model(meta2)
dm2$columns = dm2$columns %>% 
    distinct(table, column, .keep_all=TRUE)
dm_create_graph(dm2)

So this solves all of my problems at the moment. I hope you can incorporate it more directly. If you could also support sqlite in your package that would be a nice feature by the way.

adamrobinson361 commented 6 years ago

Similar problem from me and the following solved it as per @jessekps solution.

dm2$columns = dm2$columns %>% 
    distinct(table, column, .keep_all=TRUE)

Thanks both! Would be good to see a fix and potentially more symbols for tables with multiple foreign keys.

Thanks again for thispackage!