christophergandrud / networkD3

D3 JavaScript Network Graphs from R
http://christophergandrud.github.io/networkD3

as_treenetdf.data.frame should handle implicit root nodes better #225

Open cjyetman opened 6 years ago

cjyetman commented 6 years ago

Ideally, data passed to as_treenetdf.data.frame should include information about every node, including the root node, for example:

library(networkD3)
data <- read.csv(header = TRUE, text = "
nodeId,parentId
root,
L2-1,root
L2-2,root
L3-1,L2-1
L3-2,L2-1
L3-3,L2-2
L3-4,L2-2
")
as_treenetdf(data)

In some cases, a user may expect the root node to be inferred from the data. There are several situations in which this could reasonably be expected:

  1. the root node is listed as the parentId of multiple child nodes, but is not itself listed as a nodeId

    library(networkD3)
    data <- read.csv(header = TRUE, text = "
    nodeId,parentId
    L2-1,root
    L2-2,root
    L3-1,L2-1
    L3-2,L2-1
    L3-3,L2-2
    L3-4,L2-2
    ")
    as_treenetdf(data)

  2. multiple nodes have no parentId explicitly named, and should therefore be treated as the children of an implicit root node

    library(networkD3)
    data <- read.csv(header = TRUE, text = "
    nodeId,parentId
    L2-1,
    L2-2,
    L3-1,L2-1
    L3-2,L2-1
    L3-3,L2-2
    L3-4,L2-2
    ")
    as_treenetdf(data)

    or

    library(networkD3)
    data <- read.csv(header = TRUE, text = "
    nodeId,parentId
    L2-1,NA
    L2-2,NA
    L3-1,L2-1
    L3-2,L2-1
    L3-3,L2-2
    L3-4,L2-2
    ")
    as_treenetdf(data)

It could be that the data represent two separate branches and do not share a root node, but since the plotting function does not support plotting two separate branches, maybe it's safe to assume there's a shared root node?
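
One way the shared-root assumption could be applied before conversion is sketched below in base R. This is only an illustration: `add_implicit_root()` and its `root_name` argument are hypothetical names, not part of networkD3, and the sketch assumes the columns are called `nodeId` and `parentId` as in the examples above. It adds a row for any parent that is never listed as a node, and if more than one node is then left without a parent, it attaches all of them to a newly created root.

```r
# Hypothetical helper (not part of networkD3): make an implicit root explicit.
# Assumes columns named nodeId and parentId, as in the examples in this issue.
add_implicit_root <- function(df, root_name = "root") {
  node   <- as.character(df$nodeId)
  parent <- as.character(df$parentId)

  # Treat empty or whitespace-only parent IDs the same as NA
  parent[!is.na(parent) & !nzchar(trimws(parent))] <- NA

  # Case 1: parents that are referenced but never listed as a nodeId
  unlisted <- setdiff(unique(parent[!is.na(parent)]), node)
  if (length(unlisted) > 0) {
    node   <- c(node, unlisted)
    parent <- c(parent, rep(NA, length(unlisted)))
  }

  # Case 2: several parentless nodes, so assume they share one new root
  top <- which(is.na(parent))
  if (length(top) > 1) {
    # Pick a root name that does not collide with an existing nodeId
    root_name <- make.unique(c(node, root_name))[length(node) + 1]
    parent[top] <- root_name
    node   <- c(root_name, node)
    parent <- c(NA, parent)
  }

  data.frame(nodeId = node, parentId = parent, stringsAsFactors = FALSE)
}

# Case 1 from above: "root" appears only as a parentId
case1 <- data.frame(
  nodeId   = c("L2-1", "L2-2", "L3-1", "L3-2", "L3-3", "L3-4"),
  parentId = c("root", "root", "L2-1", "L2-1", "L2-2", "L2-2"),
  stringsAsFactors = FALSE
)

# Case 2 from above: the top-level nodes have an empty or NA parentId
case2 <- data.frame(
  nodeId   = c("L2-1", "L2-2", "L3-1", "L3-2", "L3-3", "L3-4"),
  parentId = c("", NA, "L2-1", "L2-1", "L2-2", "L2-2"),
  stringsAsFactors = FALSE
)

fixed1 <- add_implicit_root(case1)
fixed2 <- add_implicit_root(case2)
```

Both results contain exactly one row with a missing parentId, which is the shape of the fully specified example at the top of this issue, so they could presumably be passed on to as_treenetdf() in the same way. Whether the inference should live inside as_treenetdf.data.frame itself, and whether it should warn when it invents a root, is left open here.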

I noticed this problem while investigating the example given in #224