MangoTheCat / functionMap

Draw the functions map for a R package

error in sankey_plot #42

Open kasperdanielhansen opened 8 years ago

kasperdanielhansen commented 8 years ago

When I run functionMap on the minfi package from GitHub (https://github.com/kasperdanielhansen/minfi), plotting the output of map_r_package() fails with an error:

library(functionMap)
out = map_r_package("minfi") # checkout from Github
sankey_plot(out)

results in

> sankey_plot(out)
Error in sanitize.simplegraph_adjlist(structure(x, class = c("simplegraph_adjlist",  :
  Duplicated names in adjacency list
> traceback()
16: stop("Duplicated names in adjacency list")
15: sanitize.simplegraph_adjlist(structure(x, class = c("simplegraph_adjlist",
        "simplegraph", class(x))))
14: sanitize(structure(x, class = c("simplegraph_adjlist", "simplegraph",
        class(x))))
13: graph.list(adjlist)
12: graph(adjlist)
11: as_graph_adjlist.simplegraph_df(graph)
10: as_graph_adjlist(graph)
9: predecessors(sgraph)
8: vapply(predecessors(sgraph), length, 1L)
7: optimize_sizes(nodes, edges)
6: nodes[["size"]] %||% optimize_sizes(nodes, edges)
5: make_sankey(node_data, edge_data, break_edges = TRUE)
4: as_graph_data_frame(graph)
3: vertices(x)
2: sankey(make_sankey(node_data, edge_data, break_edges = TRUE),
       ...)
1: sankey_plot(out)

kasperdanielhansen commented 8 years ago

Perhaps there is a clue at the start of the printed output of out:

> out
──────────────────────────────────────────────────── Map of R package 'minfi' ──
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   _ ❯  methods::setGeneric
   .availableAnnotation ❯ ★ getAnnotationObject
   .betaFromMethUnmeth ❯
   .buildControlMatrix450k ❯  stats::na.omit
   .checkAssayNames ❯ assays
   .checkSex ❯
   .digestMatrix ❯  digest::digest
   .digestVector ❯  digest::digest
<SNIP>

jsta commented 8 years ago

I think the issue is that duplicated node names need to be pruned or merged while building the sankey graph. I set a debugger breakpoint at frame 13 of the traceback, graph.list(adjlist), and inspected x:

names(x)[duplicated(names(x))]

> [1] "_"

x$`_`

> [1] "utils::read.csv" "utils::read.csv"

which(names(x) == names(x)[which(duplicated(names(x)))])

> [1] 14 25

x[14]

> [1] "utils::read.csv" "utils::read.csv"

x[25]

> character(0)

gaborcsardi commented 8 years ago

Thanks for looking into this. Yes, the different _ elements are pieces of code outside of functions. They should be replaced by a single _ for the plot.
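
As a rough illustration of that idea, here is a minimal sketch of collapsing entries that share a name in a named adjacency list before it is handed to the graph constructor. The helper merge_duplicated_names() is hypothetical and is not part of functionMap or simplegraph; the toy list x only mimics the values shown in the debugger output above. Dropping repeated targets is also an assumption; it may or may not be what the plot needs.

```r
# Hypothetical helper (not part of functionMap or simplegraph):
# merge all entries of a named adjacency list that share a name.
merge_duplicated_names <- function(adjlist) {
  keys <- unique(names(adjlist))
  merged <- lapply(keys, function(k) {
    # Concatenate the targets of every entry named k, then drop repeats
    unique(unlist(adjlist[names(adjlist) == k], use.names = FALSE))
  })
  names(merged) <- keys
  merged
}

# Toy adjacency list with two "_" entries, modeled on the debugger output
x <- list(
  "_" = c("utils::read.csv", "utils::read.csv"),
  "f" = "g",
  "_" = character(0)
)

merged <- merge_duplicated_names(x)
names(merged)[duplicated(names(merged))]  # empty once names are unique
```

Where such a step would belong (in map_r_package(), in sankey_plot(), or in the graph conversion) is left open here.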