Closed kilimba closed 9 years ago
Hm, there must be some issue in your numbering. I've been meaning to create a function that can convert R numbering to js. Just subtracting 1 doesn't do it for reasons that I haven't had time to fully work through.
I can't tell for sure but it seems that you are getting stuck in an infinite loop due to cycles (see https://github.com/d3/d3-plugins/issues/1). As a quick check, this is what I did.
library(curl)
library(networkD3)

# Read the edge list (first column Source, second column Target)
edges <- read.csv(
  curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),
  stringsAsFactors = FALSE
)

# Prefix the names so that a group appearing as both a source and a target
# becomes two distinct nodes, which removes the cycles
edges[, 1] <- paste0("source_", edges[, 1])
edges[, 2] <- paste0("target_", edges[, 2])

# One row per node, with a zero-based index for JavaScript
nodes <- data.frame(ID = unique(c(edges$Source, edges$Target)))
nodes$indx <- seq_len(nrow(nodes)) - 1

# Swap the node names in the links for their zero-based indices
edges2 <- merge(edges, nodes, by.x = "Source", by.y = "ID")
edges2$Source <- NULL
names(edges2) <- c("target", "value", "source")
edges2 <- merge(edges2, nodes, by.x = "target", by.y = "ID")
edges2$target <- NULL
names(edges2) <- c("value", "source", "target")
nodes$indx <- NULL

# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
              Source = "source", Target = "target",
              Value = "value", NodeID = "ID",
              width = 700, fontsize = 12, nodeWidth = 30)
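For what it's worth, the index bookkeeping above could probably be done without the two merges. The following is only a minimal sketch on made-up data (the rows and the `Value` column name are assumptions, not the contents of the CSV). It relies on `match()` to return each name's 1-based row position in `nodes`, so subtracting 1 gives the zero-based index, provided the node table is built from exactly the names used in the links.

```r
# Toy edge list standing in for the CSV (rows and the Value column name are assumed).
edges <- data.frame(
  Source = c("source_15-19 Male", "source_15-19 Female", "source_20-24 Male"),
  Target = c("target_15-19 Female", "target_15-19 Male", "target_15-19 Female"),
  Value  = c(10, 7, 3),
  stringsAsFactors = FALSE
)

# One row per distinct node; the row order defines the index.
nodes <- data.frame(ID = unique(c(edges$Source, edges$Target)),
                    stringsAsFactors = FALSE)

# match() gives 1-based row positions in `nodes`; subtract 1 for JavaScript.
links <- data.frame(
  source = match(edges$Source, nodes$ID) - 1,
  target = match(edges$Target, nodes$ID) - 1,
  value  = edges$Value
)
```

Unlike `merge()`, which sorts the result by the key column by default, this keeps the links in the same row order as the original edge list.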
@christophergandrud, perhaps we should look at pulling in this Sankey version which can handle cycles.
To verify the existence of cycles even further, we can use igraph.
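library(curl)
library(networkD3)
library(igraph)

edges <- read.csv(
  curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),
  stringsAsFactors = FALSE
)

# Build a directed graph from the edge list; is.dag() returns FALSE when the
# graph contains a cycle (newer igraph releases name these functions
# graph_from_data_frame() and is_dag())
g <- graph.data.frame(edges)
is.dag(g)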
Perhaps this is another resource that can shed some light on cycles, DAG, and Sankey.
Thanks timelyportfolio, most helpful! An aesthetic question: after trying your updated code, the display is quite crowded with spaghetti lines of links and flows. How difficult would it be to go from this two-"tier" graph (for lack of a better word; I'm not sure "tier" is the correct term here) to a three-"tier" graph, i.e., from
15-19 Male -------------------------------->> 15-19 Female
15-19 Female ----------------------------->> 15-19 Male
....
to
15-19 Male ----------------------------->> 15-19 Female ------------------------->>15-19 Male
....
Is this possible? It would greatly decongest the chart and make it much clearer.
As always, thanks for the help!
Thanks @timelyportfolio for all of your work on this. I'm wondering if we should wait until the Sankey fork gets merged into the master? In the meantime we could point to this issue?
I hope I did not mislead. I never intended for the above to be a full solution. Rather, it was an illustration of the problem. Let's say we can assume that the data is in order of levels. We could do something like this if my assumptions are correct (note: manual and not generalizable).
library(curl)
library(networkD3)

edges <- read.csv(
  curl("https://raw.githubusercontent.com/kilimba/data/master/infection_flows.csv"),
  stringsAsFactors = FALSE
)

# Manual step: assumes rows 80 onward are the second level of flows, so their
# targets are renamed (trailing "_") to form a third tier of nodes
edges[80:nrow(edges), ]$Target <- paste0(edges[80:nrow(edges), ]$Target, "_")

# One row per node, with a zero-based index for JavaScript
nodes <- data.frame(ID = unique(c(edges$Source, edges$Target)))
nodes$indx <- seq_len(nrow(nodes)) - 1

# Swap the node names in the links for their zero-based indices
edges2 <- merge(edges, nodes, by.x = "Source", by.y = "ID")
edges2$Source <- NULL
names(edges2) <- c("target", "value", "source")
edges2 <- merge(edges2, nodes, by.x = "target", by.y = "ID")
edges2$target <- NULL
names(edges2) <- c("value", "source", "target")
nodes$indx <- NULL

# Plot
sankeyNetwork(Links = edges2, Nodes = nodes,
              Source = "source", Target = "target",
              Value = "value", NodeID = "ID",
              width = 700, fontsize = 12, nodeWidth = 30)
The problem, though, is that the data out from level 2 then does not sum to the data in from level 1, so you see this. Of course, this can be handled through data manipulation, or it indicates that I do not understand (probably more likely).
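One way to see where the levels fail to balance might be to compare each middle-tier node's total inflow with its total outflow. This is a rough sketch on invented numbers (the rows and the `Value` column name are assumptions, not the real CSV); a middle-tier node is taken to be any name that appears both as a target and as a source.

```r
# Toy three-tier edge list: Male -> Female -> Male_ (all values are made up).
edges <- data.frame(
  Source = c("15-19 Male", "20-24 Male", "15-19 Female", "15-19 Female"),
  Target = c("15-19 Female", "15-19 Female", "15-19 Male_", "20-24 Male_"),
  Value  = c(10, 5, 4, 6),
  stringsAsFactors = FALSE
)

# Total flow arriving at, and leaving, each node.
inflow  <- tapply(edges$Value, edges$Target, sum)
outflow <- tapply(edges$Value, edges$Source, sum)

# Middle-tier nodes are the ones that both receive and send flow.
middle <- intersect(names(inflow), names(outflow))

balance <- data.frame(
  node    = middle,
  inflow  = as.numeric(inflow[middle]),
  outflow = as.numeric(outflow[middle]),
  stringsAsFactors = FALSE
)
balance$difference <- balance$inflow - balance$outflow
```

A nonzero `difference` marks a node where the diagram cannot line up, which is the kind of mismatch described above.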
@christophergandrud based on watching the d3-plugins repo over the years (see old pull requests), it is not maintained very well and pull requests often languish, so if we want the functionality, I think we should go ahead and use the forked version.
Hm, that is good information to have. I'm just worried about how basing the package on a fork might lead to issues longer-term. Do you think this fork is likely to be easily compatible with future versions of d3-plugins?
I also ran into this issue, and the suggestion to use is.dag was very helpful. If it's not possible to support data with cycles, I would encourage adding an is.dag check to sankeyNetwork, alerting the user that the data set is not supported and preventing the infinite loop.
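As a rough illustration of what such a check could look like without adding igraph as a dependency, here is a sketch in base R. `has_cycle()` is a hypothetical helper, not part of networkD3. It repeatedly removes nodes that have no incoming links (Kahn's algorithm); if some nodes can never be removed, the edge list contains a cycle.

```r
# Hypothetical helper, not part of networkD3: returns TRUE if the directed
# edge list contains at least one cycle (self-loops included).
has_cycle <- function(from, to) {
  node_ids <- unique(c(from, to))
  src <- match(from, node_ids)
  dst <- match(to, node_ids)

  # Number of incoming links per node.
  indegree <- tabulate(dst, nbins = length(node_ids))

  # Start with the nodes that nothing points to.
  queue <- which(indegree == 0)
  removed <- 0

  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    removed <- removed + 1

    # Removing `current` lowers the in-degree of every node it points to.
    for (nxt in dst[src == current]) {
      indegree[nxt] <- indegree[nxt] - 1
      if (indegree[nxt] == 0) queue <- c(queue, nxt)
    }
  }

  # Any node left over sits on, or downstream of, a cycle.
  removed < length(node_ids)
}

cyclic  <- has_cycle(c("A", "B"), c("B", "A"))  # A -> B -> A
acyclic <- has_cycle(c("A", "B"), c("B", "C"))  # A -> B -> C
```

A plotting function could call something like this on the link columns first and stop with a message when it returns TRUE, rather than handing the data to the layout.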
That seems like a reasonable suggestion. Thoughts @timelyportfolio?
@christophergandrud up to you as to how many dependencies you are willing to accept. If we update to the forked Sankey, then the user will be able to visually see what is happening and not get stuck in a loop. I could try to update the Sankey to the forked version if you would like and submit a pull request.
@timelyportfolio This sounds like it is worth a shot.
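hi all, I'm the author of the fork https://github.com/soxofaan/d3-plugin-captain-sankey which is mentioned here. I haven't read the complete discussion here, but just wanted to inform:
- that the d3-plugin is probably never going to be updated: see d3/d3-plugins#133 (https://github.com/d3/d3-plugins/issues/133) and https://twitter.com/mbostock/status/600670626785792000
- I don't think there is an "official"/"blessed" alternative at the moment, and my fork isn't one either; I just started it because I needed some updates and wanted to move forward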
Thanks for the information
On Mon, Jun 22, 2015 at 2:39 PM Stefaan Lippens wrote:
hi all, I'm the author of the fork https://github.com/soxofaan/d3-plugin-captain-sankey which is mentioned here. I haven't read the complete discussion here, but just wanted to inform:
- that the d3-plugin is probably never going to be updated: see d3/d3-plugins#133 (https://github.com/d3/d3-plugins/issues/133) and https://twitter.com/mbostock/status/600670626785792000
- I don't think there is an "official"/"blessed" alternative at the moment, and my fork isn't one either; I just started it because I needed some updates and wanted to move forward
With pull #79, I think this is resolved so closing. Happy to reopen though if not resolved. Thanks everyone.
As I wrote this post, I realized I still need to do a little more to get the cycle Sankey to be complete.
Hello, I am relatively new to R and was trying to plot a Sankey diagram using the networkD3 library. However, all I get is a blank screen. The diagram is supposed to show the flow of infections between age groups (by gender). My code is as below:
Any help greatly appreciated. Tumaini