christophergandrud / networkD3

D3 JavaScript Network Graphs from R
http://christophergandrud.github.io/networkD3

nodeID same order as the Source variable column #236

Closed nick-youngblut closed 6 years ago

nick-youngblut commented 6 years ago

I'm using networkD3 for the first time, and it's pretty great, but the nodeID <--> edgeID mapping for forceNetwork() is very confusing (at least to me). As far as I can tell, the edges must be specified with numeric (zero-indexed) source and target values, so the node data frame must somehow be matched to the edge (links) data frame in order to ID the nodes correctly in the network. This edge <--> node mapping can be done with the NodeID parameter, but the NodeID must then match the edge IDs, so the NodeID will be just a numeric value. This numeric value will be displayed for each node, which isn't very informative.

The other way, according to the forceNetwork docs:

If no ID is specified then the nodes must be in the same order as the Source variable column in the Links data frame

The problem with this is that some nodes may not be specified in the source column at all, only in the target column. The example for issue #190 represents such a case, where 2 of the nodes are specified only in the target column and not in the source column.

Am I missing something in the docs? Do I have to create self-self edges in order to have all nodes in the source column?

cjyetman commented 6 years ago

If you have nodes in links$target that are not in links$source, then you must use the first method... numeric values in links$target and links$source refer to the index of the node in nodes (0-indexed)

nick-youngblut commented 6 years ago

Thanks for the quick reply. When you say:

index of the node in nodes (0-indexed)

do you mean row index? That doesn't seem to make sense, given that R uses 1-indexing for everything. index usually refers to row, column, or position in a vector, so I'm not seeing exactly what you mean here.

cjyetman commented 6 years ago

yes, index of the node in the nodes data frame (0-indexed)

It's 0-indexed because the data is used by JavaScript, which uses 0-based indexing
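
If your edges start out keyed by node name rather than by number, one way to get the 0-based row indices is with base R's match(). This is only a sketch: the `edges` data frame and its `from`/`to` columns are made up for illustration and are not part of networkD3.

```r
# Hypothetical edge list keyed by node name; "z" appears only as a target.
edges <- data.frame(
  from = c("x", "x", "y"),
  to   = c("y", "z", "z"),
  stringsAsFactors = FALSE
)

# One row per node, collected from both columns so target-only nodes are included.
nodes <- data.frame(
  name = unique(c(edges$from, edges$to)),
  stringsAsFactors = FALSE
)

# match() returns 1-based row positions in nodes; subtract 1 to get the
# 0-based indices that the JavaScript side expects.
links <- data.frame(
  source = match(edges$from, nodes$name) - 1,
  target = match(edges$to, nodes$name) - 1,
  value  = 1
)

# Sanity check: every index must point at an existing row of nodes.
stopifnot(all(c(links$source, links$target) %in% (seq_len(nrow(nodes)) - 1)))
```

Because the numbers are derived from the row order of `nodes`, the two data frames stay consistent as long as `nodes` is not reordered afterwards.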

cjyetman commented 6 years ago

So if you have a 'nodeA' that's connected to all three of 'nodeB', 'nodeC', and 'nodeD', your data should look like...

R-index  JavaScript-index  name
1        0                 nodeA
2        1                 nodeB
3        2                 nodeC
4        3                 nodeD

source  target  value
0       1       1
0       2       1
0       3       1
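
In R, the example data above could be written roughly like this. It's a minimal sketch: the `group` column is an assumption added only because forceNetwork() needs a Group column, and the call is skipped if networkD3 isn't installed. Note that NodeID names the column holding the label to display, while the numbers in `links` point at rows of `nodes`.

```r
# Node table: row order defines the 0-based index used by the links.
nodes <- data.frame(
  name  = c("nodeA", "nodeB", "nodeC", "nodeD"),
  group = 1,  # assumed: a single group, added only to satisfy the Group argument
  stringsAsFactors = FALSE
)

# nodeA (index 0) linked to nodeB, nodeC and nodeD (indices 1, 2, 3).
links <- data.frame(
  source = c(0, 0, 0),
  target = c(1, 2, 3),
  value  = c(1, 1, 1)
)

if (requireNamespace("networkD3", quietly = TRUE)) {
  graph <- networkD3::forceNetwork(
    Links = links, Nodes = nodes,
    Source = "source", Target = "target", Value = "value",
    NodeID = "name",   # column with the displayed label, not a numeric ID
    Group = "group",
    opacity = 1
  )
  # Typing `graph` at an interactive console renders the widget.
}
```

Here nodeB, nodeC and nodeD never appear in the source column, and no self-self edges are needed.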