Closed pedroj closed 10 years ago
I have no experience at all with bipartite networks, but what you are suggesting seems perfectly doable. If the ggnet
function gets integrated, as I hope it will, into the GGally
package, my next release will offer what you suggest. For now, I unfortunately do not have the time to work on any major new feature.
May I also suggest that you take a look at the gggraph
function mentioned in issue #2? It seems to work with bipartite networks :)
Hi Pedro,
I have added some bipartite scripts to a network example to show how the function can work with some bipartite networks, like the one below. The main trick for me was to produce the network through matrix algebra. Red points are bills and blue points are MPs who collaborated on the bills.
This graph, however, uses only a small sample (< 5%) of the data that I have produced: I still have to find a way to "flatten" larger graphs with multiple edges into an unweighted adjacency matrix. Dabbling with intergraph
and network
has failed to show me how to do that so far.
I see François. That's great. And yes, plotting bipartite structures yields a mess for very large graphs. One alternative is to use a matrix representation, i.e., a 'heatmap' of the adjacency matrix- this is useful for weighted incidence matrices. Even using the heat map on a version of the adjacency matrix where rows and cols have been rearranged according to a modularity algorithm could also highlight some patterns. This would yield a fattened view of the plot above. You lose the 'distance' sense that the FM ordination yields but you gain clarity in visualizing where the strong and weak links occur (in a weighted matrix). Another way is to use plot.web in the bipartite package. I find this useful as a first snapshot and as a useful way to represent multipartite structures.
BTW, the use of ggnet function is as I suggested, plotting the two modes in different colors (even with a different shape). That's great.
One thing... when you say "to find a way to "flatten" larger graphs with multiple edges into an unweighted adjacency matrix", do you mean to 'binarize' the weighted matrix? That's seems simple. I use a modification of a function by Diego Vazquez:
quant2bin<-function(matr)
{
cn=colnames(matr)
rn=rownames(matr)
ij<-which(matr!=0,arr.ind=T)
matrb<-matrix(0,dim(matr)[1],dim(matr)[2])
matrb[ij]<-1
colnames(matrb)=cn
rownames(matrb)=rn
return(matrb)
}
You can set any threshold value in ij<-which(matr!=0,arr.ind=T)
.
I forgot to mention that package NetPreProc can be useful to you for flattening: Network Pre-Processing and normalization. http://www.rdocumentation.org/packages/NetPreProc/functions/Sparsify.matrix-methods
Thanks a lot Pedro, lots of helpful ideas. This script generates the sparse matrix that I use to get the plot above:
> head(A)
6 x 1500 sparse Matrix of class "dgTMatrix"
[[ suppressing 1500 column names ‘1’, ‘2’, ‘3’ ... ]]
Boyer 1 . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 . . . . . . . . . . . . . 1 .
Cherpion 1 . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . 1 .
Cinieri 1 . . 1 . . . . . . . . . 1 . . . 1 . . . . . . . . . 1 . . . . . . . . . . . . . 1 .
Fromion 1 . . . . . . . . . . . . . . . . 1 . 1 . . . . . . 1 1 1 . . . . . . . . . . . . 1 .
Lazaro 1 . . . . . . . . . . . . . . . . . . . . . . . . 1 1 1 . . . . . . . . . . . . . 1 .
Le Callennec 1 . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . 1 .
So the matrix is fine as far as I can tell. Things are also okay after the matrix is converted to an undirected bipartite graph in this script (the multiple
argument is actually FALSE
by default but is typed for clarity):
> iA <- graph.incidence(A, mode = "all", multiple = FALSE)
> iA
IGRAPH UN-B 2082 22659 --
+ attr: type (v/x), name (v/c)
The issue arises when I convert that structure to a network
object and then try to ask for its adjacency matrix, because network
does not (yet?) support converting multiple edges to a sparser representation:
Error in as.matrix.network.adjacency(net) :
Multigraphs not currently supported in as.matrix.network.adjacency. Exiting.
I have given a try to Sparsify.matrix
, but the NetPreProc
package has fatal dependency issues.
I've added some code I had to plot bipartite networks. It is in my bipartite_plots repo. This is not with ggoplot2 actually, but there is a prototype for using ggplot2 with bipartite networks. I've added more explanations in the pages.
Cher François,
have you tried just using write.graph
from igraph to export the network to an external file (e.g., as an edgelist) and then re-read it in network
, getting the adjacency matrix? It's a bit cumbersome and not elegant, but can be useful.
write.graph(graph, file, format=c("edgelist", "pajek", "ncol", "lgl", "graphml", "dimacs", "gml", "dot", "leda"), ...)
Previously I also had difficulties with as.matrix.network.adjacency
.
All the best, -- Pedro
El 08/07/2013, a las 00:34, François notifications@github.com escribió:
Thanks a lot Pedro, lots of helpful ideas. This script generates the sparse matrix that I use to get the plot above:
head(A) 6 x 1500 sparse Matrix of class "dgTMatrix" [[ suppressing 1500 column names ‘1’, ‘2’, ‘3’ ... ]]
Boyer 1 . . . . . . . . . . . . . . . . . . . . . . . . . 1 1 . . . . . . . . . . . . . 1 . Cherpion 1 . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . . 1 . Cinieri 1 . . 1 . . . . . . . . . 1 . . . 1 . . . . . . . . . 1 . . . . . . . . . . . . . 1 . Fromion 1 . . . . . . . . . . . . . . . . 1 . 1 . . . . . . 1 1 1 . . . . . . . . . . . . 1 . Lazaro 1 . . . . . . . . . . . . . . . . . . . . . . . . 1 1 1 . . . . . . . . . . . . . 1 . Le Callennec 1 . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . . . . . 1 .
So the matrix is fine as far as I can tell. Things are also okay after the matrix is converted to an undirected bipartite graph in this script:
iA <- graph.incidence(A, mode = "all", multiple = FALSE) iA IGRAPH UN-B 2082 22659 --
- attr: type (v/x), name (v/c)
The issue arises when I convert that structure to a network object and then try to ask for its adjacency matrix, because network does not (yet?) support converting multiple edges to a sparser representation:
Error in as.matrix.network.adjacency(net) : Multigraphs not currently supported in as.matrix.network.adjacency. Exiting.
I have given a try to Sparsify.matrix, but the NetPreProc package has fatal dependency issues.
— Reply to this email directly or view it on GitHub.
Hi Pedro,
Fantastic stuff! I believe you have succeeded at doing two different things that I failed at:
ggnet
.igraph
workaround. I have used igraph
in a similar way in this script.Give me a bit of time to parse your work and I should hopefully be back with improvements to ggnet
based on your code.
Excellent! Please feel free to use any material- of course. Let's keep in touch. I'm really very interested in developing analyses with bipartite networks.
À bientôt, -- Pedro
El 10/07/2013, a las 14:32, François notifications@github.com escribió:
Hi Pedro,
Fantastic stuff! I believe you have succeeded at doing two different things that I failed at:
• Weight the edges. Moritz Marbach's original code and yours are both more advanced than mine on that point. This should be easy to add to ggnet. • Work with bipartite/multiple graphs. I'll take a careful look at your code and igraph workaround. I have used igraph in a similar way in this script. Give me a bit of time to parse your work and I should hopefully be back with improvements to ggnet based on your code.
— Reply to this email directly or view it on GitHub.
I have started looking at your code. Here's a Gist that plugs your bip_init_network
and vectorize
functions into ggnet
to plot the bipartite graph out of your NCH example:
This plot did not even require to update the ggnet
function and should work straight out of the box if you have the downloader
, ggplot2
, network
etc. packages installed. I have checked whether this works with all other ggnet
options: it does, except for subset.threshold
(which is expected, and can surely be fixed with little effort).
Do you believe that this method should be integrated into ggnet
, and if so, how? For example, the function could auto-detect bipartite networks and automatically do what the Gist above does, provided a bipartite
argument to set the mode names. Not sure what to do about edge weights, it seems like a different game to me.
I'll get back to you if I manage to make your method work with my neta
examples of amendment co-sponsorship networks. It's a much, much larger network, and it uses a sparse matrix to store the adjacencies, so that will also be an interesting test to run.
Update: it works, but only on small samples. Here's a demo with 5,500 nodes:
My full two-mode network has 500+ Members of Parliament (actors) and 11,500+ amendments (events). I have managed to plot up to 7,500 amendments, but network
does not deal very well with large networks and creates a memory leak that crashes R if you try creating a network out of larger samples.
Incidentally, the last issue of Social Networks is on two-mode social networks.
Thanks François! Yes, I think the bip coding can be fully implemented in ggnet. However, I don't think it would be useful that the function detects the bipartite structure. Any user can input a bipartite network but have the goal of analyzing the one-mode projections. That option can be useful in packages like bipartite
where you obviously have objectives of analyzing bipartite structures- in this circumstance having an automatic detection can help.
It is very interesting that network
hangs when dealing with large bipartite networks. Maybe you can open an issue with the authors of the package, to keep them informed.
Big thanks for pointing out the special issue of Social Networks! I think bipartite networks are under-studied in various aspects. But social scientists are certainly ahead in the study and analysis of this type of network.
Thanks for that last piece of feedback. I have performed further tests with network
and have not yet managed to reproduce the issue, but I'll keep an eye at it.
So far, your vectorize
function works perfectly with ggnet
, and I agree that there is no need to make things more complicated at that stage. The function is good to go as such.
I'll work further on bipartite networks when I find the time to – thanks for a very interesting conversation over the code and the challenges of network analysis!
Hi Pedro,
I'm working again on ggnet
while updating this repo, and have started digging your code again.
Regarding bipartite plots, this little snippet creates a mode vector for a network
object. It should be easy enough to add a bipartite
argument to ggnet
that would accept TRUE
or k
to detect the mode or set it manually, or FALSE
or 0
to ignore the bipartite
attribute if present. Let me know if you have other ideas.
g = network::network.initialize(15, bipartite = 5) #Initialize the network
network::is.bipartite(g)
if (g$gal$bipartite > 0) {
node.group = rep("Event", network.size(g))
node.group[ 1:g$gal$bipartite ] = "Actor"
}
node.group # mode vector
Looks great François! I'll have a look in detail asap... If the bipartite graph is built (input) from an adjacency matrix, then the sizes of the two modes correspond to the number of rows and columns. If it is input as an edge list it should also be straightforward to get the two modes.
The code I posted has a glitch: I cannot get either network
or igraph
to plot the edges, because I don't seem to have created them with network.initialize
. I'm not sure how to get a random bipartite network from network
to implement the idea further (I'd like to test it on a 'natural' network
object).
You might want to have a try with function swap.web in package bipartite. i've used these functions too:
# Random matrix
randommat <- function (m, n, prob=c(0.92, 0.08)) {
size=m*n
matrix(sample(c(0,1), size, replace=T, prob),
nrow=m, ncol=n, byrow = TRUE)
}
# Fully nested matrix
fullynest <- function (m, n) {
x<- as.vector(rep(1, n))
for (i in 1:m) {
j=1
x <- rbind(x,c(rep(1, n-j), rep(0, j)))}
x<-data.frame(x,row.names = NULL)
# x[-(m+1), ]
}
Thanks for the functions. I just pushed the cleaned up version of the code, with better centrality measures for one-mode networks but nothing yet for two-mode graphs. I'm dealing with very large networks, so it takes a bit of time to test everything out. The code won't even run on 32-bit R because it creates a vector that exceeds 2.4GB of memory…
thanks François! It would be great to have specific functions to handle bipartite networks. With the exception of the bipartite package most existing R packages do not include simple ways to plot bipartite structures. I'd suggest a quick way to code the two modes of the bipartite network, with distinct shape/colour combinations for instance.