Closed krivit closed 1 year ago
@AryaKarami, in terms of options, one was to "outsource" to sna::rgraph() if sna were installed (testing via requireNamespace()). The other was to roll our own, using one of these approaches:
1. Draw the gaps between successive edges using rgeom(), then use cumsum() to compute the dyad indices and translate them to tails and heads.
2. Fix the number of edges (drawn from a binomial distribution, or given directly as numedges), then select dyad indices using sample.int() and translate them to tails and heads.
I didn't mention Approach 2 during our meeting, but it occurred to me after.
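As a rough illustration of the rgeom()/cumsum() idea above, here is a minimal sketch for a directed network without loops. The function names sample_dyads_geom() and dyad_to_edge(), and the row-major dyad numbering, are assumptions made for this example; they are not part of ergm or sna.

```r
# Sketch only: select each of n_dyads dyads independently with probability p
# by drawing the gaps between successive selected dyads.
# (Hypothetical helper, not an ergm function.)
sample_dyads_geom <- function(n_dyads, p) {
  stopifnot(p > 0, p <= 1, n_dyads >= 0)
  idx <- numeric(0)
  last <- 0
  while (last <= n_dyads) {
    # Batch size near the expected number of remaining edges, with some slack.
    k <- max(10, ceiling((n_dyads - last) * p * 1.1))
    # rgeom() counts failures before a success, so each gap is rgeom() + 1.
    steps <- cumsum(rgeom(k, p) + 1) + last
    idx <- c(idx, steps)
    last <- steps[k]
  }
  # Drop any indices that overshoot the dyad range.
  idx[idx <= n_dyads]
}

# Translate 1-based dyad indices to (tail, head) pairs for a directed
# network on n nodes with no self-loops, numbering dyads row by row.
# (Hypothetical helper; the numbering scheme is an assumption.)
dyad_to_edge <- function(d, n) {
  d0 <- d - 1
  tail <- d0 %/% (n - 1) + 1
  r <- d0 %% (n - 1) + 1
  head <- r + (r >= tail)  # skip over the diagonal
  cbind(tail = tail, head = head)
}
```

For example, dyad_to_edge(sample_dyads_geom(n * (n - 1), p), n) would give an edgelist for a Bernoulli digraph on n nodes with tie probability p.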
Oh yes please. I hit that wall myself in the past.
Does it make more sense to have it in network rather than ergm?
@AryaKarami, I think Approach 2 will probably work better, because it reduces the density case to the numedges case.
Also, since doubles can apparently represent every integer exactly only up to 2^53, the code needs to check that the number of dyads (potential edges) in the network does not exceed that number, producing an error otherwise.
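A minimal sketch of Approach 2 with the 2^53 guard might look like the following. The function name sample_dyads_binom() and its arguments are hypothetical, chosen for illustration; only rbinom() and sample.int() are real base R calls.

```r
# Sketch only: reduce the density case to the numedges case, then sample
# dyad indices without replacement. (Hypothetical helper, not an ergm function.)
sample_dyads_binom <- function(n_dyads, density = NULL, numedges = NULL) {
  # Above 2^53, doubles can no longer index every dyad exactly.
  if (n_dyads > 2^53) stop("Number of dyads exceeds 2^53.")
  if (is.null(numedges)) {
    stopifnot(!is.null(density), density >= 0, density <= 1)
    # Density case: the edge count is Binomial(n_dyads, density).
    numedges <- rbinom(1, n_dyads, density)
  }
  stopifnot(numedges >= 0, numedges <= n_dyads)
  # sample.int() accepts n beyond the integer range when given as a double.
  sort(sample.int(n_dyads, numedges))
}
```

The resulting indices would still need to be translated to tails and heads according to whatever dyad numbering the network type (directed, undirected, bipartite) calls for.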
Does it make more sense to have it in network rather than ergm?
Possibly. At the same time, I think to include it in network, we should have it work for all possible situations that network will support, including hypergraphs and multigraphs, whereas in ergm we are justified in only supporting the cases ergm() handles. @CarterButts?
Well, in the long term, maybe.
We already have efficient and flexible Bernoulli graph production in sna (rgraph), which makes use of the usual fast Bernoulli graph tricks (and uses edgelists). Duplicating that in network or other statnet packages doesn't make a lot of sense. (To be honest, network really shouldn't even have that as.network.numeric() method, in my opinion. Random graph generation is a modeling feature, and should not be in the base data type package.) @krivit is right that we should ideally have support for non-dyadic and other special cases for network functions, though as noted this is functionality that was sort of stuck in there in the first place. I am loath to remove it because it's been there forever, and some folks presumably find it handy for testing or demo purposes. On the other hand, I'm also not so thrilled at the prospect of making more investment in it - someone who wants to generate random graphs in any serious capacity ought (IMHO) to be using sna or ergm, and not relying on the as.network.numeric() hack. Given that statnet already has tools for the purpose, adding better random graph generation to network looks like mission creep. Is there an affirmative reason to do it? (I.e., a real use case where someone can't use the other statnet tools, and really needs to produce huge Bernoulli graphs in network per se?)
@CarterButts, as.network.numeric() is a part of ergm. The reasons we aren't using sna outright are that 1) it would create a hard dependency, and 2) it doesn't handle bipartite networks, as far as I can tell.
@krivit you are right - my brain is apparently broken, and I had confabulated that it had been placed in network at some point. Setting aside implications for my mental state, I obviously don't think that moving this to network is a good idea. I can see the concern about multiplying dependencies. One idea is to port over the backend from rgraph, which is pretty simple; however, it does make use of some memory structure utilities that are in the sna backend and may not fit the ergm backend idiom. Would probably be easy enough to adapt.
Generating Directed Bipartite network?
Thank you all for the ideas, especially Approach 2 (binomial distribution); it was interesting to implement. Now another question arises: is it necessary to generate directed bipartite graphs in ergm? In fact, in as.network(), when the user specifies the graph as bipartite, the function treats it as an undirected bipartite graph and sets directed <- FALSE by default.
For example, we could first generate the undirected graph and then randomly set the direction of each edge between the partitions. (This is the first approach that came to mind, and I am not sure whether it is efficient.)
Created on 2020-12-24 by the reprex package (v0.3.0)
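To make the "generate undirected, then orient" idea concrete, here is a minimal sketch. The function orient_bipartite() and its prob argument are hypothetical, and the edgelist is assumed to be a two-column matrix with the first-partition vertex in column 1 and the second-partition vertex in column 2.

```r
# Sketch only: give each undirected bipartite edge a random direction by
# swapping its endpoints with probability prob.
# (Hypothetical helper, not part of ergm or network.)
orient_bipartite <- function(el, prob = 0.5) {
  flip <- runif(nrow(el)) < prob
  out <- el
  # Flipped rows become (second partition -> first partition).
  out[flip, ] <- el[flip, 2:1, drop = FALSE]
  out
}
```

This costs one uniform draw per edge, so it is linear in the number of edges. Note, though, that each undirected edge yields exactly one directed edge, so mutual ties between a pair can never occur; the result is therefore not the same as sampling every directed dyad independently.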