Closed: smjenness closed this issue 5 years ago
I've fixed the error in the first case, but I sometimes get premature convergence. For some reason, -Inf doesn't prohibit concurrent ties completely. I'll investigate.
I think what's happening is that [50,0] is in a corner of the convex hull---it has the most edges possible while still having no concurrency---and while ergm can detect that for concurrency, it can't detect it for the edge count, so it needs to estimate it. It doesn't take much to get a step length of 1 twice in a row, at which point the Hummel convergence criterion tells it to stop.
The first one. I am still getting to the bottom of that case.
That said, this limitation might be inherent in the Hummel stopping criterion.
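As a rough sketch of the stopping rule being described (an assumed simplification; the actual Hummel criterion in ergm is more involved than this):

```r
# Hypothetical simplification of the Hummel-style stopping rule described
# above: declare convergence once the MCMLE step length reaches 1 on two
# consecutive iterations. (Illustrative only; not ergm's actual code.)
hummel_stop <- function(step_lengths) {
  n <- length(step_lengths)
  n >= 2 && all(step_lengths[(n - 1):n] == 1)
}
hummel_stop(c(0.3, 0.7, 1, 1))  # TRUE: two consecutive full steps
hummel_stop(c(0.5, 1, 0.8))     # FALSE
```

This is why a lucky pair of full steps can end estimation early even when the edge-count parameter has not yet settled.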
The [50,5] case is an artefact of MPLE. Here's the degree distribution of a network that san() generates:

degree0 degree1 degree4 degree5 degree8
     20      75       3       1       1
Notice that there are no nodes with degree 2. This means that no single toggle can reduce concurrency on this network but some can increase it. This, in turn, leads the MPLE's logistic regression to be separable, and it returns an initial parameter estimate for -20 or so for concurrency. MCMLE is not able to recover from that.
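The separation can be reproduced in miniature with plain glm(): when the response is perfectly predicted by a covariate, the logistic MLE does not exist and the fitted coefficient runs off to a huge magnitude, analogous to the ~ -20 MPLE estimate for concurrent. (Toy data below, not the actual MPLE design matrix.)

```r
# Toy illustration of separation in logistic regression (base R):
# y is perfectly determined by x, so the MLE diverges and glm() stops
# at a numerically enormous slope once its iteration limit is reached.
x <- c(0, 0, 0, 1, 1, 1)
y <- c(0, 0, 0, 1, 1, 1)  # perfectly separated
fit <- suppressWarnings(glm(y ~ x, family = binomial))
coef(fit)["x"]  # a very large slope, not a meaningful estimate
```

An MCMLE run seeded with such a starting value has essentially no chance of sampling its way back to a sensible region of the parameter space.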
In fact, I wonder if we end up with a bimodal degree distribution, which makes it very difficult to sample from.
Very interesting. It raises an issue separate from Sam's point that the estimation doesn't behave the way it used to---and that point is still important. Stability, especially for these really basic models, is something we need to strive for.
But the other thing this highlights is one of the odd properties of this specification. For this term, Deg(3) is a threshold, and once you're over it, the term no longer has any effect. While there is some social/psychological logic to this behavior -- once you've crossed over to the world of concurrency, the monogamy norms no longer constrain your behavior -- even that's not really consistent, as the norms still operate on Deg(2).
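The threshold behaviour shows up directly in the change statistic. A minimal base-R sketch (not ergm's internal change-statistic code): concurrent counts nodes of degree at least 2, so toggling an edge only moves the statistic when an endpoint crosses the degree-1/degree-2 boundary; nodes already over the threshold contribute nothing.

```r
# Minimal sketch of the concurrent statistic's threshold behaviour
# (base R; hypothetical degrees, not ergm's implementation).
concurrent_stat <- function(deg) sum(deg >= 2)

deg <- c(1, 1, 3, 0)               # degrees of four hypothetical nodes
before <- concurrent_stat(deg)     # only node 3 is concurrent
deg_after <- deg + c(1, 0, 1, 0)   # add an edge between nodes 1 and 3
after <- concurrent_stat(deg_after)
after - before  # node 1 crossed the threshold; node 3 (3 -> 4) did not change
```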
If we want a different, more logical behavior, I think we'll need to define a different term. But that belongs in a different issue...
Interestingly, even when I get a reasonable-looking starting value with CD, it still doesn't converge. (What it does do is bounce around, never quite producing an error.)
I spy with my little eye... a phase transition!
library(ergm)  # either dev or master; maybe CRAN
nw <- network.initialize(100, directed = FALSE)
mleish <- c(-1.631266, -5.502178)  # as close as I've been able to get to what I had thought was the MLE
set.seed(0)
sim <- simulate(nw ~ edges + concurrent, coef = mleish, nsim = 100000,
                simplify = FALSE, output = "stats")
plot(sim)
At first, it's spot-on. A little low, in fact:
summary(window(sim, end=4.75e7))

              Mean    SD Naive SE Time-series SE
edges       49.843 6.526  0.03031         0.1051
concurrent   4.952 2.698  0.01253         0.0430
But then:
summary(window(sim, start=4.8e7))

              Mean       SD  Naive SE Time-series SE
edges       810.21 26.01260 0.1128415       0.205183
concurrent   99.99  0.08654 0.0003754       0.001857
I did not expect that from this model---since models with bounded change statistics tend to be well-behaved---and I still wonder if there is some other region in the parameter space that might behave better while having the correct expectation. It is also possible, in theory, that there is some deep bug. But, provisionally, at least, I am going to have to ask @smjenness to double-check whether this model and target combination ever fit well.
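For intuition about the jump (a generic illustration, not the ergm sampler itself): a random-walk Metropolis chain targeting a well-separated bimodal density shows the same pathology---very long dwell times in one mode, so windowed summaries look converged right up until the chain finally crosses over.

```r
# Generic metastability illustration (base R; not the ergm sampler):
# a random-walk Metropolis chain on a bimodal density tends to dwell in
# one mode for long stretches, so early-window averages can look fine
# while missing the other mode entirely.
set.seed(1)
log_dens <- function(x) log(0.5 * dnorm(x, -4) + 0.5 * dnorm(x, 4))
n <- 20000
x <- numeric(n)
x[1] <- -4  # start in the left mode
for (i in 2:n) {
  prop <- x[i - 1] + rnorm(1)  # symmetric random-walk proposal
  if (log(runif(1)) < log_dens(prop) - log_dens(x[i - 1])) {
    x[i] <- prop
  } else {
    x[i] <- x[i - 1]
  }
}
mean(x)  # typically far from the true mean of 0 unless both modes get visited
```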
Yes, perhaps this is not a great example in general for exploring concurrency with a mean degree of 1. I ran a bunch of combinations with a mean degree of 0.8 (edges = 40) instead and there was no problem fitting these. I think we can close this issue, and we'll just make sure to update any tutorials we have that might use the n=100/edges=50 combo to demonstrate concurrency. Thanks @krivit!
In addition to Ian's notes about LOLOG models, it is worth noting that his "Tapered ERGM" models work on this example. They produce fits with narrow but credible variation. Given the extreme nature of (50,5), this does seem OK. Explicitly:
library(ergm.tapered)
nw <- network.initialize(100, directed = FALSE)
fit <- ergm.tapered(nw ~ edges + concurrent, target.stats = c(50, 5),
                    control = control.ergm(MCMC.interval = 100000))
summary(fit)
mcmc.diagnostics(fit)
# statistics of the last network generated during estimation
summary(fit$newnetwork ~ edges + concurrent)
# simulate from the fitted model and summarize the sampled statistics
sim <- simulate(fit, nsim = 2000, simplify = FALSE, output = "stats",
                control = control.simulate.ergm(MCMC.interval = 20000, MCMC.burnin = 100000))
plot(sim)
head(sim)
apply(as.matrix(sim), 2, median)
apply(as.matrix(sim), 2, mean)
apply(as.matrix(sim), 2, var)
This produces the attached output.
I think that this is the best we can do in ergm for the moment. I've added the non-phase-transitioning test case to statnet/ergm.bench, so that we can keep working on it as part of general computational improvements.
Using both the CRAN version and the dev version of ergm, this fit errors after 9 iterations with:
Also, a fit with a low but non-zero target stat for concurrent does not mix:
We have fit these models all the time in the past without error. They are a primary feature of our modeling classes (NME) to demonstrate the impact of concurrency on epidemic dynamics. This would be a critical issue to fix.
cc: @martinamorris, @sgoodreau