statnet / ergm.ego

Fit, Simulate and Diagnose Exponential-Family Random Graph Models to Egocentrically Sampled Network Data https://statnet.org

Problems with ergm.ego simulation when fitted model has -Inf terms and ppopsize != popsize #79

Closed wtm-leung closed 1 year ago

wtm-leung commented 1 year ago

ergm.ego is unable to simulate from a fitted model when: a) both ppopsize and popsize are specified, b) ppopsize != popsize, and c) the fitted model has -Inf coefficients (e.g. when using nodemix with a variable that has some empty cells in the mixing matrix).

The GoF functions, however, are working. Example below:

library(ergm.ego)
set.seed(0)
data(faux.mesa.high)
mesa.ego <- as.egor(faux.mesa.high)
mixingmatrix(mesa.ego, "Race") 
# note some cells with zero observations which result in -Inf coefficients in fitted model below

fit.1 <- ergm.ego(
  mesa.ego ~ edges + nodemix("Race", levels2=-8), # levels2 removes mix.Race.Hisp.Other with 1 observation
  control = control.ergm.ego(ppopsize = 250), popsize = 300)     # ppopsize < popsize, e.g. when using a weighted sample
summary(fit.1)
plot(gof(fit.1, GOF="model"))
sim.1 <- simulate(fit.1)

#> Note: Constructed network has size 205 different from requested 300. Simulated statistics may need to be rescaled.
#> Error in vector.namesmatch(target.stats, names(netsumm)) : 
#>  Name missmatch in "target.stats". Specify by position.  

mbojan commented 1 year ago

Thanks for reporting.

It seems that san() is receiving a broken vector of sufficient statistics (SS):

trace(
   simulate.ergm.ego, 
   at = 8, 
   where = EgoStat.absdiff,
   tracer = quote({
     cat("Target statistics:\n")
     str(object$target.stats)
     cat("etamap:\n")
     str(object$etamap$offsettheta)
   }))
## Tracing function "simulate.ergm.ego" as seen from package "ergm.ego"
## [1] "simulate.ergm.ego"
simulate(fit.1)
## Note: Constructed network has size 205 different from requested 300. Simulated statistics may need to be rescaled.
## Tracing simulate.ergm.ego(fit.1) step 8 
## Target statistics:
##  Named num [1:15] 203 0 8 53 13 41 46 0 0 0 ...
##  - attr(*, "names")= chr [1:15] "edges" "mix.Race.Black.Black" "mix.Race.Black.Hisp" "mix.Race.Hisp.Hisp" ...
##  - attr(*, "na.action")= 'omit' Named int 1
##   ..- attr(*, "names")= chr "offset(netsize.adj)"
## etamap:
##  logi [1:16] TRUE FALSE TRUE FALSE FALSE FALSE ...
##  Error in vector.namesmatch(target.stats, names(netsumm)) : 
## Name missmatch in "target.stats". Specify by position.

In the next step, the vector of SS is indexed with etamap. Because etamap is too long (16 rather than 15 elements), the result contains an NA, which trips the name matching further down the road. It seems that the model object, rather than simulate(), is to blame...
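
The indexing behaviour described above can be reproduced in base R with toy data. This is only a minimal sketch: the vector names and values are made up, and the use of a negated logical index is an assumption, since the thread does not show the exact indexing expression inside simulate.ergm.ego.

```r
# Toy stand-in for the target statistics: 3 named values (made-up numbers)
target_stats <- c(edges = 203, mix.A.A = 0, mix.A.B = 8)

# Toy stand-in for the offset indicator: one element too many (4 vs 3)
offset_flag <- c(FALSE, TRUE, FALSE, FALSE)

# Assumption: non-offset statistics are selected with a negated logical index.
# The index is longer than the vector, so its last TRUE selects a position
# past the end, and R returns NA for it instead of raising an error.
kept <- target_stats[!offset_flag]
print(kept)
```

The trailing NA element has no usable name, which is consistent with a later name-matching step failing.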

Interestingly the following two do not trip:

fit.2 <- ergm.ego(
  mesa.ego ~ edges + nodemix("Race", levels2=-8),
  control = control.ergm.ego(ppopsize.mul = 2)
)
simulate(fit.2)

fit.3 <- ergm.ego(
  mesa.ego ~ edges + nodemix("Race", levels2=-8),
  popsize = nrow(mesa.ego$ego) * 2
)
simulate(fit.3)
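
To compare fits such as these without re-running trace(), the two components inspected in the trace output (target.stats and etamap$offsettheta) could be checked directly. The helper below is hypothetical and not part of ergm.ego, and whether the two lengths should agree in a healthy fit is an assumption drawn from the comment above rather than something verified here. It is demonstrated on a mock list so that it runs without a fitted model.

```r
# Hypothetical helper (not part of ergm.ego): report the lengths of the two
# components shown in the trace output and whether they agree.
check_target_lengths <- function(fit) {
  n_stats <- length(fit$target.stats)
  n_theta <- length(fit$etamap$offsettheta)
  list(n_stats = n_stats, n_theta = n_theta, same_length = n_stats == n_theta)
}

# Mock object mimicking only the lengths reported in the trace (15 vs 16);
# the values themselves are placeholders.
mock_fit <- list(
  target.stats = setNames(numeric(15), paste0("stat", 1:15)),
  etamap = list(offsettheta = rep(FALSE, 16))
)

res <- check_target_lengths(mock_fit)
str(res)
```

On real objects, the call would be check_target_lengths(fit.1), check_target_lengths(fit.2), and so on.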

krivit commented 1 year ago

The problem had to do with extreme target statistics causing inconsistencies in parameter vector lengths and such. In any case, the code in question was there for compatibility with ergm < 4, so I've removed it.

krivit commented 1 year ago

And thanks @wtm-leung for reporting and @mbojan for narrowing the problem down!

wtm-leung commented 11 months ago

Excellent, thank you for your help with this!