statnet / ergm.ego

Fit, Simulate and Diagnose Exponential-Family Random Graph Models to Egocentrically Sampled Network Data https://statnet.org

Can ergm.ego handle target.stats? #41

Open martinamorris opened 4 years ago

martinamorris commented 4 years ago

Context: we want to use a mix of observed netstats and target.stats for an ergm.ego model @dth2 @krivit

The ... in the ergm.ego call specs suggests we can pass additional arguments to ergm, but I get an error when trying to do so:

# first see what happens with ergm
library(ergm)
data("faux.mesa.high")
# first look at stats
summary(faux.mesa.high ~ edges + nodematch("Race"))
         edges nodematch.Race 
           203            103 
# new target.stats for fit
fit <- ergm(faux.mesa.high ~ edges + nodematch("Race"), target.stats=c(203,50))
# works as expected
summary(fit$newnetwork ~ edges + nodematch("Race"))
         edges nodematch.Race 
           203             50 
# now try with ergm.ego, which generates an error
library(ergm.ego)
fmh.ego <- as.egodata(faux.mesa.high)
fit.ego <- ergm.ego(fmh.ego ~ edges + nodematch("Race"), target.stats=c(203,50))
Constructing pseudopopulation network.
Error in ergm(ergm.formula, target.stats = m, offset.coef = ergm.offset.coef,  : 
  formal argument "target.stats" matched by multiple actual arguments
In addition: Warning message:
`set_attrs()` is deprecated as of rlang 0.3.0
This warning is displayed once per session. 

Makes sense, since ergm.ego uses the existing target.stats argument for its own purposes. So it looks like we will need a new argument to pass target stats explicitly.
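
The mechanism behind this error can be reproduced in base R without either package. The sketch below is only an illustration: `inner_fit()` and `wrapper_fit()` are hypothetical stand-ins for ergm() and ergm.ego(), assuming (as the error message suggests) that the wrapper sets target.stats itself and also forwards `...`.

```r
# Hypothetical stand-ins: inner_fit() plays the role of ergm(),
# wrapper_fit() the role of ergm.ego(). Neither is a real package function.
inner_fit <- function(formula, target.stats = NULL, ...) target.stats

wrapper_fit <- function(formula, ...) {
  m <- c(1, 2)  # statistics the wrapper computes on its own
  # The wrapper names target.stats explicitly AND forwards `...`,
  # so a user-supplied target.stats arrives twice.
  inner_fit(formula, target.stats = m, ...)
}

ok <- wrapper_fit(y ~ x)  # no clash: only the wrapper's value is passed

# Supplying target.stats through `...` triggers the duplicate-argument error.
msg <- tryCatch(
  wrapper_fit(y ~ x, target.stats = c(203, 50)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Because the clash happens at the inner call, no value passed through `...` can get around it; the wrapper needs its own argument for user-supplied targets.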

Request: would be nice to have a way to just specify the additional target stats, the ones we want to send explicitly, rather than a whole vector with the observed + the additional.

krivit commented 4 years ago

This should be pretty straightforward. How about a target.stats= argument for ergm.ego() that's interpreted as follows?

The default is, of course, a vector of NAs, so all statistics are calculated from data.
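
A minimal sketch of that default, assuming that NA entries mean "use the value estimated from the data" and non-NA entries override it. `fill_targets()` is a hypothetical helper for illustration, not part of ergm.ego.

```r
# Hypothetical helper: NA entries fall back to the data-derived statistic,
# non-NA entries are taken as user-specified targets (an assumption).
fill_targets <- function(estimated,
                         target.stats = rep(NA_real_, length(estimated))) {
  stopifnot(length(target.stats) == length(estimated))
  out <- ifelse(is.na(target.stats), estimated, target.stats)
  names(out) <- names(estimated)
  out
}

# Values taken from the faux.mesa.high example earlier in the thread.
estimated <- c(edges = 203, nodematch.Race = 103)

all_from_data <- fill_targets(estimated)             # default: all NA
one_override  <- fill_targets(estimated, c(NA, 50))  # override nodematch only
print(one_override)
```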

One complication is that the user can't always know in advance how big the pseudopopulation network is going to be. I think that we can rescale them as needed, but I am not 100% sure. It does raise a UI question: should the target.stats= argument expect per-capita statistics, population network statistics, or pseudopopulation network statistics? (My current inclination is per-capita.)
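
As a rough illustration of the per-capita option, the sketch below rescales per-capita targets to a network of a given size. It assumes the statistics scale linearly with network size, which may not hold for every term, and it assumes faux.mesa.high has 205 nodes. `rescale_targets()` is a hypothetical name.

```r
# Hypothetical helper: convert per-capita targets to network-level targets.
# Assumes each statistic scales linearly with the number of nodes.
rescale_targets <- function(per_capita, n_nodes) per_capita * n_nodes

n_obs <- 205  # assumed size of faux.mesa.high
per_capita <- c(edges = 203, nodematch.Race = 50) / n_obs

same_size <- rescale_targets(per_capita, n_obs)  # recovers the original targets
larger    <- rescale_targets(per_capita, 1000)   # a pseudopopulation of 1000
print(larger)
```

With per-capita input the user never needs to know the pseudopopulation size in advance; the rescaling can happen once that size is known.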

Lastly, another thing we might need to worry about is quantifying uncertainty. Does the statistic passed to ergm.ego come with a variance? How can we know its covariance with the other statistics? Off the top of my head, we might want to have a second argument, say, target.stats.cov=, which is either a vector of the same length as target.stats that gives their variances (assuming uncorrelatedness) or a matrix of appropriate dimension that can also provide covariances. (The uncorrelatedness might not be as implausible as one might think: if the target stat has been estimated from a different dataset, then it will, in fact, be uncorrelated.)
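
One way the vector-or-matrix convention could be normalized internally is sketched below. `as_target_cov()` is a hypothetical helper, and target.stats.cov= itself is only the argument proposed above, not an existing one.

```r
# Hypothetical helper: accept either a vector of variances or a full
# covariance matrix and always return a p x p matrix.
as_target_cov <- function(target.stats.cov, p) {
  if (is.matrix(target.stats.cov)) {
    stopifnot(all(dim(target.stats.cov) == c(p, p)),
              isSymmetric(target.stats.cov))
    target.stats.cov
  } else {
    stopifnot(length(target.stats.cov) == p, all(target.stats.cov >= 0))
    # nrow = p guards the p == 1 case, where diag(x) would instead
    # build an identity matrix of size x.
    diag(target.stats.cov, nrow = p)
  }
}

from_vector <- as_target_cov(c(4, 9), p = 2)  # uncorrelated: diagonal matrix
from_matrix <- as_target_cov(matrix(c(4, 1, 1, 9), 2, 2), p = 2)
print(from_vector)
```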

martinamorris commented 4 years ago

Oh my. I'd forgotten about the inference issues. But backing up...

  1. For complex models, the prospect of supplying all of the target stats (even as a vector of 50 NAs followed by a single statistic) is not very appealing. So if it is possible to develop some new arguments ... One option: target stats are only specified for targeted terms, as follows:
ergm.ego(nw ~ terms.with.no.targets,
         targets ~ terms.with.targets,
         target.stats = c(per capita targets, one for each term above),
         ...)

or, following the offset template:

ergm.ego(nw ~ terms.with.no.targets + target(term) + ...,
         target.stats = c(per capita targets, one for each term above),
         ...)
  2. I'm not sure how to think about the variance issue. That may be best started as a verbal conversation.

krivit commented 4 years ago

Hard to say. Right now, a vector with an element for each coefficient is what ergm() expects in the first place.
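
Even so, a short named vector could be expanded into that full-length form before it reaches ergm(). The sketch below shows one possible approach; `expand_targets()` is hypothetical, and the statistic names are assumed to match those reported by summary() on the model formula.

```r
# Hypothetical helper: expand a named vector of targets into a full-length
# vector with NA for every statistic the user did not specify.
expand_targets <- function(stat_names, targets) {
  stopifnot(!is.null(names(targets)))
  unknown <- setdiff(names(targets), stat_names)
  if (length(unknown) > 0) {
    stop("no such statistic: ", paste(unknown, collapse = ", "))
  }
  full <- setNames(rep(NA_real_, length(stat_names)), stat_names)
  full[names(targets)] <- targets
  full
}

stat_names <- c("edges", "nodematch.Race")  # as reported by summary() above
full <- expand_targets(stat_names, c(nodematch.Race = 50))
print(full)
```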

The target() decorator would require changes to ergm_model(), though nothing too complicated. It does raise the question of whether it should be etamapped like the rest of the model, and why.