Closed smjenness closed 3 years ago
The argument names can be changed, but there is a difference in behavior.
Setting extract.summary.stats = TRUE
gives generative model statistics, and the option to obtain additional statistics via monitors
.
This does not appear to match the behavior of save.nwstats
and nwstats.formula
.
We can abandon separate handling of generative model statistics in tergmLite
; this will simplify things but cost efficiency when you want those statistics.
Or we can have save.nwstats
behave as extract.summary.stats
and nwstats.formula
default to ~.
when tergmLite = TRUE
in control.net
. This difference is not that complicated, but will need to be understood by the user.
Preference?
Currently, we also save tergmLite stats to summstats
rather than nwstats
, but it should be possible to change that as well.
Thanks for the clarification. We should have the same approach for both tergm and tergmLite, with the use of save.nwstats
to specify whether we want to calculate the stats, and nwstats.formula
to specify the ergm formula used for the stats we want.
The default, for both netdx
and netsim
, is "formation", which means the generative model statistics as I think you are using the term (right?). That should still be the case. Whether using tergm or tergmLite, when a user specifies save.nwstats = TRUE
it should default to the ergm formula used in netest. One can specify an alternative formula that may include some or all of the netest model formula.
I would like everything saved to dat$stats$nwstats
so that we can use get_nwstats
and plot.netsim
in the same way for tergm and tergmLite. I just implemented that change.
One wrinkle is that our current use of nwstats.formula
is the direct formula object, whereas you have monitors as a nested list of formulas. the latter makes more sense going forward for nwstats.formula
as we move towards supporting multi-layer networks in core EpiModel.
Neither "formation"
nor the formula used in netest
are, in general, the same as the generative model formula used in tergmLite
simulation.
That being the case, I will eliminate separate handling of the generative model statistics. This will make obtaining those statistics slower for tergmLite, but it sounds like we are willing to pay that price for greater consistency with the tergm
case.
The list()
for monitors
(and, earlier, MCMC_control
) was for the general EpiModelHIV case of multiple networks; it shouldn't be necessary for current core EpiModel, and I would just as soon use a single formula given the other changes that are being made now.
I do not know what you mean by "generative model statistics" then. Could you explain, perhaps with a specific example?
Sounds fine for using a single formula versus a list for nwstats.formula. Eventually, we hope to support multi-layer networks in core EpiModel, but that will be for later...
By the way, here is a good test case for the summary stats with tergmLite: http://statnet.org/nme/d4-s5-Serosort.html. Currently this uses full tergm to generate the epi sims because we want to track the nwstats after exogenous processes (epi dynamics) are imposed. It would be ideal to be able to just flip the tergmLite
switch to TRUE in the control settings and get the same outputs as in the tutorial without any other changes to the syntax. Here is the R script..
The generative model statistics for a tergm are those of the generative model formula, something like ~Form(~edges + nodefactor("age.grp") + nodefactor("race")) + Diss(~edges)
. The Form
statistics are evaluated on the union (formation) network and the Diss
statistics are evaluated on the intersection (dissolution) network. In general these are not the same as statistics evaluated on the instantaneous network (which is what you get with "formation"
or the cross-sectional ergm
formula used in netest
).
The generative model statistics are the ones neatly available from the ergm_model
we already have to prepare for simulation. If we want something else, it will mean another ergm_model
call; if that's what we want then that's what we'll get.
I see, so ~Form(~edges)
would be the number of new edges as at a time step, and so on?
Yes, we typically want to diagnose the model with its cross-sectional network representation, so we would need the stats evaluated on the instantaneous network. That's what netdx does and what netsim does for a full tergm approach.
cc: @martinamorris in case you have any opinions about this...
~Form(~edges)
would be the number of edges in either the current network or the previous timestep's network
For long duration models, differences between instantaneous and union/intersection statistics tend to be small (like 1/duration)
@smjenness if this is for netdx, then yes, we want the cross-sectional stats for each network in the timeseries, not the Form()
stats. @chad-klumb am i right that these could be obtained from the Cross()
operator?
FYI, we're putting together the TERGM tutorial now, and using this as an opportunity to write up a guide to migration from stergm
to tergm
. For now just know that the big move is from the two formation
and dissolution
equations to a set of 4 "operators" that can represent the dynamics in different ways, and can be used alone or in combination: Form(), Diss(), Cross(), & Change()
. The comparable pair in new tergm
are Form() & Diss()
; these produce a stergm
if all terms are inside the Form and Diss operators and there are no dyad-dependent sample space constraints.. Once you understand the other two operators, the whole TERGM modeling class makes much more sense.
no, this is for netsim not netdx. We want the tergmLite summary stats to match the full tergm summary stats (which have been implemented since the beginning of EpiModel)
Ok, sorry.
Looking at the script http://statnet.org/nme/d4-s5-Serosort.R what I see is you pulling out the cross-sectional stats you are monitoring from the simulation.
These are not the "generative" stats, or even the cross-sectional stats from the terms in the model, b/c you specified a different formula to monitor (nwstats.formula = ~edges + nodematch("status")
) than you did for the model in the fit/sim steps (formation = ~edges + nodefactor("status")
)
So we need to be more precise with terminology. "Summary stats" can mean different things in different contexts.
Is the goal to save the stats from this call: nwstats.formula = ~edges + nodematch("status")
, using the same syntax for both tergm
and tergmLite
?
Correct -- summary nwstats means cross-sectional network statistics that may be in the same form as the model, or may be different (usually model formula + some extra terms). We currently have that implemented for tergm, and want it implemented for tergmLite using the same syntax (please and thank you @chad-klumb!).
I think this and #540 are done. They are tested in tergmLite/tests/testthat/test-dynamic-simulation.R. If you have additional tests, now would be a good time to try them.
Just a note that I'll be adding support for nwstats.formula = NULL
(converting it to ~.
, which produces no statistics) shortly (at the moment, NULL
is an error).
NULL should work now
missed a summary call in net.mod.init, which I'm fixing now
actually, for consistency with the tergm case, I think that summary call shouldn't be there at all
will probably be removing it momentarily
actually, it's possible it should be there, since the tergm method sets up stats in resim_nets
as a special case when at == 2
will test a bit and compare results
I can report that this example case now works perfectly for both tergm and tergmLite: http://statnet.org/nme/d4-s5-Serosort.html. The only difference is simply toggling tergmLite TRUE/FALSE. Thanks!
I have tested this with a bunch of different scenarios and can't find anything wrong with the output. I will add some unit tests on this in EpiModel now, and then I think we are OK to merge into master. Anything else to address on your side @chad-klumb ?
I converted storage to a data.frame from a matrix (since the tergm method uses a data.frame).
Other than that there's nothing I'm aware of, but I will continue to test things between now and the release and open an issue on anything worth discussing.
It looks like there are a bunch of failing tests on tergmLite under test-updateModelTermInputs.R
. Are you addressing those?
https://github.com/statnet/tergmLite/runs/2775727596?check_suite_focus=true
I will wait until tergmLite passes until I merge EpiModel.
Those aren't failing on either of my windows laptops with the usual install stack
remotes::install_github(c("statnet/rle@master",
"statnet/statnet.common@master",
"statnet/network@master",
"statnet/networkDynamic@master",
"statnet/ergm@master",
"statnet/tergm@master",
"statnet/tergmLite@dev",
"statnet/EpiModel@dev"), dependencies = FALSE, force = TRUE)
I'll see if I can figure out what's going on...
Debugged it on the Rstudio servers. Should be fixed now.
great! I am starting the merge now...
control.net argument for tergmLite should match the same argument name for tergm