Dissolution w/ ergm.ego vs target stats approach

EmilyPo commented 4 years ago

@martinamorris @sgoodreau tagging MM & SMG because they will be interested

following up on the wacky decreasing edges issue that we have been discussing and @dth2 was helping me with yesterday. this may be related to the first issue I opened earlier this week but I'm separating this out for the time being.

note that at the moment in the other issue there is a question about the appropriate pseudopopulation size, which could be contributing to this problem, and I am in the process of working through Pavel and Martina's suggestions on that front.

Problem here: starting from NSFG egodata, I used ergm.ego and the ee.netest conversion function to fit the network and convert it to the netest format expected by EpiModel. When I run netdx using what Steve believes to be a reasonable exit rate (0.000641, i.e. 1 / (30 * 52): one over the length of the age range (30 years) times 52 time steps per year), the number of edges in my model rapidly drops from the target of ~9800 to ~4000.

This behavior increases in severity when I increase the exit rate (to account for both aging out and mortality) and is almost eliminated when I set the exit rate to 0.

To compare, I also fit a network using the target stats approach with the same population size as the above model with the same edges target statistic, and the originally used exit rate (0.000641). The dynamic netdx on this model performs well.

netdx results from netest from ergm.ego object:

netdx results from target stats approach (but same size, target stat, and dissolution):

martinamorris commented 4 years ago

@EmilyPo -- so @dth2 wasn't able to find the problem? There must be a simple error because we have used the NSFG data for analysis with ergm.ego, and have never seen a problem like this. We're currently using ergm.ego for ARTnet and also not having this problem.

dth2 commented 4 years ago

Just to clarify, and I just sent Emily an email to this effect, the exit rate is not supposed to be part of the mortality rate passed to dissolution_coefs because it is already known to be 1 at the end of the age range. It is just supposed to be the mean mortality within the active age range. Once you fix this everything should be fine :)
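A minimal sketch of the distinction described above, in base R. The age-specific mortality values are HYPOTHETICAL placeholders, and the annual-to-weekly conversion is one common choice rather than necessarily what any particular EpiModelHIV branch does.

```r
# HYPOTHETICAL annual age-specific mortality rates for a 30-year active
# age range; replace with real life-table values for your population.
ages <- 15:44
asmr_annual <- seq(0.0005, 0.0030, length.out = length(ages))

# Convert each annual probability to a per-time-step (weekly) probability.
steps_per_year <- 52
asmr_weekly <- 1 - (1 - asmr_annual)^(1 / steps_per_year)

# d.rate as described above: mean mortality within the active age range,
# with no aging-out term added.
d_rate_mortality <- mean(asmr_weekly)

# For comparison: the aging-out rate that should NOT be passed as d.rate.
d_rate_aging_out <- 1 / (length(ages) * steps_per_year)

print(c(mortality = d_rate_mortality, aging_out = d_rate_aging_out))
```

With placeholder values of this magnitude, the mortality-only rate is well below the aging-out rate, which is why swapping one for the other changes the adjusted dissolution coefficient so much.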

dth2 commented 4 years ago

Can you post the code you used for the target stat approach? I am surprised it is fitting with the high mortality rate.

sgoodreau commented 4 years ago

Aha!! That explains why things got worse when we went from the default d.rate to the one that incorporated the inherent departure rate in terms of number of time steps. So is this true anytime one uses ergm.ego? Or some other subset of cases?

dth2 commented 4 years ago

It should not have anything to do with ergm.ego. We calculate the mortality rate the same way when using target stats to construct the formation and dissolution objects, so I am really surprised that the problem did not persist in Emily's target stats example.

sgoodreau commented 4 years ago

Presumably, then, there must be code somewhere in at least some branches of EpiModelHIV that calculates this automatically from the number of time steps implied by the age range. That wasn't there when I was running models, and it presumably isn't in Emily's code. We should determine then what versions of things it is present in and which not. If it's in all of EpiModelHIV that's one thing; but if it's just in some branches (e.g. the ones you wrote) then we want to get consistency. And also document it somewhere!

dth2 commented 4 years ago

It has been the case for everything I have done at least going back to the first version of EpiModelHIV.

https://github.com/statnet/EpiModelHIV/blob/ff1f9ef432e2b582e58c03c2777a9d8762955a51/R/estimation.R#L53

But I do see that in the documentation for dissolution_coefs the d.rate is defined as "Departure or exit rate from the population, as a single homogeneous rate that applies to the entire population." However, I have never seen it applied that way. It has always been just the mortality rate within the active age range. This is a basic component of the setup functions: https://github.com/statnet/EpiModelHIV/blob/ff1f9ef432e2b582e58c03c2777a9d8762955a51/R/estimation.R#L186

Perhaps Sam can clarify.

EmilyPo commented 4 years ago

Ah, ok, so we had some confusion about what the d.rate in the dissolution_coefs was supposed to represent.

Just ran another dynamic diagnostic for the marriage/cohab network, using the NSFG sample size (43303) as the ppopsize and the mean mortality rate for my population (not counting aging out) as the d.rate, and it's looking much better, although the edge count is still about 2% low. I can try increasing the MCMC interval in the netdx (but it took a while to run the first time so I wanted to give an update first).

Screen Shot 2020-01-29 at 4 05 33 PM Screen Shot 2020-01-29 at 4 07 33 PM Screen Shot 2020-01-29 at 4 06 08 PM

EmilyPo commented 4 years ago

and @dth2 , here is the code I used to build the target.stats network. Let me know if something looks weird to you?

library(EpiModel)

# nsfg sample size
networksize <- eeo$ppopsize  # (43303)

# initialize same size network
net <- network.initialize(networksize, directed = FALSE)

# target stat for edges
edges.target <- est$target.stats  # (9868.161)

# dissolution - use dissolution_marcoh
time.step <- 7
dissolution <- ~offset(edges)
exitRate <- 0.000641  # 1/(30*52) rounded
duration.marcoh <- 8.914 * (365 / time.step)

dissolution_marcoh <- dissolution_coefs(dissolution = dissolution,
                                        duration = duration.marcoh,
                                        d.rate = exitRate)

# netest
fit <- netest(net,
              formation = ~edges,
              target.stats = edges.target,
              coef.diss = dissolution_marcoh)
summary(fit)

# static diagnostic
target.static <- netdx(fit, nsims = 1000, dynamic = FALSE)
target.static
plot(target.static)

# dynamic diagnostic
target.dynamic <- netdx(fit, nsteps = 10000, nsims = 5)
target.dynamic
plot(target.dynamic)
plot(target.dynamic, type = "duration")

martinamorris commented 4 years ago

I don't know what you're using for the ergm.ego runs, but it's best not to round things like the exitRate (just give it the expression 1/(30*52)).
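As a quick check on how much the rounding matters, here is a small base R sketch comparing the exact expression with the hand-rounded value. The variable names are illustrative only.

```r
# Exit rate from aging out of a 30-year age range with weekly time steps.
exit_exact   <- 1 / (30 * 52)
exit_rounded <- 0.000641  # the hand-rounded value used in the code above

# Relative error introduced by rounding.
rel_err <- abs(exit_exact - exit_rounded) / exit_exact

print(c(exact = exit_exact, rounded = exit_rounded, rel_err = rel_err))
```

The rounding error here is small, so passing the expression is mainly about reproducibility and avoiding silent drift if the age range or time step changes.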

martinamorris commented 4 years ago

Here is the dissolution coef construction code that Adam & Zoe have been using for ARTnet: https://github.com/statnet/WHAMP/blob/51372771ea9b906d998351787c7dec25997cdb15/adams_egodx_darc/ergm_ego_fit.R#L135-L167

And here is the code that JKB used for SHAMP: https://github.com/statnet/SHAMP/blob/f17956bf4a04d1d5f603343af32f7473a44485d7/egonet/analyses/ergms/netdx_april2019/cohab_static_b1i1e6.Rmd#L121-L213

EmilyPo commented 4 years ago

I think I figured it out. Yes, the problem is related to the adjusted dissolution coefficient -- in how the ee.netest function adjusts the edges coefficient from the ergm.ego object to account for dissolution.

Netdx uses the crude dissolution coef to simulate (which does NOT account for mortality).

But ee.netest adjusts the original formation coef by the adjusted dissolution coef, which does account for mortality. So there is a mismatch there. If after the ee.netest conversion, you alter the formation coef to what it should be (original edges coef - crude dissolution coef, the coef.form), the netdx looks good.

Although I think the formation coef that accounts for mortality is the one we want to simulate with in EpiModel...
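To make the mismatch concrete, here is a base R sketch of the crude and adjusted dissolution coefficients for a homogeneous `~offset(edges)` model. The formulas follow my understanding of what dissolution_coefs computes in that case and should be treated as an assumption to check against the EpiModel source; `edges_coef_static` is a HYPOTHETICAL value standing in for the edges coefficient from the ergm.ego fit.

```r
# Inputs taken from the thread: mean duration in weeks and the aging-out rate.
duration <- 8.914 * (365 / 7)
d_rate   <- 1 / (30 * 52)

# Per-step probability that an edge persists, ignoring departures.
pg <- (duration - 1) / duration

# Per-step probability that both partners remain in the population.
ps2 <- (1 - d_rate)^2

# Crude coefficient: ignores departures.
coef_crude <- log(pg / (1 - pg))

# Adjusted coefficient (ASSUMED form): compensates for edges lost to departure.
coef_adj <- log(pg / (ps2 - pg))

# HYPOTHETICAL edges coefficient from the static (cross-sectional) fit.
edges_coef_static <- -8.0

# Formation coefficient under each choice of dissolution coefficient.
coef_form_crude <- edges_coef_static - coef_crude
coef_form_adj   <- edges_coef_static - coef_adj

# Ratio of formation odds when the adjusted coefficient is subtracted
# but the simulation dissolves edges at the crude rate.
print(exp(coef_form_adj - coef_form_crude))
```

If formation uses the adjusted coefficient while dissolution is simulated with the crude one and no departures occur, formation is too slow for the dissolution actually applied, which is the direction of the edge loss described above.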

See below: Screen Shot 2020-02-04 at 10 00 35 AM

dth2 commented 4 years ago

We should talk about this today in NMG. netdx and simulate both simulate the first network from the crude coefs, but after the initial time step tergmLite uses the adjusted coefs. In either case the edges coef should be adjusted for popsize in ee.netest.

martinamorris commented 4 years ago

Great. Let's discuss at NMG today :)

sgoodreau commented 4 years ago

Agreed - I haven't had time to keep abreast of all this before now but would love to go through it in detail in NMG.

EmilyPo / Diss-Duration