phuang5 closed this issue 1 year ago.
Thanks for the detailed report! The problem is that, whereas the gwesp() implementation computes the quantity of interest directly, the dgwesp() implementation first computes the desp() frequencies and then takes their weighted sum. Since most problems in practice involve sparse networks, the number of SP counts it keeps track of is capped by the gw.cutoff term option (see options?ergm or the help for any of these terms). This defaults to 30, and if a network has higher shared partner counts than that, the resulting statistic will be incorrect.
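To illustrate why a cap on the tracked counts changes the result, here is a minimal base-R sketch of the "count first, then take a weighted sum" approach. It is not ergm's actual code: gw_sum, sp_counts, and the census values are hypothetical, and the weights assume the usual geometrically weighted form with a fixed decay.

```r
# Geometrically weighted sum of shared-partner (SP) counts.
# sp_counts[k] is the number of edges with exactly k shared partners
# (k = 1, 2, ...). `cutoff` mimics a cap on how many counts are tracked.
# All names here are illustrative and are not part of ergm.
gw_sum <- function(sp_counts, decay, cutoff = length(sp_counts)) {
  k <- seq_len(min(cutoff, length(sp_counts)))
  weights <- exp(decay) * (1 - (1 - exp(-decay))^k)
  sum(weights * sp_counts[k])
}

# Hypothetical census: 5 edges with 1 shared partner, 3 edges with 40.
sp_counts <- integer(40)
sp_counts[1] <- 5L
sp_counts[40] <- 3L

full      <- gw_sum(sp_counts, decay = 0.5)               # uses all 40 counts
truncated <- gw_sum(sp_counts, decay = 0.5, cutoff = 30)  # ignores k > 30

full - truncated  # positive: the edges with 40 shared partners were dropped
```

Because every weight is positive, dropping the counts past the cutoff can only lower the total, which would be consistent with the under-counting described in the report.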
You can set this cutoff globally. For example, running options(ergm.term=list(gw.cutoff=100)) before the rest of your code will produce matching answers.
All variants of gw*sp(fix=TRUE) now use the full SP distribution regardless of the cutoff. dg*sp(fix=FALSE) and gwb*sp(fix=FALSE) will immediately stop with an error if they ever encounter any configurations past the cutoff.
I notice that the value calculated for the dgwesp term is lower than it should be for both "OTP" (outgoing two-paths) and "ITP" (incoming two-paths). Interestingly, the gwesp term, when applied to a digraph, calculates the dgwesp(OTP) value correctly. The problem only occurs when the network is large enough, and it always under-counts the statistic in my experiments, so I suspect there might be an overflow issue that only affects the dgwesp implementation.
Please see below a toy example that demonstrates the problem. Basically, the script obtains the census of desp (edgewise shared partners) and calculates the dgwesp statistics by hand to compare them with the results from dgwesp and gwesp in ergm.
Thanks!