statnet / ergm

Fit, Simulate and Diagnose Exponential-Family Models for Networks

Is disabling shared partner cache (cache.sp=FALSE) option worth keeping around? #529

Open krivit opened 1 year ago

krivit commented 1 year ago

The shared partner cache tracks, for each dyad, how many shared partners (of the specified type) it has. It is implemented as a hash table that stores only nonzero values.
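As a rough illustration of the idea, here is a minimal R sketch for an undirected network: a sparse cache that is updated on each edge toggle, alongside a from-scratch count for comparison. It is not ergm's C implementation. The names `dyad_key`, `sp_count_scratch`, `sp_cache_bump`, and `sp_toggle` are hypothetical, and an R environment stands in for the hash table.

```r
# Illustrative sketch only; none of these functions are part of ergm.

# String key for an unordered dyad {i, j}.
dyad_key <- function(i, j) paste(min(i, j), max(i, j), sep = "-")

# Uncached: count shared partners of (i, j) directly from the adjacency matrix.
sp_count_scratch <- function(adj, i, j) sum(adj[i, ] & adj[j, ])

# Cached: add `delta` to the stored count, keeping only nonzero entries.
sp_cache_bump <- function(cache, i, j, delta) {
  key <- dyad_key(i, j)
  old <- cache[[key]]                      # NULL when the dyad is not stored
  n <- (if (is.null(old)) 0 else old) + delta
  if (n == 0) {
    if (!is.null(old)) rm(list = key, envir = cache)
  } else {
    cache[[key]] <- n
  }
  invisible(cache)
}

# Toggle edge {a, b}, updating the cache for every dyad whose count changes.
sp_toggle <- function(adj, cache, a, b) {
  delta <- if (adj[a, b] != 0) -1L else 1L
  # a is (or stops being) a shared partner of b and each other neighbour of a.
  for (k in setdiff(which(adj[a, ] != 0), b)) sp_cache_bump(cache, b, k, delta)
  # b is (or stops being) a shared partner of a and each other neighbour of b.
  for (k in setdiff(which(adj[b, ] != 0), a)) sp_cache_bump(cache, a, k, delta)
  adj[a, b] <- adj[b, a] <- as.integer(delta > 0)
  adj
}

# Build a triangle on nodes 1, 2, 3 in a 4-node network.
adj <- matrix(0L, 4, 4)
cache <- new.env(hash = TRUE)
adj <- sp_toggle(adj, cache, 1, 2)
adj <- sp_toggle(adj, cache, 2, 3)
adj <- sp_toggle(adj, cache, 1, 3)
print(sort(ls(cache)))            # dyads with a nonzero shared partner count
print(sp_count_scratch(adj, 1, 2))
```

In this sketch the cached path does work proportional to the degrees of the two toggled nodes and stores an entry for every dyad with at least one shared partner, while the uncached path stores nothing and rescans on each query.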

Right now, there is an option to disable it and recalculate the counts from scratch every time, which also avoids having to keep track of every dyad. Whether this is faster depends on the network density; in addition, DSP and NSP terms benefit far more from the cache than ESP terms do. (In the following, disregard the "cld" column: I am pretty sure there's a bug in microbenchmark that messes it up.)

library(ergm)
library(microbenchmark)
library(parallel)

ds <- seq(4, 16, by=4)

parallel::mclapply(ds, \(d) {
  nw0 <- network.initialize(71, TRUE)

  coef <- c(qlogis(d/70), 0)
  f <- nw0 ~ edges + gwesp(.5, fixed=TRUE)
  spcache <- control.simulate.formula(MCMC.burnin = 1e5, term.options = list(cache.sp = TRUE))
  nocache <- control.simulate.formula(MCMC.burnin = 1e5, term.options = list(cache.sp = FALSE))

  list(
    ESP = microbenchmark(spcache = simulate(f, coef=coef, control = spcache, output="stats", seed=0),
                         nocache = simulate(f, coef=coef, control = nocache, output="stats", seed=0)),

    # NOTE: as posted, this block reuses `f` (the gwesp formula) rather than a gwdsp formula.
    DSP = microbenchmark(spcache = simulate(f, coef=coef, control = spcache, output="stats", seed=0),
                         nocache = simulate(f, coef=coef, control = nocache, output="stats", seed=0))
  )
}, mc.cores=4)
#> [[1]] ########## Mean degree = 4
#> [[1]]$ESP
#> Unit: milliseconds
#>     expr      min      lq     mean   median       uq       max neval cld
#>  spcache 83.22721 86.4028 89.89744 88.27861 91.66699 173.40730   100  a 
#>  nocache 58.26253 60.0700 62.48677 61.83015 64.52540  72.79393   100   b
#> 
#> [[1]]$DSP
#> Unit: milliseconds
#>     expr      min       lq     mean   median       uq       max neval cld
#>  spcache 86.22012 91.89318 94.31082 93.80364 95.93294 116.04843   100  a 
#>  nocache 58.02529 63.56184 65.97712 65.89219 67.61725  76.88128   100   b
#> 
#> 
#> [[2]] ########## Mean degree = 8
#> [[2]]$ESP
#> Unit: milliseconds
#>     expr      min       lq    mean   median       uq      max neval cld
#>  spcache 144.2233 148.3621 154.657 153.6429 160.4843 179.2000   100  a 
#>  nocache 119.1110 124.9785 129.536 128.3771 132.9777 202.0203   100   b
#> 
#> [[2]]$DSP
#> Unit: milliseconds
#>     expr      min       lq     mean   median       uq      max neval cld
#>  spcache 135.7565 141.5475 146.2515 145.9087 149.7074 169.1471   100  a 
#>  nocache 112.0146 115.2553 120.0299 118.8966 122.2159 156.4852   100   b
#> 
#> 
#> [[3]] ########## Mean degree = 12
#> [[3]]$ESP
#> Unit: milliseconds
#>     expr      min       lq     mean   median       uq      max neval cld
#>  spcache 158.8965 168.4683 174.6540 173.5448 180.8576 202.3430   100  a 
#>  nocache 223.5973 234.6594 244.6803 240.9177 251.9656 346.4577   100   b
#> 
#> [[3]]$DSP
#> Unit: milliseconds
#>     expr      min       lq     mean   median       uq      max neval cld
#>  spcache 149.4813 153.4188 156.7299 155.7920 159.0030 174.4172   100  a 
#>  nocache 212.8045 218.1543 222.7337 220.7894 225.7721 242.0761   100   b
#> 
#> 
#> [[4]] ########## Mean degree = 16
#> [[4]]$ESP
#> Unit: milliseconds
#>     expr      min       lq     mean   median       uq      max neval cld
#>  spcache 175.8137 183.9081 195.7087 196.9013 204.3772 222.0586   100  a 
#>  nocache 404.8568 430.0922 448.4105 446.7976 464.3663 544.1174   100   b
#> 
#> [[4]]$DSP
#> Unit: milliseconds
#>     expr      min       lq     mean   median       uq      max neval cld
#>  spcache 170.1248 172.4382 174.6645 173.9554 175.9052 191.1998   100  a 
#>  nocache 394.7653 398.7394 403.8018 400.2869 404.8243 429.1888   100   b

Created on 2023-05-15 with reprex v2.0.2
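Reading the tables as ratios makes the crossover easier to see. The sketch below copies the median timings reported above into a data frame (the object and column names are illustrative) and computes nocache / spcache for each mean degree; a ratio below 1 means the uncached run was faster.

```r
# Median timings in milliseconds, copied from the benchmark output above.
med <- data.frame(
  mean_degree = c(4, 8, 12, 16),
  esp_spcache = c(88.27861, 153.6429, 173.5448, 196.9013),
  esp_nocache = c(61.83015, 128.3771, 240.9177, 446.7976),
  dsp_spcache = c(93.80364, 145.9087, 155.7920, 173.9554),
  dsp_nocache = c(65.89219, 118.8966, 220.7894, 400.2869)
)

# Ratio > 1: caching wins; ratio < 1: recomputing from scratch wins.
med$esp_ratio <- med$esp_nocache / med$esp_spcache
med$dsp_ratio <- med$dsp_nocache / med$dsp_spcache

print(round(med[, c("mean_degree", "esp_ratio", "dsp_ratio")], 2))
```

On these numbers the uncached runs are faster at mean degree 4 and 8, and the cached runs are faster at mean degree 12 and 16.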

As I see it, the costs and benefits of keeping the non-cached option around are as follows:

Any thoughts? @CarterButts, in particular, I think you might have some experience with triadic effects.