statnet / ergm.multi

Fit, Simulate and Diagnose Exponential-Family Models for Multiple or Multilayer Networks

Is `blockdiagTNT` proposal actually needed? #26

Open · krivit opened 5 months ago

krivit commented 5 months ago

Based on benchmarks, it seems that, net of initialisation, RLEBDM-based TNT is at least as fast as block-diagonal TNT:

library(ergm.multi)
library(microbenchmark)
data(Goeyvaerts)
G <- Networks(Goeyvaerts)
TNT_setup <- simulate(G~edges, coef=0, control=control.simulate.formula(MCMC.prop="TNT"~sparse, MCMC.burnin=1000000), return.args = "ergm_state")
bdiagTNT_setup <- simulate(G~edges, coef=0, control=control.simulate.formula(MCMC.prop="blockdiagTNT"~sparse, MCMC.burnin=1000000), return.args = "ergm_state")
TNT_setup$control$MCMC.samplesize <- bdiagTNT_setup$control$MCMC.samplesize <- 1
microbenchmark(ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta=0),
               ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control, theta=0))
#> Unit: milliseconds
#>                                                                             expr
#>                 ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta = 0)
#>  ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control,      theta = 0)
#>       min       lq     mean   median       uq      max neval cld
#>  100.6385 101.7572 108.0130 105.0995 110.6982 137.6487   100  a 
#>  131.9064 134.0418 141.1563 138.9367 145.6203 173.8697   100   b
microbenchmark(ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta=-1.5),
               ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control, theta=-1.5))
#> Unit: milliseconds
#>                                                                                expr
#>                 ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta = -1.5)
#>  ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control,      theta = -1.5)
#>       min       lq     mean   median       uq      max neval cld
#>  102.8011 103.1020 105.0725 103.3235 104.7576 118.5839   100  a 
#>  134.3539 134.5741 135.7776 135.0053 135.9910 146.3863   100   b
microbenchmark(ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta=-Inf),
               ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control, theta=-Inf))
#> Unit: milliseconds
#>                                                                                expr
#>                 ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta = -Inf)
#>  ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control,      theta = -Inf)
#>        min        lq      mean    median        uq       max neval cld
#>   60.52997  61.45865  62.65144  61.83995  63.17748  70.94519   100  a 
#>  117.32734 118.00735 120.11300 118.70835 121.89950 130.59725   100   b

Created on 2024-02-20 with reprex v2.1.0

library(ergm.multi)
library(microbenchmark)
data(Lazega)
L <- Layer(Lazega, c("advice", "coworker", "friendship"))

TNT_setup <- simulate(L~edges, coef=-1.568, control=control.simulate.formula(MCMC.prop="TNT"~sparse, MCMC.burnin=1000000), return.args = "ergm_state")
bdiagTNT_setup <- simulate(L~edges, coef=-1.568, control=control.simulate.formula(MCMC.prop="blockdiagTNT"~sparse, MCMC.burnin=1000000), return.args = "ergm_state")
TNT_setup$control$MCMC.samplesize <- bdiagTNT_setup$control$MCMC.samplesize <- 1
microbenchmark(ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta=0),
               ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control, theta=0))
#> Unit: milliseconds
#>                                                                             expr
#>                 ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta = 0)
#>  ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control,      theta = 0)
#>       min       lq     mean   median       uq      max neval cld
#>  152.2988 155.5460 165.9686 158.5501 167.4920 360.4515   100   a
#>  158.4937 161.2912 168.3508 164.6368 173.2115 218.8608   100   a
microbenchmark(ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta=-1.5),
               ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control, theta=-1.5))
#> Unit: milliseconds
#>                                                                                expr
#>                 ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta = -1.5)
#>  ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control,      theta = -1.5)
#>      min       lq     mean   median       uq      max neval cld
#>  142.528 144.2626 147.4608 146.0247 148.9191 164.6533   100  a 
#>  150.191 152.4383 155.7040 153.7846 157.3937 187.1000   100   b
microbenchmark(ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta=-Inf),
               ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control, theta=-Inf))
#> Unit: milliseconds
#>                                                                                expr
#>                 ergm_MCMC_sample(TNT_setup$object, TNT_setup$control, theta = -Inf)
#>  ergm_MCMC_sample(bdiagTNT_setup$object, bdiagTNT_setup$control,      theta = -Inf)
#>       min       lq     mean   median       uq      max neval cld
#>  60.37052 61.59323 63.56210 62.17093 63.97852 76.57572   100  a 
#>  73.22935 74.13945 76.33268 74.67377 76.77582 89.91813   100   b

Created on 2024-02-20 with reprex v2.1.0

The dependence factors (MCMC steps per effectively independent sample, i.e., nsteps/effectiveSize below) are about the same, at least if the blocks' densities are homogeneous:

library(ergm.multi)
library(coda)
data(Goeyvaerts)
G <- Networks(Goeyvaerts)

nsteps <- 1e7
interval <- 1e3

nsteps/effectiveSize(mcmc(simulate(G~edges, coef=0, nsim=nsteps/interval, output="stats", control=control.simulate.formula(MCMC.prop="TNT"~sparse, MCMC.burnin=1e7, MCMC.interval=interval))))
#>    edges 
#> 4191.982
nsteps/effectiveSize(mcmc(simulate(G~edges, coef=-1.5, nsim=nsteps/interval, output="stats", control=control.simulate.formula(MCMC.prop="TNT"~sparse, MCMC.burnin=1e7, MCMC.interval=interval))))
#>    edges 
#> 1717.493
nsteps/effectiveSize(mcmc(simulate(G~edges, coef=-3, nsim=nsteps/interval, output="stats", control=control.simulate.formula(MCMC.prop="TNT"~sparse, MCMC.burnin=1e7, MCMC.interval=interval))))
#>   edges 
#> 1045.17

nsteps/effectiveSize(mcmc(simulate(G~edges, coef=0, nsim=nsteps/interval, output="stats", control=control.simulate.formula(MCMC.prop="blockdiagTNT"~sparse, MCMC.burnin=1e7, MCMC.interval=interval))))
#>    edges 
#> 4344.953
nsteps/effectiveSize(mcmc(simulate(G~edges, coef=-1.5, nsim=nsteps/interval, output="stats", control=control.simulate.formula(MCMC.prop="blockdiagTNT"~sparse, MCMC.burnin=1e7, MCMC.interval=interval))))
#>    edges 
#> 1702.582
nsteps/effectiveSize(mcmc(simulate(G~edges, coef=-3, nsim=nsteps/interval, output="stats", control=control.simulate.formula(MCMC.prop="blockdiagTNT"~sparse, MCMC.burnin=1e7, MCMC.interval=interval))))
#>    edges 
#> 972.1547

Created on 2024-02-20 with reprex v2.1.0

drh20drh20 commented 5 months ago

I was just thinking the other day about our ergm manuscript, part 2. Maybe it's a good idea to add material on RLEBDM-based TNT and finally ship it out to a more computational (as opposed to software-oriented) journal. There's no way I could work on this prior to mid-April but maybe after that point...

krivit commented 5 months ago

I'll do some quick tests on blocks with inhomogeneous densities, and if those don't show a benefit for blockdiagTNT, I might deprecate it or see if there are any optimisations to be made.
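A minimal sketch of what such a test might look like (everything here is hypothetical: the dense 20-node extra block, the coefficient, and the MCMC settings are made up for illustration, and vertex-attribute bookkeeping across blocks is glossed over): append a much denser network to the Goeyvaerts households and recompute the dependence factors under both proposals.

library(ergm.multi)
library(coda)
data(Goeyvaerts)

# Hypothetical extra block: a 20-node undirected Bernoulli(0.5) graph,
# much denser than the household networks.
set.seed(42)
m <- matrix(rbinom(20^2, 1, 0.5), 20, 20)
m[lower.tri(m)] <- t(m)[lower.tri(m)]  # symmetrise
diag(m) <- 0
dense <- as.network(m, directed = FALSE, matrix.type = "adjacency")

Gmix <- Networks(c(Goeyvaerts, list(dense)))

nsteps <- 1e6
interval <- 1e3
depfac <- function(prop) { # dependence factor = MCMC steps per effective sample
  nsteps/effectiveSize(mcmc(simulate(Gmix~edges, coef=-1.5, nsim=nsteps/interval,
                                     output="stats",
                                     control=control.simulate.formula(MCMC.prop=prop,
                                       MCMC.burnin=1e6, MCMC.interval=interval))))
}
depfac("TNT"~sparse)
depfac("blockdiagTNT"~sparse)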

But yes, it would be good to get that paper out.

drh20drh20 commented 5 months ago

Agreed regarding the paper. But why deprecate blockdiagTNT? It seems presumptuous to think that a few tests mean that it's of no use in any situation.

krivit commented 5 months ago

Like everything else, it adds to compile time and has to be maintained. It also has to be prioritised against other proposals that implement the DyadGen API (and can thus handle any dyad-independent constraint).

Also, I've just checked, and both proposals make their TNT selections of edges and of dyads uniformly at random. That is, the probability that the selected edge/dyad comes from a given block equals that block's share of the total number of edges/dyads. There might be some benefit to selecting blocks in proportion to some other function of block size, if that were implemented.
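To make the uniform-selection point concrete (the block sizes below are made up): with uniform dyad selection, the probability of proposing in a block is its dyad count divided by the total.

n <- c(4, 10, 50)          # hypothetical nodes per block
d <- n*(n-1)/2             # undirected dyads per block: 6, 45, 1225
round(d/sum(d), 3)         # chance the proposed dyad falls in each block
#> [1] 0.005 0.035 0.960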

On the other hand, it looks like, at the moment, blockdiagTNT uses a bisection algorithm to select dyads at random, which is O(log B) in the number of blocks B, whereas RLEBDM TNT uses rejection sampling, which is close to O(1) as long as the largest block isn't much larger than the rest.
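A toy R sketch of the two strategies (an illustration, not the package's C code; d reuses the hypothetical dyad counts from above):

d <- c(6, 45, 1225)               # hypothetical per-block dyad counts
cum <- cumsum(d)

# Bisection: binary search on the cumulative counts, O(log B) per draw.
sample_bisect <- function() {
  u <- runif(1, 0, cum[length(cum)])
  b <- findInterval(u, cum) + 1   # block containing position u
  c(block = b, dyad = sample.int(d[b], 1))
}

# Rejection: uniform block, accept with probability d[b]/max(d).
# Expected tries = max(d)/mean(d), so near 1 when blocks are comparable
# (here about 2.9, since one block dominates).
sample_reject <- function() {
  repeat {
    b <- sample.int(length(d), 1)
    if (runif(1) < d[b]/max(d))
      return(c(block = b, dyad = sample.int(d[b], 1)))
  }
}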

I suppose we could see whether blockdiagTNT is faster with rejection sampling, Walker's alias method, or something similar.
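If we went that route, here is a toy R version of Walker's alias method (O(B) setup, O(1) per draw); the real thing would live in C, and R's own sample() already switches to this method for large weighted draws:

build_alias <- function(w) {
  B <- length(w)
  p <- w/sum(w)*B                  # rescale so the mean weight is 1
  alias <- integer(B); prob <- numeric(B)
  small <- which(p < 1); large <- which(p >= 1)
  while (length(small) && length(large)) {
    s <- small[1]; l <- large[1]
    prob[s] <- p[s]; alias[s] <- l # column s: keep s w.p. p[s], else take l
    p[l] <- p[l] + p[s] - 1        # l absorbs s's deficit
    small <- small[-1]
    if (p[l] < 1) { small <- c(small, l); large <- large[-1] }
  }
  prob[c(small, large)] <- 1       # leftovers are (numerically) exactly 1
  list(prob = prob, alias = alias)
}

sample_alias <- function(tbl) {
  b <- sample.int(length(tbl$prob), 1)            # uniform column
  if (runif(1) < tbl$prob[b]) b else tbl$alias[b] # keep it or take its alias
}

# e.g., block selection proportional to the dyad counts from above:
tbl <- build_alias(c(6, 45, 1225))
table(replicate(1e4, sample_alias(tbl)))/1e4      # ≈ 0.005, 0.035, 0.960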