statnet / ergm.multi

Fit, Simulate and Diagnose Exponential-Family Models for Multiple or Multilayer Networks

Vertex attribute terms fail at extracting it #27

Open mbojan opened 4 months ago

mbojan commented 4 months ago

This is an exegesis of this question on SO:

library(ergm.multi)

data(florentine, package = "ergm")

plot(flobusiness)
plot(flomarriage)

flo <- Layer(business = flobusiness, marriage = flomarriage)
set.vertex.attribute(flo, attrname = "x", value = 1:32)

flo

# Works
summary(
  flo ~ L(~ edges, ~ business) + L(~ edges, ~ marriage)
)

# Works
ergm(
  flo ~ L(~ edges, ~ business) + L(~ edges, ~ marriage)
)

# Doesn't work
summary(
  flo ~ L(~ edges + nodecov("x"), ~ business) + 
    L(~ edges + nodecov("x"), ~ marriage)
)

# Error in `ergm_Init_abort()`:
# ! In term ‘nodecov’ in package ‘ergm’ (called from term ‘L’ in package ‘ergm.multi’): ‘x’ is/are not valid nodal attribute(s).
# Run `rlang::last_trace()` to see where the error occurred.

Brief debugging suggests that the term initialization function receives the network object stripped of all the attributes, hence x above is not found.

krivit commented 3 months ago

Not quite. The problem is that extracting subgraphs is extremely slow, so Layer() (and others) actually store the constituent networks (with edges removed) in a hidden attribute called .subnetcache. However, there is currently no mechanism to keep the cache up to date with the outer network, so if an attribute is created on the combined network, it won't show up when the network is "split" during initialisation.

For example, the following code works:

library(ergm.multi)

data(florentine, package = "ergm")

plot(flobusiness)
plot(flomarriage)
set.vertex.attribute(flobusiness, attrname = "x", value = 1:16)
set.vertex.attribute(flomarriage, attrname = "x", value = 1:16 + 16)

flo <- Layer(business = flobusiness, marriage = flomarriage)

flo

# Works, because "x" was set on each layer before Layer() was called
summary(
  flo ~ L(~ edges + nodecov("x"), ~ business) + 
    L(~ edges + nodecov("x"), ~ marriage)
)
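
The stale-cache behaviour described above can be illustrated with a toy model in base R. This is only a sketch: plain lists stand in for network objects, and `make_combined()`, `set_attr()`, `split_parts()`, and the `subnet_cache` field are hypothetical names invented for illustration, not the actual ergm.multi internals.

```r
# Toy model only: plain lists stand in for network objects, and the
# `subnet_cache` field is a hypothetical stand-in for `.subnetcache`.
make_combined <- function(parts) {
  list(
    vattr = list(),        # vertex attributes of the combined object
    subnet_cache = parts   # constituents cached once, at construction time
  )
}

# Setting an attribute touches the outer object only; the cache is not updated.
set_attr <- function(combined, name, value) {
  combined$vattr[[name]] <- value
  combined
}

# "Splitting" is served from the cache instead of re-extracting subgraphs.
split_parts <- function(combined) combined$subnet_cache

parts <- list(
  business = list(vattr = list()),
  marriage = list(vattr = list())
)
flo_toy <- make_combined(parts)
flo_toy <- set_attr(flo_toy, "x", 1:32)

names(flo_toy$vattr)                        # the combined object has "x"
names(split_parts(flo_toy)$business$vattr)  # the cached layer does not
```

In this toy, a term evaluated on a cached layer would not see `x`, which mirrors the error in the original report. Setting the attribute on each layer before combining, as in the snippet above, avoids the problem because the cache is built from layers that already carry it.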

I should probably do some subset of the following:

mbojan commented 3 months ago

Oh, OK. What makes the subgraph extraction a bottleneck?

krivit commented 3 months ago

network::get.inducedSubgraph() just isn't very fast, and if you have, say, 300 small networks together in a big one, it needs to be run 300 times for every term that requires splitting them up.
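
A rough way to see this cost is to time one pass of extractions on a synthetic network. The sketch below assumes 300 blocks of 10 vertices each (the block size is an arbitrary choice for illustration) and uses only the network package; timings will vary by machine and package version, so none are quoted here.

```r
library(network)

n_nets <- 300   # number of small networks, as in the comment above
n_each <- 10    # assumed size of each small network (illustrative only)

# One large empty undirected network standing in for the combined object.
big <- network.initialize(n_nets * n_each, directed = FALSE)

# Give every block a simple path of within-block ties.
starts <- (seq_len(n_nets) - 1) * n_each
tail <- as.vector(outer(seq_len(n_each - 1), starts, `+`))
head <- tail + 1
big <- add.edges(big, tail = tail, head = head)

# One pass of extractions: one induced subgraph per block.
elapsed <- system.time(
  parts <- lapply(seq_len(n_nets), function(k) {
    get.inducedSubgraph(big, v = (k - 1) * n_each + seq_len(n_each))
  })
)

elapsed["elapsed"]  # seconds for a single pass over all blocks
length(parts)       # number of extracted subgraphs
```

Without a cache, a pass like this would be repeated for every term that needs the networks split, which is the motivation given above for storing the constituents.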