Vertex attribute terms fail at extracting it

mbojan commented 4 months ago

This is an exegesis of this question on SO:

library(ergm.multi)

data(florentine, package = "ergm")

plot(flobusiness)
plot(flomarriage)

flo <- Layer(business = flobusiness, marriage = flomarriage)
set.vertex.attribute(flo, attrname = "x", value = 1:32)

flo

# Works
summary(
  flo ~ L(~ edges, ~ business) + L(~ edges, ~ marriage)
)

# Works
ergm(
  flo ~ L(~ edges, ~ business) + L(~ edges, ~ marriage)
)

# Doesn't work
summary(
  flo ~ L(~ edges + nodecov("x"), ~ business) + 
    L(~ edges + nodecov("x"), ~ marriage)
)

# Error in `ergm_Init_abort()`:
# ! In term ‘nodecov’ in package ‘ergm’ (called from term ‘L’ in package ‘ergm.multi’): ‘x’ is/are not valid nodal attribute(s).
# Run `rlang::last_trace()` to see where the error occurred.

Brief debugging suggests that the term initialization function receives the network object stripped of all the attributes, hence x above is not found.

krivit commented 3 months ago

Not quite. The problem is that extracting subgraphs is extremely slow, so Layer() (and others) actually store the constituent networks (with edges removed) in a hidden attribute called .subnetcache. However, there is currently no mechanism to keep the cache up to date with the outer network, so if an attribute is created on the combined network, it won't show up when the network is "split" during initialisation.

For example, the following code works:

library(ergm.multi)

data(florentine, package = "ergm")

plot(flobusiness)
plot(flomarriage)
set.vertex.attribute(flobusiness, attrname = "x", value = 1:16)
set.vertex.attribute(flomarriage, attrname = "x", value = 1:16 + 16)

flo <- Layer(business = flobusiness, marriage = flomarriage)

flo

# Works, because "x" was set on each layer before Layer() was called
summary(
  flo ~ L(~ edges + nodecov("x"), ~ business) + 
    L(~ edges + nodecov("x"), ~ marriage)
)
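
The difference between the two orderings can be mimicked with a toy model in base R. This is only a sketch of the stale-cache idea: plain lists stand in for network objects, and make_combined() and split_layers() are hypothetical names, not functions from ergm.multi or network.

```r
# Toy model only: plain lists stand in for networks.
# make_combined() and split_layers() are hypothetical helpers.
make_combined <- function(layers) {
  sizes <- vapply(layers, function(l) l$size, integer(1))
  list(
    size  = sum(sizes),
    vattr = list(),   # vertex attributes of the combined object
    cache = layers    # snapshot of the layers, taken at construction time
  )
}

# "Splitting" returns the cached snapshot instead of re-extracting subgraphs.
split_layers <- function(combined) combined$cache

business <- list(size = 16L, vattr = list())
marriage <- list(size = 16L, vattr = list())

# Case 1: attribute set on the combined object after construction.
flo_late <- make_combined(list(business = business, marriage = marriage))
flo_late$vattr$x <- 1:32
late_visible <- "x" %in% names(split_layers(flo_late)$business$vattr)

# Case 2: attribute set on each layer before combining.
business$vattr$x <- 1:16
marriage$vattr$x <- 1:16 + 16
flo_early <- make_combined(list(business = business, marriage = marriage))
early_visible <- "x" %in% names(split_layers(flo_early)$business$vattr)

c(late = late_visible, early = early_visible)
```

In the toy model only the second ordering leaves "x" visible to code that works from the cached layers, which matches the behaviour reported in this issue.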

I should probably do some subset of the following:

Update the documentation in Layer(), Networks(), and NetSeries() to explain that the resulting data structure should be treated as read-only, at least with respect to vertex attributes.
Implement attribute-setting methods for the combined_networks class that will either refuse to set vertex attributes on a combined network or warn the user not to expect consistent behaviour.
Implement attribute-setting methods for the combined_networks class that will propagate attributes to the networks in the cache, as well as any subnetworks they might have.
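
A rough sketch of what the last two options could look like, again on a toy structure in base R rather than on real network objects. Everything here is hypothetical: the class toy_combined and the functions new_toy_combined(), set_vattr_warn(), and set_vattr_propagate() are not part of ergm.multi. The sketch also assumes that the vertices of each layer occupy consecutive blocks of the combined vertex set, and it does not handle nested subnetworks.

```r
# Toy model only: a classed list stands in for a combined network.
new_toy_combined <- function(sizes) {
  structure(
    list(
      sizes = sizes,
      vattr = list(),
      # One cached "layer" per entry of sizes; lapply() keeps the names.
      cache = lapply(sizes, function(n) list(size = n, vattr = list()))
    ),
    class = "toy_combined"
  )
}

# Option 2 (sketch): set the attribute but warn that the cache is not updated.
set_vattr_warn <- function(x, attrname, value) {
  warning("Vertex attributes set on a combined object are not seen by per-layer terms.")
  x$vattr[[attrname]] <- value
  x
}

# Option 3 (sketch): set the attribute and push each block into the cache.
set_vattr_propagate <- function(x, attrname, value) {
  stopifnot(length(value) == sum(x$sizes))
  x$vattr[[attrname]] <- value
  # Assumption: layer i owns a consecutive block of x$sizes[i] vertices.
  block  <- rep(seq_along(x$sizes), times = x$sizes)
  pieces <- split(value, block)
  for (i in seq_along(x$cache)) {
    x$cache[[i]]$vattr[[attrname]] <- pieces[[i]]
  }
  x
}

flo_toy <- new_toy_combined(c(business = 16L, marriage = 16L))
flo_toy <- set_vattr_propagate(flo_toy, "x", 1:32)
identical(flo_toy$cache$marriage$vattr$x, 17:32)
```

Unlike the network package's setters, these toy functions return a modified copy, so the result has to be assigned back. A real method would also need to fit in with how network::set.vertex.attribute() modifies its argument.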

mbojan commented 3 months ago

Oh, OK. What makes the subgraph extraction a bottleneck?

krivit commented 3 months ago

network::get.inducedSubgraph() just isn't very fast, and if you have, say, 300 small networks together in a big one, it needs to be run 300 times for every term that requires splitting them up.
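
To put rough numbers on that, here is a toy cost model in base R that only counts calls. extract_subgraph() is a hypothetical stand-in for an expensive extraction, the figure of 4 terms is an assumption made for illustration, and the cached variant is simplified to "build the pieces once and reuse them".

```r
# Toy cost model: count how often a stand-in extraction routine runs.
n_networks <- 300L
n_terms    <- 4L   # assumed number of terms that need the networks split

calls <- 0L
extract_subgraph <- function(i) {   # hypothetical stand-in, does no real work
  calls <<- calls + 1L
  i
}

# Without a cache: every term re-extracts every constituent network.
for (term in seq_len(n_terms)) {
  for (i in seq_len(n_networks)) extract_subgraph(i)
}
calls_uncached <- calls

# With a cache: the pieces are built once and every term reuses them.
calls <- 0L
cache <- lapply(seq_len(n_networks), extract_subgraph)
for (term in seq_len(n_terms)) pieces <- cache
calls_cached <- calls

c(uncached = calls_uncached, cached = calls_cached)
```

With these assumed numbers the uncached variant makes 1200 extraction calls and the cached one 300, and the gap grows linearly with the number of terms.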

statnet / ergm.multi

Vertex attribute terms fail at extracting it #27