Open CarterButts opened 3 years ago
(for this and https://github.com/statnet/network/issues/65)
I pointed out that using S3 dispatch would be a breaking change when it was requested that I use that instead of the function I originally proposed (network_from_data_frame()).
https://github.com/statnet/network/pull/20#issuecomment-564830962
Ok, so it sounds like we have an issue to address. IIUC, using the S3 dispatch for this function is what causes the breakage. I believe this choice was originally motivated by maintainability concerns. @CarterButts do you have a preferred solution?
I have mixed feelings about this. To me, a data frame is not a generalisation of a matrix or an array, though for bipartite networks, it's a bit less clear-cut.
That having been said, if bipartite=TRUE and the input looks like an adjacency matrix, it makes more sense for as.network.data.frame() to interpret it the way as.network.matrix() does: rows are actors and columns are events. From what I understand, it currently interprets it as the "expanded bipartite" representation, in which both rows and columns contain both actors and events, and actor-actor and event-event blocks are fixed at 0.
I think this would fix @CarterButts's issue. @knapply, is there any reason not to change the bipartite=TRUE handling of adjacency data frames to be consistent with the matrix method?
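To make the two representations discussed above concrete, here is a minimal base-R sketch. The helper name expand_bipartite() and the actor/event labels are hypothetical and are not part of the network package; the sketch only illustrates how a rectangular actors-by-events matrix relates to the "expanded bipartite" square matrix with zero actor-actor and event-event blocks.

```r
# Rectangular two-mode matrix: 4 actors (rows) by 3 events (columns).
two_mode <- matrix(
  c(0, 0, 1,
    0, 0, 1,
    1, 1, 0,
    1, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("actor", 1:4), paste0("event", 1:3))
)

# Hypothetical helper (an assumption for illustration): embed the rectangular
# block in the full (actors + events) square matrix. The actor-actor and
# event-event blocks stay at 0.
expand_bipartite <- function(m) {
  n_actors <- nrow(m)
  n_events <- ncol(m)
  n <- n_actors + n_events
  labels <- c(rownames(m), colnames(m))
  full <- matrix(0, n, n, dimnames = list(labels, labels))
  actors <- seq_len(n_actors)
  events <- n_actors + seq_len(n_events)
  full[actors, events] <- m      # actor -> event block
  full[events, actors] <- t(m)   # mirrored event -> actor block
  full
}

expanded <- expand_bipartite(two_mode)
dim(expanded)  # 7 7
```

The rectangular form is what the matrix method is described as expecting when bipartite=TRUE; the square form is the expanded representation.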
The input shouldn't be a data frame if it's supposed to be a matrix.
The errors could probably be more informative ("is this supposed to be an adjacency matrix? If so, use as.matrix() first."), but this is not a bug -- it's user error.
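A more informative error along those lines could be sketched as follows. Both functions are hypothetical and are not part of the network package, and the heuristic (all columns numeric, at least three columns, only 0/1 values) is an assumption chosen for illustration; a real check would need to be more careful.

```r
# Hypothetical heuristic (an assumption, not the package's logic): does a
# data frame look more like an adjacency matrix than an edge list?
looks_like_adjacency <- function(x) {
  stopifnot(is.data.frame(x))
  all_numeric <- all(vapply(x, is.numeric, logical(1)))
  if (!all_numeric || ncol(x) < 3) return(FALSE)
  values <- unlist(x, use.names = FALSE)
  all(values %in% c(0, 1, NA))
}

# Hypothetical guard that an edge-list method could call before parsing.
check_edgelist_input <- function(x) {
  if (looks_like_adjacency(x)) {
    stop("`x` looks like an adjacency matrix stored in a data frame. ",
         "If so, convert it with as.matrix() first.", call. = FALSE)
  }
  invisible(x)
}

adj_df  <- data.frame(a = c(0, 0, 1, 1), b = c(0, 0, 1, 0), c = c(1, 1, 0, 0))
edge_df <- data.frame(from = c(1, 2, 3), to = c(2, 3, 1))

looks_like_adjacency(adj_df)   # TRUE
looks_like_adjacency(edge_df)  # FALSE
```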
If memory serves, the reason this is an issue is because the original as.network() default skipped S3 dispatch and called as.network.matrix() directly instead of attempting to coerce the input to a matrix first. Something like as.network(as.matrix(x)).
I'm assuming this normalized the behavior of passing data frames as input that really should've been matrices.
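The dispatch change described above can be illustrated with a toy generic. Everything here (to_net and its methods) is hypothetical and merely stands in for as.network(); it is not the package's code. It shows only the general S3 mechanism: once a data.frame method exists, data frames stop falling through to the default method.

```r
# Toy generic standing in for as.network(); all names are hypothetical.
to_net <- function(x, ...) UseMethod("to_net")

to_net.matrix <- function(x, ...) {
  structure(list(kind = "adjacency", n = nrow(x)), class = "toy_net")
}

# Before: no data.frame method, so a data frame reaches the default method,
# which coerces it to a matrix.
to_net.default <- function(x, ...) to_net.matrix(as.matrix(x), ...)

d <- data.frame(a = c(0, 1, 1), b = c(1, 0, 0), c = c(1, 0, 0))
before <- to_net(d)$kind   # "adjacency"

# After: a data.frame method is defined and takes over dispatch for the
# same input, so the same call is now read as an edge list.
to_net.data.frame <- function(x, ...) {
  structure(list(kind = "edgelist", n = nrow(x)), class = "toy_net")
}
after <- to_net(d)$kind    # "edgelist"
```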
@knapply, perhaps I misremembered. Does as.network.data.frame() always treat the input data frame as an edge list of some type?
Has anything been changed in the as.network command? I can no longer read in all my empirical data after a statnet update. It used to work fine. Can't find the error. Thank you guys!
> WissOperativeAnpass1 <- read_excel("WissOperativeAnpass.xlsx")
> NetWissOperativeAnpass1 <- as.network(WissOperativeAnpass1)
Error: `loops` is `FALSE`, but `x` contains loops.
The following values are affected:
- `x[1, 1:2]`
- `x[2, 1:2]`
- `x[3, 1:2]`
- `x[4, 1:2]`
- `x[5, 1:2]`
- `x[6, 1:2]`
Also an issue here: https://community.rstudio.com/t/as-network-file-ergm-error-loops-is-false-but-x-contains-loops/115793
@jdohmen indeed, in the recent version of network a data.frame is interpreted as an edgelist (first two columns) plus optional edge attributes (the remaining columns, if any). In your case the data frame is a "two-mode" (non-square) adjacency matrix. What you need to do is convert it to an R matrix with e.g. data.matrix(), for example:
d <- data.frame(
  a = c(0,0,1,1),
  b = c(0,0,1,0),
  c = c(1,1,0,0)
)
net <- as.network(data.matrix(d), bipartite = TRUE)
as.matrix(net)
# a b c
# 1 0 0 1
# 2 0 0 1
# 3 1 1 0
# 4 1 0 0
In your case it will be something like
WissOperativeAnpass1 <- read_excel("WissOperativeAnpass.xlsx")
NetWissOperativeAnpass1 <- as.network(data.matrix(WissOperativeAnpass1), bipartite = TRUE)
... assuming you have no other columns in Excel beyond the adjacency information.
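If the spreadsheet does carry an extra label column, one way to handle it before conversion is sketched below. The data frame raw and its actor column are hypothetical stand-ins for an imported sheet. The step matters because data.matrix() converts character columns to integer codes (via factors) rather than raising an error, so a stray label column would silently end up in the matrix.

```r
# Hypothetical import result: the first column holds row labels,
# the remaining columns hold the two-mode adjacency data.
raw <- data.frame(
  actor = c("p1", "p2", "p3", "p4"),
  a = c(0, 0, 1, 1),
  b = c(0, 0, 1, 0),
  c = c(1, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns for the adjacency block and carry the
# labels over as row names.
is_num <- vapply(raw, is.numeric, logical(1))
adj <- data.matrix(raw[, is_num, drop = FALSE])
rownames(adj) <- raw$actor

dim(adj)  # 4 3
```

The resulting adj could then be passed to as.network(adj, bipartite = TRUE) as in the example above.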
@krivit @knapply @CarterButts, is it feasible to retain the original behavior by having an argument to as.network.data.frame() for the case above (https://github.com/statnet/network/issues/64#issuecomment-1156361501)? I'm thinking input = c("adjacency", "edgelist") (and then match.arg() internally) or simply adjacency = TRUE (or FALSE if edgelist)?
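A rough sketch of how such an argument might behave, written as a stand-alone pre-processing function rather than a change to the package. The function name prepare_network_input() is hypothetical, and input is the proposed argument, not an existing one in as.network.data.frame(). The final call is guarded so the sketch still runs where the network package is not installed.

```r
# Hypothetical pre-processing step; `input` mirrors the proposal above and
# is NOT an existing argument of as.network.data.frame().
prepare_network_input <- function(x, input = c("adjacency", "edgelist")) {
  stopifnot(is.data.frame(x))
  input <- match.arg(input)  # first choice is the default; bad values error
  switch(input,
    adjacency = data.matrix(x),  # hand a matrix to the matrix method
    edgelist  = x                # leave as-is for the data.frame method
  )
}

d <- data.frame(a = c(0, 0, 1, 1), b = c(0, 0, 1, 0), c = c(1, 1, 0, 0))
m <- prepare_network_input(d, input = "adjacency")
is.matrix(m)  # TRUE

# Same call as in the earlier example, only run if network is available.
if (requireNamespace("network", quietly = TRUE)) {
  net <- network::as.network(m, bipartite = TRUE)
}
```

With match.arg(), whichever choice is listed first becomes the default, so the order of the choices decides whether existing edge-list callers or existing adjacency callers keep working without changes.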
Then PLEASE also add a NODELIST (ego, alter1, alter2). Empirical survey data mostly comes as a NODELIST. I have spent so much time getting nodelists into statnet:
@jdohmen I've made a separate issue #79 about such structured input. I believe this is a so-called "adjacency list" (an ego id and the ids of its "neighbors").
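Until such input is supported directly, one possible workaround is to reshape the (ego, alter1, alter2) layout into a two-column edge list in base R. The data and the helper nodelist_to_edgelist() below are hypothetical and only illustrate the reshaping; they are not part of statnet.

```r
# Hypothetical survey-style nodelist: one row per ego, one column per
# named alter; NA marks an unused alter slot.
nodelist <- data.frame(
  ego    = c("A", "B", "C"),
  alter1 = c("B", "A", "A"),
  alter2 = c("C", NA,  "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stack the alter columns into a (from, to) edge list
# and drop the empty alter slots.
nodelist_to_edgelist <- function(x, ego = "ego") {
  alter_cols <- setdiff(names(x), ego)
  edges <- do.call(rbind, lapply(alter_cols, function(col) {
    data.frame(from = x[[ego]], to = x[[col]], stringsAsFactors = FALSE)
  }))
  edges <- edges[!is.na(edges$to), , drop = FALSE]
  rownames(edges) <- NULL
  edges
}

edges <- nodelist_to_edgelist(nodelist)
nrow(edges)  # 5
```

Since a data frame is now read as an edge list in its first two columns, the resulting edges data frame has the shape that as.network() expects for that input.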
When called with a two-mode adjacency matrix, as.network.matrix will correctly interpret this as a graph with an enforced bipartition, with the passed matrix being the off-diagonal portion of the full adjacency matrix. as.network.data.frame will not, and indeed returns errors if e.g. the matrix is square and has non-zero diagonal entries when loops==FALSE. (See also related issue of as.network.data.frame not respecting the same semantics as as.network.matrix.) Setting loops=TRUE and bipartite=TRUE does not rectify the problem, because it throws an error when loops are set on bipartite graphs.
For this issue, the needed fix is for as.network.data.frame to correctly detect and implement two-mode matrix processing. Here is a demonstration:
We should be seeing the same behavior for as.network.data.frame as as.network.matrix here, and are not.