statnet / network

Classes for Relational Data

inconsistent output classes for mixingmatrix #32

Closed martinamorris closed 3 years ago

martinamorris commented 4 years ago
d> class(mixingmatrix(fit_40$network, "hiv")) # a network object
[1] "mixingmatrix"

d> class(mixingmatrix(main, "hiv")) # an egodata object
[1] "matrix" "array"

and the structure

str(networkmixmat)
List of 2
 $ type  : chr "undirected"
 $ matrix: 'table' num [1:2, 1:2] 1550 258 258 16
  ..- attr(*, "dimnames")=List of 2
  .. ..$ From: chr [1:2] "0" "1"
  .. ..$ To  : chr [1:2] "0" "1"
 - attr(*, "class")= chr "mixingmatrix"

str(egodatamixmat)
num [1:2, 1:2] 212.8 12.4 24.7 12.7
 - attr(*, "dimnames")=List of 2
  ..$ ego  : chr [1:2] "0" "1"
  ..$ alter: chr [1:2] "0" "1"

not sure what the rationale is for these different structures and properties.

assigning @krivit just to get his take.

krivit commented 4 years ago

Now that I think about it, the class we should be using here is table, since this is a contingency table (sort of). This will also automagically make as.data.frame() and plot() methods work for mixingmatrix results.

krivit commented 4 years ago

On further thought, it looks like print.mixingmatrix also prints things like marginal totals, which depend on whether the network is unipartite or bipartite. (For unipartite networks, the off-diagonal elements effectively have to count for half, I believe.)
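
To make the point about off-diagonal cells concrete, here is a small base-R sketch using the undirected counts from the first comment. It assumes (this thread suggests it but does not confirm it) that each between-group tie shows up in both off-diagonal cells, while within-group ties are counted once on the diagonal. `n_ties` is a hypothetical helper, not part of network.

```r
# Undirected mixing matrix from the first comment (symmetric by construction).
m <- matrix(c(1550, 258, 258, 16), nrow = 2,
            dimnames = list(From = c("0", "1"), To = c("0", "1")))

# Hypothetical helper: count each between-group tie once.
# ASSUMPTION: diagonal cells hold within-group ties counted once,
# and every between-group tie appears in both off-diagonal cells.
n_ties <- function(m) sum(diag(m)) + sum(m[upper.tri(m)])

sum(m)     # 2082: naive grand total, double-counts the 258 between-group ties
n_ties(m)  # 1824: 1550 + 16 + 258
```

Under that assumption, naive row and column sums are not degree totals either, which is presumably why the print method treats undirected matrices specially.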

krivit commented 4 years ago

@CarterButts , what do you think?

mbojan commented 4 years ago

At this moment I see:

library(network)
data(emon)
mm <- mixingmatrix(emon$Texas, "Location")
str(mm)
#> List of 2
#>  $ type  : chr "directed"
#>  $ matrix: 'table' num [1:3, 1:3] 18 35 5 38 76 1 9 3 1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ From: chr [1:3] "B" "L" "NL"
#>   .. ..$ To  : chr [1:3] "B" "L" "NL"
#>  - attr(*, "class")= chr "mixingmatrix"

So a list, which means we can't use [ and other matrix methods out of the box, only mm$matrix[1,2] and so on. My suggestion would be to store $type as an attribute not as a "slot" and have the class be, as @krivit suggested, c("mixingmatrix", "table").
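
A minimal sketch of that suggestion, built by hand in base R rather than by `mixingmatrix()` itself: the counts are the `emon$Texas` values shown above, and the attribute name `type` is just the old list component reused for illustration.

```r
# Counts copied from the str() output above (column-major order).
counts <- matrix(c(18, 35, 5, 38, 76, 1, 9, 3, 1), nrow = 3,
                 dimnames = list(From = c("B", "L", "NL"),
                                 To   = c("B", "L", "NL")))

# Store the network type as an attribute and inherit from "table".
# This is an illustration, not the actual constructor used by network.
mm <- structure(counts, type = "directed", class = c("mixingmatrix", "table"))

mm["B", "L"]            # 38: plain matrix indexing works directly
attr(mm, "type")        # "directed"
df <- as.data.frame(mm) # dispatches to as.data.frame.table
names(df)               # "From" "To" "Freq"
```

Because the object is still a numeric matrix underneath, anything that works on a `table` (indexing, `as.data.frame()`, `plot()`) applies without going through `mm$matrix`.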

Depending on our S3-zeal we could alternatively have separate S3 classes for directed, undirected, bipartite networks and so on to have a hierarchy such as c("mixingmatrix.directed", "mixingmatrix", "table"), which might spare some if()-ing or switch()-ing in print methods and some other unforeseen places.
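
As a rough illustration of how such a hierarchy would replace `if()`-ing, here is a base-R sketch. The constructor `new_mixingmatrix()` and the generic `describe()` are hypothetical stand-ins (a real implementation would hang this on `print()`); only the class vector follows the proposal above.

```r
# Hypothetical constructor: pick the subclass once, at creation time.
new_mixingmatrix <- function(counts, directed) {
  subclass <- if (directed) "mixingmatrix.directed" else "mixingmatrix.undirected"
  structure(counts, class = c(subclass, "mixingmatrix", "table"))
}

# Hypothetical generic standing in for print(): no branching on type needed.
describe <- function(x, ...) UseMethod("describe")
describe.mixingmatrix <- function(x, ...) "mixing matrix"
describe.mixingmatrix.directed <- function(x, ...) {
  paste("directed", NextMethod())   # falls through to describe.mixingmatrix
}
describe.mixingmatrix.undirected <- function(x, ...) {
  paste("undirected", NextMethod())
}

counts <- matrix(c(2L, 4L, 4L, 6L), nrow = 2)
describe(new_mixingmatrix(counts, directed = TRUE))   # "directed mixing matrix"
describe(new_mixingmatrix(counts, directed = FALSE))  # "undirected mixing matrix"
```

The trade-off is more classes to register and document in exchange for methods that each handle exactly one case.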

krivit commented 3 years ago

I think this is a good direction to move in. @CarterButts , what do you think?

mbojan commented 3 years ago

I'm playing with these ideas at https://github.com/mbojan/network/tree/i32-mixingmatrix and have the following prototype:

data(flo, package="network")
net <- as.network(flo, directed=FALSE)
set.seed(666)
net %v% "a" <- sample(c(1,2,NA), network.size(net), replace=TRUE)
mm <- mixingmatrix(net, "a")
mm
#   1 2
# 1 1 4
# 2 4 3
# Note:  Marginal totals can be misleading for undirected mixing matrices.
str(mm)
#  'mixingmatrix' int [1:2, 1:2] 1 4 4 3
#  - attr(*, "dimnames")=List of 2
#   ..$ From: chr [1:2] "1" "2"
#   ..$ To  : chr [1:2] "1" "2"
#  - attr(*, "directed")= logi FALSE
#  - attr(*, "bipartite")= logi FALSE

dinet <- as.network(flo, directed=TRUE)
set.seed(666)
dinet %v% "a" <- sample(c(1,2,NA), network.size(dinet), replace=TRUE)
mm <- mixingmatrix(dinet, "a")
mm
#      To
# From   1  2 Sum
#   1    2  4   6
#   2    4  6  10
#   Sum  6 10  16
str(mm)
#  'mixingmatrix' int [1:2, 1:2] 2 4 4 6
#  - attr(*, "dimnames")=List of 2
#   ..$ From: chr [1:2] "1" "2"
#   ..$ To  : chr [1:2] "1" "2"
#  - attr(*, "directed")= logi TRUE
#  - attr(*, "bipartite")= logi FALSE

In short:

krivit commented 3 years ago

I like these changes. @CarterButts , what do you think?

One problem is that there are packages that may rely on the old output format. For transitioning, we would need to first update them to handle either format and only then release network, but what if some of their development versions now need the new network?

One option is to implement a method [[.mixingmatrix that will fake the old behaviour with a deprecation warning and remove it once the transition is complete.
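
A sketch of what such a shim could look like, in base R only. It covers `$` as well as `[[`, on the assumption that callers of the old list format mostly wrote `mm$matrix`. The helper `old_component()`, the warning text, and the mapping from the `directed`/`bipartite` attributes back to the old `type` string are all assumptions for illustration, not the code that went into network.

```r
# Hypothetical helper: rebuild one component of the old list-based format.
old_component <- function(x, name) {
  switch(name,
    # ASSUMPTION: the old $type was one of these three strings.
    type = if (isTRUE(attr(x, "bipartite"))) "bipartite"
           else if (isTRUE(attr(x, "directed"))) "directed"
           else "undirected",
    matrix = {
      m <- unclass(x)
      attr(m, "directed") <- NULL
      attr(m, "bipartite") <- NULL
      class(m) <- "table"
      m
    },
    NULL  # unknown names behave like a missing list element
  )
}

`$.mixingmatrix` <- function(x, name) {
  .Deprecated(msg = "mixingmatrix objects are no longer lists; use the matrix and its attributes directly.")
  old_component(x, name)
}

`[[.mixingmatrix` <- function(x, i, ...) {
  if (is.character(i) && length(i) == 1L && i %in% c("type", "matrix")) {
    .Deprecated(msg = "mixingmatrix objects are no longer lists; use the matrix and its attributes directly.")
    old_component(x, i)
  } else {
    NextMethod()  # ordinary element extraction
  }
}

# New-style object, built by hand from the prototype output above.
mm <- structure(matrix(c(1L, 4L, 4L, 3L), nrow = 2,
                       dimnames = list(From = c("1", "2"), To = c("1", "2"))),
                directed = FALSE, bipartite = FALSE, class = "mixingmatrix")

old_type <- suppressWarnings(mm$type)        # old-style access still works
old_mat  <- suppressWarnings(mm[["matrix"]]) # returns a plain 'table'
```

Without `suppressWarnings()`, each old-style access emits a deprecation warning, which is what reverse dependencies would see during the transition.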

mbojan commented 3 years ago

I'll look at reverse dependencies. If I find that anyone uses the old mixingmatrix format, temporarily faking the $ and [[ methods sounds good.

mbojan commented 3 years ago

I drafted the $ and [[ methods (not unmotivated by revdepchecks that had an ETE of 9 hrs even with 6 parallel workers...). I will still add some informative .Deprecated() messages.