Closed JosiahParry closed 1 month ago
There is the storage type and there is the semantic type (a combination of interface and semantics). rlang is about the storage type, vctrs is about semantics.
We've decided that S3 subclasses must explicitly inherit from a base vector/list class to be considered as such, even if they have vector/list storage. For instance, in the vctrs worldview an S3 model is a scalar and not a list, even though it has list storage.
FWIW `is.vector()` is incredibly low level and is probably not a good thing to consider in this conversation:

> is.vector(x) returns TRUE if x is a vector of the specified mode having no attributes other than names.
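
The quoted rule can be checked in base R alone. This is a minimal sketch; the `nb_like` class is a made-up stand-in, not the real spdep class.

```r
# A plain list: no attributes at all.
x <- list(c(2L, 11L), c(1L, 3L, 12L))
plain <- is.vector(x)

# Names are the one attribute is.vector() tolerates.
named <- is.vector(setNames(x, c("a", "b")))

# A class is an attribute like any other, so a classed list fails the check.
# "nb_like" is a hypothetical class name used only for illustration.
classed <- structure(x, class = "nb_like")
classed_is_vector <- is.vector(classed)

# The storage type is untouched by the class attribute.
classed_is_list <- is.list(classed)
classed_typeof <- typeof(classed)

c(plain = plain, named = named, classed = classed_is_vector)
#>   plain   named classed
#>    TRUE    TRUE   FALSE
```

So `is.vector()` answers "does this object carry any attribute besides names?" rather than "does this object behave like a vector?", which is why it is a poor yardstick here.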
`?vctrs::obj_is_list()` does a good job explaining the 2 rules that allow an object to be treated as a list in vctrs. `x` is a list if:

- `x` is a bare list with no class.
- `x` is a list explicitly inheriting from `"list"`.

As Lionel said, this distinction allows us to say that output from `lm()` is considered a scalar object rather than a vector-like list object, because its class is just `"lm"`.
But a `vctrs::list_of()` is considered a vector-like list, because its class structure is `c("vctrs_list_of", "vctrs_vctr", "list")`.
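
The two rules, and the `lm()` and `list_of()` cases, can be sketched as follows. The `my_model` and `my_list` classes are hypothetical names invented for this example, and the `lm()` fit on `mtcars` is just a convenient model object.

```r
library(vctrs)

# Rule 1: a bare list with no class attribute is a list.
bare <- list(1, "a")

# List storage, but no explicit "list" class: treated as a scalar.
# "my_model" is a hypothetical class used only for this sketch.
implicit <- structure(list(coef = 1, call = "a"), class = "my_model")

# Rule 2: same storage, but explicitly inheriting from "list".
# "my_list" is likewise hypothetical.
explicit <- structure(list(1, "a"), class = c("my_list", "list"))

# A real model object: list storage, class is just "lm".
fit <- lm(mpg ~ wt, data = mtcars)

checks <- c(
  bare     = obj_is_list(bare),
  implicit = obj_is_list(implicit),
  explicit = obj_is_list(explicit),
  lm       = obj_is_list(fit),
  list_of  = obj_is_list(list_of(1, 2))
)
checks
#>     bare implicit explicit       lm  list_of
#>     TRUE    FALSE     TRUE    FALSE     TRUE

# The model is not a vector either, so vctrs treats it as a scalar.
obj_is_vector(fit)
#> [1] FALSE
```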
This rule about what an explicit `"list"` class means runs very deep. If you have a `"list"` class on your object, we are going to try and index into it with `VECTOR_ELT()` or `VECTOR_PTR_RO()` at the C level, so it sure better be backed by a `VECSXP`.
`?vctrs::obj_is_vector()` similarly does a good job of describing what makes an object a vector in vctrs: https://vctrs.r-lib.org/reference/vector-checks.html#vectors-and-scalars
In particular, a good example here is the `vctrs_rcrd` type.

- `n` vectors of equal size
- Class: `c("vctrs_rcrd", "vctrs_vctr")`
- `vec_proxy()` method that returns a data frame

Other good examples are the `Duration`, `Interval`, and `Period` S4 classes from lubridate:

- `n` vectors of equal size
- Class: `"Period"`
- `vec_proxy()` method that returns a data frame

Thank you all for the very clear and thoughtful responses! Following the details section in Vector Checks (which should be more discoverable, imo it's really great writing!), this issue can be addressed by simply adding a new `vec_proxy()` method.
Overall, what I take away is that a comparison between rlang and vctrs should use the _bare_ functions in rlang. vctrs permits vector "status" to be obtained through other S3 generic methods (notably `vec_proxy()`).
```r
library(spdep)
library(vctrs)

# create listw object
nb <- cell2nb(10, 10)
listw <- nb2listw(nb)

# these tests should be the same
rlang::is_bare_list(nb)
#> [1] FALSE
rlang::is_bare_vector(nb)
#> [1] FALSE
vctrs::obj_is_list(nb)
#> [1] FALSE
vctrs::obj_is_vector(nb)
#> [1] FALSE

# tell {vctrs} that nb _is_ a vector
vec_proxy.nb <- function(x, ...) {
  unclass(x)
}

# do these tests with {vctrs} again and see it is now vector
# but still not list
vctrs::obj_is_list(nb)
#> [1] FALSE
vctrs::obj_is_vector(nb)
#> [1] TRUE

# give a format method for the record
format.swm_rcrd <- function(x, ...) {
  nbs <- field(x, "neighbours")
  card <- spdep::card(nbs)
  out <- paste("(", vapply(nbs, toString, character(1)), ")", sep = "")
  out[which(card == 0)] <- NA
  out
}

# try and create a record
x <- new_rcrd(listw, class = "swm_rcrd")
head(x)
#> <swm_rcrd[6]>
#> [1] (2, 11) (1, 3, 12) (2, 4, 13) (3, 5, 14) (4, 6, 15) (5, 7, 16)
```
Created on 2024-10-23 with reprex v2.1.0
This is somewhat of a philosophical question but with real consequences—so I apologize for it winding and curving!
TL;DR
The `nb` class is a list with attributes and no explicit `list` class.

This is how the following packages see it
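
A minimal sketch of that comparison, using a stand-in object rather than a real spdep `nb`. The `nb_like` class and the `region.id` values are made up for illustration; the predicates are the standard ones from base R, rlang, and vctrs.

```r
# Stand-in for an spdep `nb` object: a ragged list of integer neighbour ids
# with a class and one extra attribute, but no explicit "list" class.
# "nb_like" and the "region.id" values are hypothetical.
x <- structure(
  list(c(2L, 11L), c(1L, 3L, 12L), c(2L, 4L, 13L)),
  class = "nb_like",
  region.id = c("1", "2", "3")
)

views <- c(
  "base::is.list"         = is.list(x),                # storage type only
  "base::is.vector"       = is.vector(x),              # fails: has attributes
  "rlang::is_list"        = rlang::is_list(x),         # storage type only
  "rlang::is_vector"      = rlang::is_vector(x),       # storage type only
  "rlang::is_bare_list"   = rlang::is_bare_list(x),    # fails: has a class
  "rlang::is_bare_vector" = rlang::is_bare_vector(x),  # fails: has a class
  "vctrs::obj_is_list"    = vctrs::obj_is_list(x),     # fails: no "list" class
  "vctrs::obj_is_vector"  = vctrs::obj_is_vector(x)    # fails: no proxy method
)

# Print as a one-column table for easier reading.
data.frame(result = views)
```

On this stand-in, only the storage-type checks (`is.list()`, `rlang::is_list()`, `rlang::is_vector()`) return `TRUE`; every check that looks at attributes or class returns `FALSE`.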
Background
One thing that has been bothering me since 2021 is that the `nb` and `listw` classes from spdep cannot be easily integrated into the tidyverse.

The `nb` class object is a ragged array stored in a list. A list is a vector and thus can work with vctrs and the tidyverse in general. However, the `nb` class object does not have the `list` class explicitly added. There is disagreement across base R, rlang, and vctrs about what constitutes a vector and a list.

Motivation
The `rcrd` class from `vctrs` provides a nice opportunity to embed the `listw` class into the tidyverse workflow in a much more seamless way than has been possible in the past.

I am quite interested in thinking through how I can make spatial statistics more accessible to the R ecosystem and this is a big part of it. I have a package, sfdep, which provides tidyverse compatibility by partitioning the two component lists, `neighbours` and `weights`, into two separate columns in a data frame. Ideally, they would be a single column, since two separate columns can fall out of sync.

Question
What constitutes a `list` and a `vector` in vctrs, and should there be agreement between `rlang` and `vctrs` as to what this is?

Additionally, do you all have guidance on how one can address this? FWIW, I am not the author or maintainer of `{spdep}`, and adding the `list` subclass is out of the question, as demonstrated in https://github.com/r-spatial/spdep/issues/59.

Reprex
Created on 2024-10-22 with reprex v2.1.0