Closed LTLA closed 3 years ago
Sounds good but does this work?
> ConstantArray(5:7, value=NA)
Error in new_DelayedArray(seed, Class = "ConstantMatrix") :
the supplied seed must have exactly 2 dimensions when the specified
class (ConstantMatrix) extends DelayedMatrix
Seems like some unit tests are needed.
... only for matrices, apparently. Oops. Fixed.
Thanks. A few more things:
ConstantMatrix needs to contain ConstantArray so the following returns TRUE:
> is(ConstantArray(5:6, value=NA), "ConstantArray")
[1] FALSE
Another benefit of doing this is that ConstantMatrix will then inherit the ConstantArray representation so you don't need to specify it again in the ConstantMatrix class definition. It will just be:
setClass("ConstantMatrix", contains=c("ConstantArray", "DelayedMatrix"))
Unfortunately, after doing this, you need to define coercion methods from ConstantMatrix to ConstantArray and from ConstantArray to ConstantMatrix to prevent some bad things from happening. See how these coercions are defined for RleArray/RleMatrix in RleArray-class.R and do the same thing.
Validity method: Use setValidity2 instead of setValidity. Use msg <- validate_dim_slot(object, "dim") to validate the dim slot (it will also make sure that the slot doesn't contain NAs). Don't try to return all the problems in msg but return only the first one. See validity method for SparseArraySeed for an example.
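To illustrate the "return only the first problem" pattern, here is a minimal base-R sketch. It deliberately uses a toy class (ToySeed, hypothetical) and plain setValidity from the methods package, so it runs without S4Vectors or DelayedArray; in the actual package you would use setValidity2 and validate_dim_slot as described above.

```r
library(methods)

# Toy class for illustration only; this is NOT the real ConstantArraySeed.
setClass("ToySeed", representation(dim="integer", value="vector"))

setValidity("ToySeed", function(object) {
    # Check the dim slot first and stop at the first problem found,
    # instead of accumulating every problem into one long message.
    if (anyNA(object@dim) || any(object@dim < 0L))
        return("'dim' slot must contain non-negative, non-NA integers")
    if (length(object@value) != 1L)
        return("'value' slot must have length 1")
    TRUE
})

# Valid objects, including a value of type list.
ok1 <- new("ToySeed", dim=c(4L, 3L), value=NA)
ok2 <- new("ToySeed", dim=c(4L, 3L), value=list(55))

# Both slots are wrong here, but only the dim problem is reported.
msg <- tryCatch(new("ToySeed", dim=c(NA_integer_, 3L), value=1:2),
                error=function(e) conditionMessage(e))
```

Declaring the slot as "vector" rather than "ANY" is what lets the list-valued case through without an is.atomic() test.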
In addition to all the atomic types, the DelayedArray framework also supports arrays of type list so this would need to be supported:
ConstantArray(4:3, value=list(55))
ConstantArray(4:3, value=list(letters))
I suggest you replace value="ANY" with value="vector" in the class definition (that's how the nzdata slot is specified for SparseArraySeed). Then just remove the !is.atomic(object@value) test in the validity method and you should be good.
The extract_array() method should just return array(x@value, get_Nindex_lengths(index, dim(x))). Your .get_constant_dim helper is not needed.
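A rough base-R sketch of that one-liner, for anyone following along. The index_lengths() function below is a hypothetical stand-in for DelayedArray's get_Nindex_lengths() and assumes the usual convention that a NULL subscript selects the whole dimension; extract_constant() is likewise a made-up name, not the package's method.

```r
# Hypothetical stand-in for get_Nindex_lengths(): one length per dimension,
# where a NULL subscript means "take the whole dimension".
index_lengths <- function(index, dim) {
    vapply(seq_along(dim), function(i) {
        idx <- index[[i]]
        if (is.null(idx)) as.integer(dim[[i]]) else length(idx)
    }, integer(1))
}

# The extraction itself: the subscript values don't matter for a constant
# array, only how many elements are requested along each dimension.
extract_constant <- function(value, dim, index)
    array(value, index_lengths(index, dim))

# Rows 1:2 and all columns of a 5 x 7 all-NA array.
out <- extract_constant(NA, c(5L, 7L), list(1:2, NULL))

# Works for a value of type list too: all rows, column 2 of a 4 x 3 array.
out2 <- extract_constant(list(55), c(4L, 3L), list(NULL, 2L))
```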
What's considered a zero depends on the type of a DelayedArray object. For example, if the type is character, the zero value is the empty string. If the type is list, it's NULL. If you use:
zero <- vector(type(x), length=1L)
identical(x@value, zero)
in your is_sparse() method then it should do the right thing whatever the type. (However I just discovered that dense2sparse() is broken on arrays of type list so avoid this in your tests.)
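The type-dependent zero can be checked in plain base R. In this small sketch, zero_for_type() and is_zero_value() are hypothetical helper names used only for illustration, and the type is passed in as a string rather than taken from type(x).

```r
# Hypothetical helpers: build the length-1 "zero" for a type and compare.
zero_for_type <- function(type) vector(type, length=1L)
is_zero_value <- function(value, type) identical(value, zero_for_type(type))

z_int  <- is_zero_value(0L, "integer")        # zero is 0L
z_chr  <- is_zero_value("", "character")      # zero is the empty string
z_list <- is_zero_value(list(NULL), "list")   # zero is a list holding NULL
z_na   <- is_zero_value(NA_real_, "double")   # NA is not a zero

# identical() is strict about storage type: an integer 0 does not match
# the zero of type double.
z_mixed <- is_zero_value(0L, "double")
```

Here z_int, z_chr and z_list are TRUE, while z_na and z_mixed are FALSE, so an all-NA constant array would not be reported as sparse.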
Export the ConstantArraySeed() constructor function, especially since you document it.
Done... I think I got all of the above.
Thanks. Sorry to insist but you don't need to specify the representation in the class definition of ConstantMatrix 'cause you inherit it from ConstantArray (hey, I gave you above the exact setClass statement to use for this class). Other than that, everything looks good.
ooops.
Just for fun, here's one way to make a delayed constant matrix without using the new ConstantMatrix stuff:
out2 <- DelayedArray(matrix(NA_real_))[rep(1, 1e6), rep(1, 1e6)]
However, it's 10x slower than:
out <- ConstantArray(c(1e6, 1e6), value=NA_real_)
and also the memory footprint of the resulting object is 5000x bigger.
Nothing too fancy here, this class just represents an array with a constant value for all of its cells.
This was originally implemented as part of SingleCellExperiment, which contains a primitive implementation of what I hope will be the combineRows implementation for SummarizedExperiment objects. The idea is to create all-NA matrices for assays that are not present in some SE objects, thus allowing them to be efficiently combined (without pretending that they're zero, which they're not).
I thought about using an RleArray for this purpose but it breaks pretty quickly with:
I guess I could make a list of Rle objects, but it's a bit of a chore to have to create the chunks manually. Probably like:
By comparison, the proposed class is simple to understand and use (and maintain) by just calling:
Data extraction also seems faster based on some cursory timings, but this isn't a major consideration here.