Open Liubuntu opened 6 years ago
Hi Qian,
There are 2 ways to deal with this: (1) the quick-and-dirty way that would only let you wrap a List object in a 1-dimensional DelayedArray object, and (2) the more general solution that would let you wrap a List object in a DelayedArray object of arbitrary dimensions.
But first some background: objects passed to DelayedArray() need to comply with the "seed contract", which means that they need to have dimensions. List derivatives don't support dim() in general (with the notable exception of ArrayGrid objects). And trying to set the "dim" attribute on them directly doesn't seem to work (for some reason, the methods package doesn't let us put attributes on S4 objects):
x <- rep(IntegerList(1:5, integer(0), NA, 3:-2), 3)
dim(x) <- c(6, 2)
# Error in dim(x) <- c(6, 2) : invalid first argument
Solution (1) only requires that you define "dim" and "extract_array" methods for List objects:
setMethod("dim", "List", function(x) length(x))
setMethod("extract_array", "List",
    function(x, index)
    {
        x_dim <- dim(x)
        ans_dim <- DelayedArray:::get_Nindex_lengths(index, x_dim)
        i <- index[[1L]]
        if (!is.null(i))
            x <- x[i]
        DelayedArray:::set_dim(as.list(x), ans_dim)
    }
)
Now that List objects comply with the "seed contract", they can be passed to DelayedArray():
DelayedArray(x)
# <12> DelayedArray object of type "list":
# [1] [2] [3] .
# 1, 2, 3, 4, 5 NA .
# [11] [12]
# NA 3, 2, 1, 0, -1, -2
However, I'm not sure about the exact consequences of defining this "dim" method for List objects, and I don't have a good feeling about it. There is a lot of code around the place that relies on things like if (is.null(dim(x))) in order to decide how to operate on an object. This is why I call this a quick-and-dirty solution.
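To make that concern concrete, here is a minimal base-R sketch. ToyList and describe() are hypothetical names made up for illustration (they are not part of S4Vectors or DelayedArray); the point is only that adding a "dim" method silently changes which branch existing code takes.

```r
library(methods)

# Hypothetical stand-in for a List derivative: a thin S4 class around a list.
setClass("ToyList", representation(elts = "list"))
setMethod("length", "ToyList", function(x) length(x@elts))

# Typical generic code that branches on the presence of dimensions.
describe <- function(x) if (is.null(dim(x))) "vector-like" else "array-like"

toy <- new("ToyList", elts = list(1:5, integer(0), NA))
before <- describe(toy)

# Defining a "dim" method flips the branch taken by describe(),
# even though the object itself has not changed.
setMethod("dim", "ToyList", function(x) length(x))
after <- describe(toy)
```

Any code path that was written with vector-like List objects in mind would be affected in the same way.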
Solution (2) is more general and much cleaner. It involves the following:
The easiest way to set arbitrary dimensions on a List object is to wrap the object in a thin wrapper that can hold the dim information. Something like this:
setClass("ListArraySeed",
    contains="Array",
    representation(
        dim="integer",
        L="List"
    )
)
Note that we use composition here instead of inheritance, which is a key aspect of this solution and why it is cleaner and safer than solution (1).
seed <- new("ListArraySeed", dim=c(6L,2L), L=x)
dim(seed)
# [1] 6 2
dimnames(seed) # no dimnames for now but this would be easy to support
# NULL
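The same composition idea can be tried out in base R without any Bioconductor dependency. This is only a sketch: ToyListSeed, its slot names, and the validity check (that the product of the dimensions matches the number of elements) are assumptions added for illustration and are not part of the ListArraySeed class above.

```r
library(methods)

# Hypothetical toy seed: holds a plain list plus the dimensions to impose on it.
setClass("ToyListSeed",
    representation(dims = "integer", elts = "list"),
    validity = function(object) {
        # Assumed sanity check: the dimensions must account for every element.
        if (prod(object@dims) != length(object@elts))
            return("product of 'dims' must equal the number of elements")
        TRUE
    }
)
setMethod("dim", "ToyListSeed", function(x) x@dims)

toy_seed <- new("ToyListSeed", dims = c(3L, 2L), elts = as.list(1:6))
toy_dim <- dim(toy_seed)

# A mismatch between dimensions and length is rejected at construction time.
bad <- tryCatch(
    new("ToyListSeed", dims = c(4L, 2L), elts = as.list(1:6)),
    error = function(e) "rejected"
)
```

Because the wrapped list is stored in a slot rather than inherited from, the list class itself never gains a "dim" method.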
Before we can pass this to DelayedArray(), we need to define an "extract_array" method:
### Will work if x@L supports linear (i.e. 1D-style) subsetting and as.list().
setMethod("extract_array", "ListArraySeed",
    function(x, index)
    {
        x_dim <- dim(x)
        ans_dim <- DelayedArray:::get_Nindex_lengths(index, x_dim)
        i <- DelayedArray:::to_linear_index(index, x_dim)
        ans <- as.list(x@L[i])
        DelayedArray:::set_dim(ans, ans_dim)
    }
)
Then:
DelayedArray(seed)
# <6 x 2> DelayedMatrix object of type "list":
# [,1] [,2]
# [1,] 1, 2, 3, 4, 5 NA
# [2,] 3, 2, 1, 0, -1, -2
# [3,] NA 1, 2, 3, 4, 5
# [4,] 3, 2, 1, 0, -1, -2
# [5,] 1, 2, 3, 4, 5 NA
# [6,] 3, 2, 1, 0, -1, -2
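The linear-subsetting step used by the "extract_array" method above can be illustrated in base R. The helper nindex_to_linear() below is a hypothetical stand-in written for this example, not the DelayedArray internal; it assumes column-major layout and that a NULL subscript means "everything along that dimension".

```r
# Hypothetical helper: turn a list of per-dimension subscripts into
# column-major linear indices into the underlying 1D object.
nindex_to_linear <- function(index, dim) {
    stopifnot(length(index) == length(dim))
    # A NULL subscript selects the whole extent of that dimension.
    subs <- Map(function(i, d) if (is.null(i)) seq_len(d) else i, index, dim)
    # expand.grid() varies the first subscript fastest (column-major order).
    grid <- as.matrix(expand.grid(subs))
    strides <- c(1L, cumprod(dim)[-length(dim)])
    as.integer((grid - 1L) %*% strides + 1L)
}

L <- as.list(1:12)                      # 12 elements viewed as a 6 x 2 matrix
idx <- nindex_to_linear(list(c(2L, 5L), NULL), c(6L, 2L))

# Subset linearly, then set the dimensions of the extracted block.
block <- array(unlist(L[idx]), dim = c(2L, 2L))
ref <- matrix(1:12, nrow = 6, ncol = 2)[c(2, 5), ]
```

Here block holds the same values as rows 2 and 5 of the 6 x 2 view, which is why the seed only needs to support 1D-style subsetting.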
And finally, in the same fashion that we have the HDF5ArraySeed/HDF5Array/HDF5Matrix trio, we would need to complete this with the ListArray and ListMatrix classes (which would extend DelayedArray and DelayedMatrix, respectively), and with the ListArray() constructor. So you could just do something like:
M <- ListArray(x, dim=c(6, 2))
and this would return a ListMatrix instance which would degrade to a DelayedMatrix instance as soon as you start operating on it, e.g. M[ , 1], t(M), etc.
As an extra convenience, the DelayedArray() constructor could be modified to work directly on a List object x, in which case it would just call ListArray(x, dim=length(x)).
So this is feasible, but will require some significant new developments. Adding this to the TODO list but don't think I'll be able to get to this before September...
H.
The 2nd solution looks good and robust. I guess we are in no rush for this, but it would be a great feature to add to DelayedArray. We can close the issue for now if you want, as you already have it in your TODO list. :)
Let's keep it open. My TODO list is virtual and open issues are part of it ;-)
Hi @hpages,
In the VariantAnnotation package, CollapsedVCF (or ExpandedVCF) objects store their data entries as IntegerList / CharacterList... objects. In the development of VCFArray, we are trying to represent these data entries as DelayedArray instances. Currently we convert the data entries into an array to add dimensions, and then call the DelayedArray constructor on the array. Is it possible to have the DelayedArray constructor work directly on a List object, so that the data are still stored internally in the List structure, which is more efficient than an ordinary list? @mtmorgan