Bioconductor / DelayedArray

A unified framework for working transparently with on-disk and in-memory array-like datasets
https://bioconductor.org/packages/DelayedArray

Replacement method [<- should support the same subassignment as a regular array #32

Closed mikejiang closed 6 years ago

mikejiang commented 6 years ago

It is neither intuitive nor convenient for users to have to construct a DelayedArray index every time they want to update the data. Can this be done automatically behind the scenes in the [<- method, so that existing user code doesn't need major changes to do a partial write to a DelayedArray?

> a <- array(0, dim=c(2, 2))
> A <- DelayedArray(a)
> a[1,1] <- 1
> a[4] <- 2
> a
     [,1] [,2]
[1,]    1    0
[2,]    0    2
> A[1,1] <- 1
Error in `[<-`(`*tmp*`, 1, 1, value = 1) : 
  subassignment to a DelayedArray object 'x' (i.e. 'x[i] <- value') is supported only when the
  subscript 'i' is a logical DelayedArray object with the same dimensions as 'x' and when 'value' is a
  scalar (i.e. an atomic vector of length 1)
> A[1] <- 1
Error in `[<-`(`*tmp*`, 1, value = 1) : 
  subassignment to a DelayedArray object 'x' (i.e. 'x[i] <- value') is supported only when the
  subscript 'i' is a logical DelayedArray object with the same dimensions as 'x' and when 'value' is a
  scalar (i.e. an atomic vector of length 1)
> idx <- DelayedArray(array(c(T,F,F,F), dim  = c(2,2)))
> A[idx] <- 1
> A
<2 x 2> DelayedMatrix object of type "double":
     [,1] [,2]
[1,]    1    0
[2,]    0    0
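
Until ordinary subscripts are accepted, the mask construction shown above could be wrapped in a small helper. This is only a sketch: index_to_mask() is a hypothetical name, not part of DelayedArray, and it assumes the restricted form described in the error message (a logical DelayedArray index with the same dimensions as 'x', and a scalar value). It also builds the mask in memory, so it is only practical for small arrays.

```r
library(DelayedArray)

# Hypothetical helper, not part of DelayedArray: build a logical mask with
# the same dimensions as 'x' from ordinary matrix subscripts 'i' and 'j'.
index_to_mask <- function(x, i, j) {
  mask <- array(FALSE, dim = dim(x))
  mask[i, j] <- TRUE   # ordinary in-memory subassignment on the mask
  DelayedArray(mask)
}

A <- DelayedArray(array(0, dim = c(2, 2)))
A[index_to_mask(A, 1, 1)] <- 1   # stands in for the intended A[1, 1] <- 1
A
```

The helper only covers scalar values, so it does not address updating a block with another matrix.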
mikejiang commented 6 years ago

Supporting updates with a sub-matrix (a regular matrix or another DelayedArray) is also important:

> a <- array(0, dim=c(3, 3))
> b <- array(1, dim = c(2,2))
> a
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0
> b
     [,1] [,2]
[1,]    1    1
[2,]    1    1
> a[1:2, 1:2] <- b
> a
     [,1] [,2] [,3]
[1,]    1    1    0
[2,]    1    1    0
[3,]    0    0    0
hpages commented 6 years ago

Hi Mike,

Yep, subassignment to a DelayedArray object was very limited until now, partly because nobody requested it and partly because I was not sure how best to support it. It turns out that the refactoring of DelayedArray internals that I did in April this year makes it easier to fully support subassignment to a DelayedArray object (via the addition of a new type of DelayedOp object). So here it is (DelayedArray 0.7.43).

Try it and let me know how it goes for you.

Note that you can use showtree(x) if you are curious to see the tree of delayed operations carried by a DelayedArray object:

library(DelayedArray)
M <- DelayedArray(matrix(1:20, nrow=5))

showtree(M)
# 5x4 integer: DelayedMatrix object
# └─ 5x4 integer: [seed] matrix object

M[2:1, ] <- M[4:5, ] + 55.5
M
# <5 x 4> DelayedMatrix object of type "double":
#      [,1] [,2] [,3] [,4]
# [1,] 60.5 65.5 70.5 75.5
# [2,] 59.5 64.5 69.5 74.5
# [3,]  3.0  8.0 13.0 18.0
# [4,]  4.0  9.0 14.0 19.0
# [5,]  5.0 10.0 15.0 20.0

showtree(M)
# 5x4 double: DelayedMatrix object
# └─ 5x4 double: Subassign
#    ├─ 5x4 integer: [seed] matrix object
#    └─ right value: 2x4 double: DelayedMatrix object
#                    └─ 2x4 double: Unary iso op stack
#                       └─ 2x4 integer: Subset
#                         └─ 5x4 integer: [seed] matrix object
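
For comparison, the sub-matrix update requested earlier in the thread can be written the same way. This is a sketch that assumes DelayedArray 0.7.43 or later and an ordinary matrix on the right-hand side. It checks the realized result against the equivalent in-memory subassignment.

```r
library(DelayedArray)

a <- array(0, dim = c(3, 3))
b <- array(1, dim = c(2, 2))

A <- DelayedArray(a)
A[1:2, 1:2] <- b   # delayed: recorded on A as a Subassign operation

a[1:2, 1:2] <- b   # ordinary in-memory equivalent, for comparison

showtree(A)        # the tree should now contain a Subassign node

# Realize A in memory and compare it with the ordinary array.
stopifnot(isTRUE(all.equal(as.array(A), a)))
```
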
hpages commented 6 years ago

Closing this. Please create a new issue if you run into problems with subassignment.