Bioconductor / DelayedArray

A unified framework for working transparently with on-disk and in-memory array-like datasets
https://bioconductor.org/packages/DelayedArray

Serialized DelayedArray instances from release (BioC 3.6) are broken in devel (BioC 3.7) #17

Closed PeteHaitch closed 6 years ago

PeteHaitch commented 6 years ago

The recent internal changes to the DelayedArray class mean that DelayedArray instances serialized using the release version of BioC are broken when read in using the devel branch. Unfortunately, updateObject,DelayedArray-method is unable to fix them. A reproducible example is shown below.

Can updateObject,DelayedArray-method be fixed? If necessary, can updateObject,SummarizedExperiment-method please be similarly updated?

This was brought to my attention by @j-lawson whose MIRA package uses a serialized BSseq instance in tests (BSseq extends SummarizedExperiment); the object is available from https://github.com/databio/MIRA/blob/4eaaca95cbb52bac6dadcee45261c463d748c219/data/exampleBSseqObj.RData. The serialized instance was created by something like bsseq::BS.chr22[1:20, ], where bsseq::BS.chr22 is a BSseq object with DelayedMatrix instances as assay elements.

Reprex

Create some data and serialize it using release (BioC 3.6)

library(DelayedArray)
x <- DelayedArray(matrix(1:10, ncol = 2))
y <- x[1:3, ]
saveRDS(y, "~/test.rds")
Session info

```R
sessionInfo()
R version 3.4.3 Patched (2018-01-20 r74142)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.9 (Santiago)

Matrix products: default
BLAS: /jhpce/shared/jhpce/core/conda/miniconda-3/envs/svnR-3.4.x/R/3.4.x/lib64/R/lib/libRblas.so
LAPACK: /jhpce/shared/jhpce/core/conda/miniconda-3/envs/svnR-3.4.x/R/3.4.x/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] DelayedArray_0.4.1  IRanges_2.12.0      S4Vectors_0.16.0
[4] BiocGenerics_0.24.0 matrixStats_0.53.1

loaded via a namespace (and not attached):
[1] compiler_3.4.3
```

Load the data using devel (BioC 3.7)

library(DelayedArray)
y <- readRDS("~/test.rds")
validObject(y)
#> [1] TRUE
y
#> Error in new("standardGeneric", .Data = function (object)  :
#>   DelayedMatrix object uses internal representation from DelayedArray < 0.5.11
#>   and cannot be displayed or used. Please update it with:
#>     object <- updateObject(object, verbose=TRUE)
updateObject(y, verbose = TRUE)
#> updateObject(object="ANY") default for object of class 'matrix'
#> [updateObject] DelayedMatrix object uses internal representation from
#> [updateObject] DelayedArray < 0.5.11. Updating it ...
#> Error in initialize(value, ...) :
#>   invalid names for slots of class “DelayedMatrix”: index, delayed_ops
Session info

```R
sessionInfo()
R Under development (unstable) (2018-03-21 r74433)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.9 (Santiago)

Matrix products: default
BLAS: /jhpce/shared/jhpce/core/conda/miniconda-3/envs/svnR-devel/R/devel/lib64/R/lib/libRblas.so
LAPACK: /jhpce/shared/jhpce/core/conda/miniconda-3/envs/svnR-devel/R/devel/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices datasets  utils
[8] methods   base

other attached packages:
[1] DelayedArray_0.5.30 BiocParallel_1.13.3 IRanges_2.13.28
[4] S4Vectors_0.17.41   BiocGenerics_0.25.3 matrixStats_0.53.1
[7] devtools_1.13.5

loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0    withr_2.1.2    memoise_1.1.0  knitr_1.20
[6] digest_0.6.15
```

mtmorgan commented 6 years ago

Just a note that @hpages is out of the office for the next week, so response may be delayed.

PeteHaitch commented 6 years ago

Thanks, Martin

On Mon., 16 Apr. 2018, 5:57 pm Martin Morgan wrote:

Just a note that @hpages is out of the office for the next week, so response may be delayed.

hpages commented 6 years ago

Hi Pete,

I actually changed the DelayedArray internals twice during this devel cycle: first in version 0.5.11 (commit 44308c1f02f6b5bf7e82a300213ef3a6f0e4d26d from Dec 22, 2017), and again more recently in version 0.5.24 (commit 6a3febf165f15101036a7adf6fb9a4830bda35f2 from Apr 3, 2018). Each time I also provided an updateObject method, but unfortunately the second method only worked on objects using the immediately preceding internals. More precisely, updateObject() was able to update from internals v1 to internals v2 but not from internals v0 to internals v2. The error you get in your example above is because y is using internals v0.

I just fixed updateObject() in DelayedArray 0.5.33 (commit 7f2d1bf6ea5bf3cbd338ee81a66f991933adc620) so it should be able to update objects using internals v0.
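
To illustrate the v0/v1/v2 situation described above, here is a minimal base-R sketch of a stepwise updater. It is not the actual DelayedArray code: the toy objects are plain lists with an explicit `version` field, and the names `migrations`, `update_toy`, `payload`, `data`, and `ops` are all hypothetical. The point is only that chaining one migration per adjacent version pair lets a v0 object reach v2 without a dedicated v0-to-v2 method.

```r
# Hypothetical toy model: objects are lists tagged with an integer `version`.
# (The real DelayedArray classes are S4; this is only an illustration.)

# One migration function per adjacent version pair, keyed by source version.
migrations <- list(
  "0" = function(obj) list(version = 1L, data = obj$payload),           # v0 -> v1
  "1" = function(obj) list(version = 2L, data = obj$data, ops = list()) # v1 -> v2
)

# Apply migrations one step at a time until the target version is reached.
update_toy <- function(obj, target = 2L) {
  while (obj$version < target) {
    step <- migrations[[as.character(obj$version)]]
    if (is.null(step)) {
      stop("no migration available from version ", obj$version)
    }
    obj <- step(obj)
  }
  obj
}

# A v0 object and a v1 object both end up at v2.
from_v0 <- update_toy(list(version = 0L, payload = 1:3))
from_v1 <- update_toy(list(version = 1L, data = 4:5))
stopifnot(from_v0$version == 2L, from_v1$version == 2L)
```

An updater that only knows the last step (v1 to v2) fails on v0 input, which matches the failure mode reported in this issue.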

I'll update the updateObject method for SummarizedExperiment objects later today.

Sorry for the inconvenience.

PeteHaitch commented 6 years ago

Thanks, Hervé!

hpages commented 6 years ago

@PeteHaitch Hey Pete, an update on this:

With SummarizedExperiment 1.9.18, calling updateObject() on a SummarizedExperiment object now updates its assays. Note that for this to work on a BSseq object that has "old" DelayedArray objects in its assays, I had to make a small tweak to the "updateObject" method for BSseq objects. See commit c70c9e345cf9f708881c1cfb3ce60eb4d769a03a in bsseq. With these changes, I was able to call updateObject() on exampleBSseqObj and get a valid BSseq instance.
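
For anyone wanting to apply this to a saved .RData file, the following is a rough sketch only. `update_rdata` is a hypothetical helper written for this example (it is not part of any Bioconductor package); it loads every object from the file and passes each one through a caller-supplied updater. The commented usage assumes SummarizedExperiment >= 1.9.18, a bsseq build that includes the commit mentioned above, and that the file defines an object named exampleBSseqObj.

```r
# Hypothetical helper: load all objects from an .RData file into a private
# environment, run `updater` on each, and return them as a named list.
update_rdata <- function(path, updater) {
  env <- new.env()
  object_names <- load(path, envir = env)  # load() returns the names it created
  updated <- lapply(object_names, function(nm) updater(get(nm, envir = env)))
  stats::setNames(updated, object_names)
}

# Intended use (not run here; requires the package versions noted above):
# library(bsseq)
# objs <- update_rdata("exampleBSseqObj.RData",
#                      function(x) updateObject(x, verbose = TRUE))
# validObject(objs[["exampleBSseqObj"]])
```

Loading into a separate environment avoids overwriting objects of the same name in the workspace, and the updated objects can then be re-saved with save() or saveRDS() if desired.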

PeteHaitch commented 6 years ago

Thanks!

j-lawson commented 6 years ago

Thanks for the updates Hervé!