Closed charles-plessy closed 4 years ago
Hi @charles-plessy, does this example correctly highlight what you want to do? I think we might have inadvertantly completely taken away the ability to do this.
suppressPackageStartupMessages(library(MultiAssayExperiment))
example("MultiAssayExperiment", echo = FALSE)
mae
#> A MultiAssayExperiment object of 4 listed
#> experiments with user-defined names and respective classes.
#> Containing an ExperimentList class object of length 4:
#> [1] Affy: SummarizedExperiment with 5 rows and 4 columns
#> [2] Methyl450k: matrix with 5 rows and 5 columns
#> [3] RNASeqGene: matrix with 5 rows and 4 columns
#> [4] GISTIC: RangedSummarizedExperiment with 5 rows and 3 columns
#> Features:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DFrame
#> sampleMap() - the sample availability DFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
Now we would like to re-create the mae object starting from an empty MultiAssayExperiment. We can't do it this way because the colData are empty:
mae2 <- MultiAssayExperiment()
c(mae2, experiments(mae)[1], sampleMap = sampleMap(mae))
#> Error in validObject(object): invalid class "MultiAssayExperiment" object:
#> All samples in the 'sampleMap' must be in the 'colData'
But there is no way to create or rbind
colData in the above example. We also can't start by adding colData or sampleMap because these get harmonized:
colData(mae2) <- colData(mae)
#> harmonizing input:
#> removing 4 colData rownames not in sampleMap 'primary'
sampleMap(mae2) <- sampleMap(mae)
#> harmonizing input:
#> removing 16 sampleMap rows not in names(experiments)
I do think we should be able to assemble an arbitrary MultiAssayExperiment by appending to an empty one (MultiAssayExperiment()
). Two ways to achieve this this could be:
colData
in the colData<-
replace method (potentially, don't harmonize sampleMap
either in the replace method). Or more permissively, only harmonize these during subset methods and allow them to contain colData or maps for non-existent samples otherwise.colData
argument to c,MultiAssayExperiment-method
. These aren't mutually exclusive. From a user perspective, I think 2 only would be somewhat of a pain if you have to provide a correctly subset colData for every c
operation. @LiNk-NY?
As a unit test, we should be able to re-build mae2
above to be the same as mae
.
Session Info:
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R Under development (unstable) (2020-04-03 r78147)
#> os Debian GNU/Linux 10 (buster)
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Etc/UTC
#> date 2020-04-19
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.1.0)
#> backports 1.1.5 2019-10-02 [2] CRAN (R 4.1.0)
#> Biobase * 2.47.3 2020-03-16 [1] Bioconductor
#> BiocGenerics * 0.33.3 2020-03-23 [1] Bioconductor
#> bitops 1.0-6 2013-08-17 [1] CRAN (R 4.1.0)
#> callr 3.4.3 2020-03-28 [2] CRAN (R 4.1.0)
#> cli 2.0.2 2020-02-28 [2] CRAN (R 4.1.0)
#> crayon 1.3.4 2017-09-16 [2] CRAN (R 4.1.0)
#> DelayedArray * 0.13.12 2020-04-10 [1] Bioconductor
#> desc 1.2.0 2018-05-01 [2] CRAN (R 4.1.0)
#> devtools 2.2.2 2020-02-17 [2] CRAN (R 4.1.0)
#> digest 0.6.25 2020-02-23 [2] CRAN (R 4.1.0)
#> ellipsis 0.3.0 2019-09-20 [2] CRAN (R 4.1.0)
#> evaluate 0.14 2019-05-28 [2] CRAN (R 4.1.0)
#> fansi 0.4.1 2020-01-08 [2] CRAN (R 4.1.0)
#> fs 1.4.0 2020-03-31 [2] CRAN (R 4.1.0)
#> GenomeInfoDb * 1.23.17 2020-04-13 [1] Bioconductor
#> GenomeInfoDbData 1.2.3 2020-04-19 [1] Bioconductor
#> GenomicRanges * 1.39.3 2020-03-24 [1] Bioconductor
#> glue 1.4.0 2020-04-03 [2] CRAN (R 4.1.0)
#> highr 0.8 2019-03-20 [2] CRAN (R 4.1.0)
#> htmltools 0.4.0 2019-10-04 [2] CRAN (R 4.1.0)
#> IRanges * 2.21.8 2020-03-25 [1] Bioconductor
#> knitr 1.28 2020-02-06 [2] CRAN (R 4.1.0)
#> lattice 0.20-41 2020-04-02 [3] CRAN (R 4.1.0)
#> magrittr 1.5 2014-11-22 [2] CRAN (R 4.1.0)
#> Matrix 1.2-18 2019-11-27 [3] CRAN (R 4.1.0)
#> matrixStats * 0.56.0 2020-03-13 [1] CRAN (R 4.1.0)
#> memoise 1.1.0 2017-04-21 [2] CRAN (R 4.1.0)
#> MultiAssayExperiment * 1.13.21 2020-04-13 [1] Bioconductor
#> pkgbuild 1.0.6 2019-10-09 [2] CRAN (R 4.1.0)
#> pkgload 1.0.2 2018-10-29 [2] CRAN (R 4.1.0)
#> prettyunits 1.1.1 2020-01-24 [2] CRAN (R 4.1.0)
#> processx 3.4.2 2020-02-09 [2] CRAN (R 4.1.0)
#> ps 1.3.2 2020-02-13 [2] CRAN (R 4.1.0)
#> R6 2.4.1 2019-11-12 [2] CRAN (R 4.1.0)
#> Rcpp 1.0.4 2020-03-17 [2] CRAN (R 4.1.0)
#> RCurl 1.98-1.2 2020-04-18 [1] CRAN (R 4.1.0)
#> remotes 2.1.1 2020-02-15 [2] CRAN (R 4.1.0)
#> rlang 0.4.5 2020-03-01 [2] CRAN (R 4.1.0)
#> rmarkdown 2.1 2020-01-20 [1] CRAN (R 4.1.0)
#> rprojroot 1.3-2 2018-01-03 [2] CRAN (R 4.1.0)
#> S4Vectors * 0.25.15 2020-04-04 [1] Bioconductor
#> sessioninfo 1.1.1 2018-11-05 [2] CRAN (R 4.1.0)
#> stringi 1.4.6 2020-02-17 [2] CRAN (R 4.1.0)
#> stringr 1.4.0 2019-02-10 [2] CRAN (R 4.1.0)
#> SummarizedExperiment * 1.17.5 2020-03-27 [1] Bioconductor
#> testthat 2.3.2 2020-03-02 [2] CRAN (R 4.1.0)
#> usethis 1.5.1 2019-07-04 [2] CRAN (R 4.1.0)
#> withr 2.1.2 2018-03-15 [2] CRAN (R 4.1.0)
#> xfun 0.12 2020-01-13 [2] CRAN (R 4.1.0)
#> XVector 0.27.2 2020-03-24 [1] Bioconductor
#> yaml 2.2.1 2020-02-01 [2] CRAN (R 4.1.0)
#> zlibbioc 1.33.1 2020-01-24 [1] Bioconductor
#>
#> [1] /usr/local/lib/R/host-site-library
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/local/lib/R/library
Created on 2020-04-19 by the reprex package (v0.3.0)
I'm not sure why there is a need to add to an empty MultiAssayExperiment
when the constructor function is pretty robust.
Perhaps the easier route is to rbind
/ c
the all the component pieces.
If you're normalizing, it's fairly straightforward to use the c
operation for an assay that already exists in the object.
suppressPackageStartupMessages(library(MultiAssayExperiment))
example(MultiAssayExperiment)
#> [truncated...]
mae2 <- MultiAssayExperiment()
mae0 <- MultiAssayExperiment(
c(experiments(mae2), experiments(mae)[1]), colData = colData(mae),
sampleMap(mae)
)
#> harmonizing input:
#> removing 12 sampleMap rows not in names(experiments)
Created on 2020-04-19 by the reprex package (v0.3.0)
Using the constructor function in this case seems like a workaround required because it’s not possible to concatenate to colData, making it impossible to add assay data for samples not already in colData. Using the constructor would require developers of derived classes to write their own concatenation functions create new objects, also specifying metadata and any other slots specific to the derived class, instead of being able to use a generic concatenation function. Perhaps @charles-plessy can say why they build on an empty object but it seems like a valid approach, and the MultiAssayExperiment API supports both the creation of empty objects and concatenation - but with the catch that there’s no way to add new colData rows.
Actually there is another way that @charles-plessy could probably use right away, concatenating additional MultiAssayExperiment objects (or derived same-class objects), something like this (this exact example will keep duplicating metadata but you get the point):
library(MultiAssayExperiment)
example("MultiAssayExperiment")
mae0 <- MultiAssayExperiment()
c(c(mae0, mae[, , 1]), mae[, , 2])
Thanks for your answers; with them I think I could fix my package.
When I create new CAGEexp
(wrapping MultiAssayExperiment
) objects, they only contain metadata and colData that point to files from which experimental results can be read. Then, a function builds a SummarizedExperiment
object (called tagCountMatrix
above) after parsing and processing the files. This is why I would like to be able to add a new experiment to a MultiAssayExperiment
that does not contain any.
Other functions of the CAGEr package transform the tagCountMatrix
experiment, so I would like to be able to update it. Lastly, the CAGEexp
objects may contain other experiments, that may be added at later stages of the analysis. This is why I need to be able to add new experiments to an object that contains at least one.
So ideally I would like to be able to do blindly something like the following, just like for a list
:
(the exmaple is not entirely syntactically correct, but I think you see the point)
"tagCountMatrixExp<-" <- function (object, value) {
object[["tagCountMatrix"]] <- value
object
}
At the moment I have the following working solution:
"tagCountMatrixExp<-" <- function (object, value) {
# If the object is empty, create a new one with a tagCountMatrix experiment
if (length(experiments(object)) == 0) {
object <- MultiAssayExperiment( experiments = ExperimentList(tagCountMatrix=value)
, colData = colData(object)
, metadata = metadata(object))
class(object) <- structure("CAGEexp", package = "CAGEr")
# If the object is not empty but does not have a tagCountMatrix experiment, add one
} else if (is.null(object[["tagCountMatrix"]])) {
object <- c(object, tagCountMatrix=value)
# If the object is not empty and has a tagCountMatrix experiment, update it.
} else {
object[["tagCountMatrix"]] <- value
}
object
}
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hello, here is a question rephrased from that I posted on bioc-devel last week.
In the CAGEr package, our main class,
CAGEexp
is a wrapper on theMultiAssayExperiment
class. In a typical workflow, we first create aMultiAssayExperiment
object with no experiment in, and then add the experiments one by one as the analysis proceeds.As the checks in MultiAssayExperiment get tighter and tighter, I am struggling to find the appropriate way to add experiments. Could you help me by documenting how to an experiment to an empty
MultiAssayExperiment
object ?In more details, I have a
MultiAssayExperiment
object with no experiments in, and aSummarizedExperiment
object that I want to add as experiment. I also made sure thatrownames(colData(object)) == colnames(value)
. How can I add thisSummarizedExperiment
to theMultiAssayExperiemnt
? I have triedobject[["myExp"]] <- value
andobject <- c(object, myExp=value)
but both of them fail.[Less importantly, but related: If I want to update an experiment (for instance because I added a new assay in a
SummarizedExperiment
, on which I applied a normalisation method), what is the most appropriate way ?]