waldronlab / MultiAssayExperiment

Bioconductor package for management of multi-assay data
https://waldronlab.io/MultiAssayExperiment/

1.27.1 constructor breaks with out-of-order sample mapping columns #326

Closed LTLA closed 1 year ago

LTLA commented 1 year ago

This used to work.

library(SummarizedExperiment)
rna.counts <- matrix(rpois(60, 10), ncol=6)
colnames(rna.counts) <- c("disease1", "disease2", "disease3", "control1", "control2", "control3")
rownames(rna.counts) <- c("ENSMUSG00000000001", "ENSMUSG00000000003", "ENSMUSG00000000028", 
    "ENSMUSG00000000031", "ENSMUSG00000000037", "ENSMUSG00000000049",  "ENSMUSG00000000056", 
    "ENSMUSG00000000058", "ENSMUSG00000000078",  "ENSMUSG00000000085")
rna.se <- SummarizedExperiment(list(counts=rna.counts))
colData(rna.se)$disease <- rep(c("disease", "control"), each=3)

chip.counts <- matrix(rpois(100, 10), ncol=4)
colnames(chip.counts) <- c("disease1", "disease2", "control1", "control3")
chip.peaks <- GRanges("chr1", IRanges(1:25*100+1, 1:25*100+100))
chip.se <- SummarizedExperiment(list(counts=chip.counts), rowRanges=chip.peaks)

library(MultiAssayExperiment)
mapping <- DataFrame(
    primary = c(colnames(rna.se), colnames(chip.se)), # sample identifiers
    assay = rep(c("rnaseq", "chipseq"), c(ncol(rna.se), ncol(chip.se))), # experiment name
    colname = c(colnames(rna.se), colnames(chip.se)) # column names inside each experiment
)
mae <- MultiAssayExperiment(list(rnaseq=rna.se, chipseq=chip.se), sampleMap=mapping)

Now it "harmonizes" all of the data away:

harmonizing input:
  removing 10 sampleMap rows not in names(experiments)
  removing 6 colData rownames not in sampleMap 'primary'
Warning message:
sampleMap[['primary']] coerced with as.factor() 

Instead it requires the sampleMap to have the assay column first:

mae <- MultiAssayExperiment(list(rnaseq=rna.se, chipseq=chip.se), sampleMap=mapping[,c(2,3,1)])

This was not previously required and can be expected to break some percentage of user code in the wild. It certainly broke alabaster.mae's build. If this new requirement is intended, a deprecation cycle that escalates from a sensible warning to an error would make more sense than near-silently blasting away all of the data in the MAE.
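In the meantime, user code can avoid depending on column position by selecting the sampleMap columns by name before calling the constructor. The following is a minimal sketch, assuming the documented `assay`, `primary`, `colname` order; `reorder_sample_map` is a hypothetical helper, not part of MultiAssayExperiment, and it is shown on a plain `data.frame` so it runs without Bioconductor (a `DataFrame` supports the same by-name column indexing).

```r
# Hypothetical helper (not part of MultiAssayExperiment): select the
# sampleMap columns by name so that their original position does not matter.
reorder_sample_map <- function(map) {
    # Assumed target order, taken from the package documentation.
    expected <- c("assay", "primary", "colname")
    missing <- setdiff(expected, colnames(map))
    if (length(missing) > 0) {
        stop("sampleMap is missing column(s): ", paste(missing, collapse = ", "))
    }
    map[, expected, drop = FALSE]
}

# Toy mapping with the columns in the same order as in the report above.
toy <- data.frame(
    primary = c("disease1", "control1"),
    assay = c("rnaseq", "chipseq"),
    colname = c("disease1", "control1")
)
fixed <- reorder_sample_map(toy)
colnames(fixed)
```

With the objects from the report, this would be used as `MultiAssayExperiment(list(rnaseq=rna.se, chipseq=chip.se), sampleMap=reorder_sample_map(mapping))`, which fails loudly if a column is missing instead of relying on position.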

LiNk-NY commented 1 year ago

Thanks! The order of the columns is now fixed in the constructor.