statOmics / msqrob2

Implementation of the MSqRob analysis of differentially expressed proteins using the Features infrastructure

`msqrob()` breaks when analysing a set that contains a subset of cells #59

Open cvanderaa opened 5 months ago

cvanderaa commented 5 months ago

Reproducible example:

library(QFeatures)
library(msqrob2)

se1 <- SummarizedExperiment(assays = matrix(100, 10, 10, dimnames = list(letters[1:10], LETTERS[1:10])))
se2 <- SummarizedExperiment(assays = matrix(50, 10, 5, dimnames = list(letters[1:10], LETTERS[6:10])))
cd <- DataFrame(foo = paste0("bar", rep(1:2, 5)), row.names = LETTERS[1:10])
qf <- QFeatures(experiments = List(set1 = se1, set2 = se2), colData = cd)

msqrob(qf, i = "set1",  formula = ~ foo)
## Works ok

msqrob(qf, i = "set2",  formula = ~ foo)
## Breaks

msqrob(qf[, , "set2"], i = "set2",  formula = ~ foo)
## Works ok

set2 contains a subset of the samples in set1. The function breaks because the colData is not correctly retrieved here: https://github.com/statOmics/msqrob2/blob/7b2399da99e81c40ec74e1b9b5a78b55f2e6c5ab/R/msqrob-methods.R#L216 and https://github.com/statOmics/msqrob2/blob/7b2399da99e81c40ec74e1b9b5a78b55f2e6c5ab/R/msqrob-methods.R#L224

I would suggest first extracting the set of interest (within the msqrob() method) with getWithColData(), which automatically manages the colData.
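
A minimal sketch of what this extraction step could look like, rebuilding the toy object from the reproducible example. It only illustrates how getWithColData() (from MultiAssayExperiment, which QFeatures extends) aligns the colData with the samples of one set; it is not the actual msqrob2 code, and where exactly the call would go inside msqrob() is left open.

```r
library(QFeatures)             # QFeatures() and getWithColData() (via MultiAssayExperiment)
library(SummarizedExperiment)  # SummarizedExperiment(), colData()

# Rebuild the toy object from the reproducible example
se1 <- SummarizedExperiment(assays = matrix(100, 10, 10, dimnames = list(letters[1:10], LETTERS[1:10])))
se2 <- SummarizedExperiment(assays = matrix(50, 10, 5, dimnames = list(letters[1:10], LETTERS[6:10])))
cd <- DataFrame(foo = paste0("bar", rep(1:2, 5)), row.names = LETTERS[1:10])
qf <- QFeatures(experiments = List(set1 = se1, set2 = se2), colData = cd)

# The global colData describes all 10 samples, but set2 holds only 5 of them,
# so using colData(qf) as-is for set2 gives a dimension mismatch
nrow(colData(qf))
ncol(qf[["set2"]])

# getWithColData() returns the requested set with only the matching
# colData rows attached (default mode = "append")
se <- getWithColData(qf, "set2")

# Sample annotations now line up one-to-one with the columns of the set
stopifnot(identical(rownames(colData(se)), colnames(se)))
colData(se)$foo
```

The returned object is a SummarizedExperiment that carries its own colData, so the design matrix for `~ foo` could be built from `colData(se)` without touching the colData of the full QFeatures object.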