Bioconductor / SummarizedExperiment

A container (S4 class) for matrix-like assays
https://bioconductor.org/packages/SummarizedExperiment
33 stars 9 forks source link

colnames-related concerns #35

Closed vjcitn closed 4 years ago

vjcitn commented 4 years ago

# let se denote a SummarizedExperiment instance
# this code shows that
# 1) colnames(assay(se))<- is a no-op and should probably throw an error
# 2) colData(se)<- modifies assay colnames in a way I found surprising.  This
# operation COULD attempt to reorder rows of the DataFrame supplied to conform
# with available assay colnames ... should it?  This is with 1.17.1
# These were identified by Sara Stankiewicz, a new member of our group

library(SummarizedExperiment)
mymat = matrix((1:6)*1.0, nc=3)
dimnames(mymat) = list(c("A", "B"), letters[1:3])
se1 = SummarizedExperiment(mymat)
validObject(se1)
oldAa = assay(se1["A", "a"])
oldAa
oldcn = colnames(assay(se1))
colnames(assay(se1)) <- letters[4:6] # silent, should fail?
any(colnames(assay(se1)) == oldcn)  
library(S4Vectors)
ndf = DataFrame(v1=10:12)
rownames(ndf) = letters[3:1]
ndf
colData(se1) = ndf 
newAa = assay(se1["A", "a"])
newAa
newAa == oldAa # would work if colData rows reordered to conform
hpages commented 4 years ago

Thanks Vince.

About 1):

Interestingly we have a similar situation with mcols() on Vector derivatives:

ir <- IRanges(1:3, 5)
mcols(ir) <- DataFrame(stuff=runif(3))
rownames(mcols(ir)) <- letters[1:3]
rownames(mcols(ir))
# NULL

This should be taken care of in SummarizedExperiment 1.17.2 (see commit 495ab00f).

Note that when you did colnames(assay(se1)) <- letters[4:6] in your example above the colnames of the assay actually got modified. It's just that you didn't see them when you brought the assay back because by default assay() doesn't bring back the real colnames. You need to use withDimnames=FALSE for that:

> assay(se1, withDimnames=FALSE)
  d e f
A 1 3 5
B 2 4 6

About 2):

Having the colData() setter reorder the rows of the supplied DataFrame "might" sound like a nice feature but it breaks the general assumption that the getter will bring back the original DataFrame so I'm not convinced it's a good idea. Anyway if you want to pursue this please submit the proposal in a separate issue.

LTLA commented 4 years ago

The latest change broke some edge-case behavior:

library(SummarizedExperiment)
example(SummarizedExperiment) # create some SE's.

assay(se) <- assay(se) # OK.

empty <- se[,0]
assay(empty) <- assay(empty) # fail
hpages commented 4 years ago

Thanks @LTLA (when following up on a closed issue, you should re-open)

hpages commented 4 years ago

Should be addressed in SummarizedExperiment 1.17.3 (see commit 7ee31d49).

Please re-open or start a new issue if you run into other problems related with commit 495ab00f ("By default assays()/assay() setters reject inconsistent dimnames"). Thanks!

LTLA commented 4 years ago

Thanks @hpages. I don't have the authority to re-open issues on this repo, so yes, I should have started a new one.