LTLA / scuttle

Clone of the Bioconductor repository for the scuttle package.
https://bioconductor.org/packages/devel/bioc/html/scuttle.html
9 stars 7 forks source link

aggregateAcrossCells() no longer works over multiple assays in SCE #17

Open AnnaAMonaco opened 1 year ago

AnnaAMonaco commented 1 year ago

Hi,

I have been using the scuttle package functions quite some time now to analyse my scRNA-seq data (10x, aligned with salmon alevin), and at this point have a pretty streamlined code for what I need to do. In this specific case I am trying to make a new sce from subclustering my original sce, carrying over three specific assays to the new object.

Today all of a sudden my code for AggregateAcrossCells() -- which I have been using over and over again without modifications -- doesn't work anymore:

> sce
class: SingleCellExperiment 
dim: 9984 9466 
metadata(0):
assays(5): counts logcounts scaledata ratio mel_counts
rownames(9984): pHCl-1 CG9706 ... Nup93-1 CG17841
rowData names(0):
colnames(9466): CATCCACTCGTCAACA GGATCTAGTCGCTCGA ... ATCACAGGTTCAGGTT
  TGCTTGCTCAGCACCG
colData names(19): orig.ident nCount_RNA ... cell_type subclusters
reducedDimNames(2): PCA UMAP
mainExpName: RNA
altExpNames(1): ratio

> meta.sce <- aggregateAcrossCells(sce, ids=colData(sce)$subclusters, use.assay.type=c("counts", "logcounts", "mel_counts"))
Error in .use_names_to_integer_indices(use.assay.type, x = x, nameFUN = assayNames,  : 
  'use.assay.type' contains invalid values
In addition: Warning message:
'use.altexps=' is deprecated.
Use 'applySCE(x, aggregateAcrossCells)' instead.

I assumed there has been some update to the package, but now I cannot get the new sce that I need, which will contain all three above assays: counts, logcounts, and mel_counts. I tried looking into applySCE() and running it, but I can only get it to work for "counts".

> applySCE(sce, aggregateAcrossCells, ids=colData(sce)$subclusters)
class: SingleCellExperiment 
dim: 9984 89 
metadata(0):
assays(1): counts
rownames(9984): pHCl-1 CG9706 ... Nup93-1 CG17841
rowData names(0):
colnames(89): Amnioserosa Cardiac Primordium.1 ... Visceral Muscle.7
  Yolk Nuclei
colData names(21): orig.ident nCount_RNA ... ids ncells
reducedDimNames(2): PCA UMAP
mainExpName: RNA
altExpNames(1): ratio

I appended logcounts and mel_counts as alternative experiments, because that seems to be what the function now works with, but it still won't work.

> altExp(sce, "logcounts") <- SingleCellExperiment(assays=list(counts=assay(sce, "logcounts")))
> altExp(sce, "mel_counts") <- SingleCellExperiment(assays=list(counts=assay(sce, "mel_counts")))

> applySCE(sce, aggregateAcrossCells, ids=colData(sce)$subclusters, MAIN.ARGS=c("counts"), ALT.ARGS=c("logcounts", "mel_counts"))
Error in ALT.ARGS[[current]] : subscript out of bounds
In addition: Warning message:
'use.altexps=' is deprecated.
Use 'applySCE(x, aggregateAcrossCells)' instead.

# Alternatively
> applySCE(sce, aggregateAcrossCells, ids=colData(sce)$subclusters, MAIN.ARGS=c("counts"), WHICH=c("logcounts", "mel_counts"))
class: SingleCellExperiment 
dim: 9984 89 
metadata(0):
assays(1): counts
rownames(9984): pHCl-1 CG9706 ... Nup93-1 CG17841
rowData names(0):
colnames(89): Amnioserosa Cardiac Primordium.1 ... Visceral Muscle.7
  Yolk Nuclei
colData names(21): orig.ident nCount_RNA ... ids ncells
reducedDimNames(2): PCA UMAP
mainExpName: RNA
altExpNames(3): ratio logcounts mel_counts
Warning message:
'use.altexps=' is deprecated.
Use 'applySCE(x, aggregateAcrossCells)' instead.

I have currently resorted to just adding the altExps as assays after aggregating, but I would like to know if there is a more straightforward way of doing this.

assay(meta.sce, "logcounts", withDimnames=FALSE) <- altExp(meta.sce, "logcounts")
assay(meta.sce, "mel_counts", withDimnames=FALSE) <- altExp(meta.sce, "mel_counts")
meta.sce
class: SingleCellExperiment 
dim: 9984 89 
metadata(0):
assays(3): counts logcounts mel_counts
rownames(9984): pHCl-1 CG9706 ... Nup93-1 CG17841
rowData names(0):
colnames(89): Amnioserosa Cardiac Primordium.1 ... Visceral Muscle.7
  Yolk Nuclei
colData names(21): orig.ident nCount_RNA ... ids ncells
reducedDimNames(2): PCA UMAP
mainExpName: RNA
altExpNames(3): ratio logcounts mel_counts

Thank you in advance for your reply, and thank you for the super useful package in general :)

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Rocky Linux 8.5 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: /fast/work/users/amonaco_m/miniconda3/envs/R4/lib/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=C                 LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] tibble_3.1.7                dplyr_1.0.9                
 [3] scales_1.2.0                scater_1.24.0              
 [5] ggplot2_3.3.6               scran_1.24.0               
 [7] scuttle_1.6.2               SingleCellExperiment_1.18.0
 [9] SummarizedExperiment_1.26.1 Biobase_2.56.0             
[11] GenomicRanges_1.48.0        GenomeInfoDb_1.32.2        
[13] IRanges_2.30.0              S4Vectors_0.34.0           
[15] BiocGenerics_0.42.0         MatrixGenerics_1.8.1       
[17] matrixStats_0.62.0   
LTLA commented 1 year ago

I can only assume that you haven't run aggregateAcrossCells for a while, because that behavior hasn't changed within the current release. Or for many previous releases, actually.

Anyway, the likely problem is that aggregateAcrossCells is attempting to apply the same aggregation to the alternative Experiment ratio, which I assume only has the counts assay. So:

AnnaAMonaco commented 1 year ago

I just tried running with use.altexps=FALSE as you recommend here, but I still only get "counts" as an assay for the aggregated object

> applySCE(sce, aggregateAcrossCells, ids=colData(sce)$subclusters, use.altexps=FALSE)
class: SingleCellExperiment 
dim: 9984 89 
metadata(0):
assays(1): counts
rownames(9984): pHCl-1 CG9706 ... Nup93-1 CG17841
rowData names(0):
colnames(89): Amnioserosa Cardiac Primordium.1 ... Visceral Muscle.7
  Yolk Nuclei
colData names(21): orig.ident nCount_RNA ... ids ncells
reducedDimNames(2): PCA UMAP
mainExpName: RNA
altExpNames(1): ratio
Warning messages:
1: 'use.altexps=' is deprecated.
Use 'applySCE(x, aggregateAcrossCells)' instead. 
2: 'use.altexps=' is deprecated.
Use 'applySCE(x, aggregateAcrossCells)' instead.

I'm trying to get an aggregation with three of the assays: counts, logcounts, and mel_counts

> sce
class: SingleCellExperiment 
dim: 9984 9466 
metadata(0):
assays(5): counts logcounts scaledata ratio mel_counts
rownames(9984): pHCl-1 CG9706 ... Nup93-1 CG17841
rowData names(0):
colnames(9466): CATCCACTCGTCAACA GGATCTAGTCGCTCGA ... ATCACAGGTTCAGGTT
  TGCTTGCTCAGCACCG
colData names(19): orig.ident nCount_RNA ... cell_type subclusters
reducedDimNames(2): PCA UMAP
mainExpName: RNA
altExpNames(1): ratio
LTLA commented 1 year ago

To be clear, you need to set both use.altexps=FALSE and use.assay.type= to your desired assays. The default behavior of aggregateAcrosCells is to only operate on the counts. I believe this has been the case for the past few releases.