carmonalab / ProjecTILs

Interpretation of cell states using reference single-cell maps
GNU General Public License v3.0
231 stars 27 forks source link

Building custom reference and retain the RNA assay #56

Closed WouterMidden closed 8 months ago

WouterMidden commented 1 year ago

Hi, I'm trying to compare the gene expression between a custom-built reference dataset from a published dataset and my own data, using find.discriminant.genes and plot.states.radar. However, I get an error that the "RNA" assay is not present in my reference. Upon inspection, only the "integrated" assay is present. I tried to alter the make.reference chunk to set the assay = "RNA", but this did not change anything.

My code for make.reference is as follows:

> Assays(seurat_obj)
[1] "RNA"
> ref <- make.reference(ref = seurat_obj, 
                                   assay = "RNA", 
                                   ndim = ndim, 
                                   recalculate.umap = TRUE,
                                   annotation.column = "cell_type"
                                   )
> Assays(ref)
[1] "integrated"

Also, when I follow the vignette (https://carmonalab.github.io/ProjecTILs.demo/build_ref_atlas.html) and use the provided data in the vignette, make.reference only returns the "integrated" assay.

> Assays(mo.mac)
[1] "RNA"
> ref.momac <- make.reference(ref = mo.mac, ndim = ndim, seed = seed, recalculate.umap = TRUE,
     annotation.column = "Minor.subset")
> Assays(ref.momac)
[1] "integrated"

The provided references with load.reference.map(), such as _ref_TILAtlas_mousev1.rds and _DC_human_refv1.rds, contain however both the RNA and integrated assay.

> ref_T <- load.reference.map("ref_TILAtlas_mouse_v1.rds")
> ref_DC <- load.reference.map("DC_human_ref_v1.rds")
> Assays(ref_T)
[1] "RNA"        "integrated"
> Assays(ref_DC)
[1] "RNA"        "integrated"

If necessary, I can provide my code, but as the vignette gives the same results, I'm wondering how one could tweak this.

When I tried to change the name of the assay of the reference (which did not work), it is interesting that it says that the active assay is actually RNA:

> Assays(ref)
[1] "integrated"
> RenameAssays(object = ref, "integrated" = 'RNA')
Renaming default assay from integrated to RNA
Warning: Cannot add objects with duplicate keys (offending key: rna_) setting key to original value 'rnaga_'
An object of class Seurat 
33538 features across 22863 samples within 1 assay 
Active assay: RNA (33538 features, 2000 variable features)
 2 dimensional reductions calculated: pca, umap

Is there a way to build a custom reference dataset that retains the RNA assay, or is there another way to circumvent this issue? Thanks a lot!

> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Matrix_1.5-4           GEOquery_2.66.0        Biobase_2.58.0         BiocGenerics_0.44.0    umap_0.2.10.0         
 [6] ProjecTILs_3.0.3       EnhancedVolcano_1.16.0 ggrepel_0.9.3          UCell_2.2.0            scGate_1.4.1          
[11] data.table_1.14.8      lubridate_1.9.2        forcats_1.0.0          stringr_1.5.0          purrr_1.0.1           
[16] readr_2.1.4            tidyr_1.3.0            tibble_3.2.1           tidyverse_2.0.0        ggplot2_3.4.2         
[21] cowplot_1.1.1          magrittr_2.0.3         patchwork_1.1.2        dplyr_1.1.1            harmony_0.1.1         
[26] Rcpp_1.0.10            SeuratObject_4.1.3     Seurat_4.3.0
mass-a commented 1 year ago

Hello Wouter, thanks for reporting this.

The easiest workaround is to make a copy of the RNA assay before running make.reference:

mo.mac@assays$integrated <- mo.mac@assays$RNA
DefaultAssay(mo.mac) <- "integrated"
ref.momac <- make.reference(ref=mo.mac, ndim=ndim, seed=seed,
                            recalculate.umap = TRUE,
                            annotation.column = "Minor.subset")
Assays(ref.momac)
[1] "RNA"        "integrated"

I agree the current behavior is not ideal, and will correct it in the next update. Thanks for reporting!

Best -m