satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

SCTransform (v2 regularization) and integration, AverageExpression #7501

Closed okamo9chi closed 1 year ago

okamo9chi commented 1 year ago

Hello,

I have a few questions regarding the subject mentioned. I would appreciate it if you could kindly address them. When running the code as shown below, I encounter the following warning message. Is it safe to ignore this warning? The useNames argument is not present in the code or function mentioned.

Apply sctransform normalization

OKBT007 <- SCTransform(OKBT007, vst.flavor = "v2", verbose = FALSE)
OKBT019 <- SCTransform(OKBT019, vst.flavor = "v2", verbose = FALSE)

Warning:

useNames = NA is deprecated. Instead, specify either useNames = TRUE or useNames = TRUE.

Furthermore, in the latest version (V2), is the parameter vars.to.regress = "percent.mt" no longer necessary?

On the following website (https://github.com/satijalab/seurat/issues/2413), it states that for the AverageExpression, we should use SCT normalized data instead of the integrated assay. Does this refer to the data slot of the SCT assay, rather than scale.data?

Thank you in advance for your assistance !

saketkc commented 1 year ago

The warnings are ignorable in this case. vars.to.regress runs a second round of regression on the Pearson residuals (scale.data slot of SCT). You can pass any covariates to this argument (this has not changed between V1 and V2). If you want AverageExpression, you should use the data slot of the SCT assay, which stores the corrected counts.

okamo9chi commented 1 year ago

I would like to express my gratitude for your invaluable guidance thus far. However, I have a few additional questions.

Could you kindly tell me whether there are any scenarios where the scale.data slot of the SCT assay should be used? Basically, should I just use the data slot?

Thank you very much for your time and assistance. I look forward to hearing from you.

saketkc commented 1 year ago

scale.data stores the Pearson residuals - these are good for downstream clustering/PCA/UMAP/integration/transfer.

data stores the corrected counts - these should be used for visualization/DE. Hope this helps!
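
A minimal sketch of the advice above: the helper below averages expression per identity class from the data slot of the SCT assay. It assumes `obj` is a Seurat object on which SCTransform() has already been run, and the helper name `average_sct` is hypothetical. Seurat v5 renamed the `slot` argument of AverageExpression() to `layer`, so the sketch branches on the installed version. Note that, per the SCTransform() help page, `data` holds the log1p-transformed corrected counts, while the untransformed corrected counts are in `counts`.

```r
# Hypothetical helper: average SCT-normalized expression per group.
# Assumes SCTransform() has already been run, so an "SCT" assay exists.
average_sct <- function(obj, group.by = "ident") {
  if (utils::packageVersion("Seurat") >= "5.0.0") {
    # Seurat v5 refers to slots as "layers"
    Seurat::AverageExpression(obj, assays = "SCT", layer = "data", group.by = group.by)
  } else {
    Seurat::AverageExpression(obj, assays = "SCT", slot = "data", group.by = group.by)
  }
}

# Usage (not run here; `pbmc` would be your own object):
# avg <- average_sct(pbmc)
# head(avg$SCT)
```

The Pearson residuals in scale.data are left untouched here; they remain the input for PCA, clustering and integration.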

okamo9chi commented 1 year ago

Thank you very much !

Sorry for one last question: is it recommended to run with vars.to.regress = "percent.mt"? (The vignette "Introduction to SCTransform, v2 regularization" [https://satijalab.org/seurat/articles/sctransform_v2_vignette.html] doesn't include this argument.)

Or would it be less necessary if I had preprocessed the data as follows?

ddOKBT007 <- subset(ddOKBT007, subset = nFeature_RNA > 200 & nFeature_RNA < 5000 & percent.mt < 20)
ddOKBT019 <- subset(ddOKBT019, subset = nFeature_RNA > 200 & nFeature_RNA < 6000 & percent.mt < 20)

Thank you for your help and your time, in advance.

saketkc commented 1 year ago

Including the vars.to.regress argument is very context dependent. In the vignette, including it would make no practical difference. It depends on your dataset. The same argument holds for subsetting: the numbers are very context dependent.
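
For readers who do want the second regression round, here is a sketch of how it is typically requested. It assumes a human dataset in which mitochondrial genes are prefixed "MT-" (the same pattern used in the multiome code later in this thread). The wrapper name `run_sct` and its `regress_mt` flag are hypothetical. Whether to enable it remains dataset dependent.

```r
# Hypothetical wrapper: SCTransform (v2) with optional regression of percent.mt.
run_sct <- function(obj, regress_mt = FALSE) {
  # percent.mt must exist in the metadata before it can be regressed out
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  vars <- if (regress_mt) "percent.mt" else NULL  # NULL means no extra regression
  Seurat::SCTransform(obj, vst.flavor = "v2", vars.to.regress = vars, verbose = FALSE)
}

# Usage (not run here; `OKBT007` would be your own object):
# OKBT007 <- run_sct(OKBT007, regress_mt = TRUE)
```

Running both variants and comparing the downstream clustering is one way to judge whether the extra regression matters for a given dataset.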

okamo9chi commented 1 year ago

Thank you for all your kindness ! We look forward to working with you in the future.

anemartinezlarrinaga2898 commented 11 months ago

Hello!

I'm getting the same message, but instead of a warning it is an error:

Error: useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE.

Could anyone help?

Thanks!

JohnScanlan commented 10 months ago

I am also getting the same error as @anemartinezlarrinaga2898. Using the same code and package versions of SCTransform and Seurat it was working a week ago.

AndrewHardigan commented 10 months ago

I am also getting this as an error message (not a warning), which prevents further analyses of the RNA/SCT (using the WNN analysis of 10x Multiome, RNA + ATAC Vignette as a guide).

Project: 10X Multiome dataset 11,775 nuclei/barcodes after filtering using WNN as a guide.

My code was:

multiome <- SCTransform(multiome, verbose = FALSE) %>% RunPCA() %>% RunUMAP(dims = 1:50, reduction.name = 'umap.rna', reduction.key = 'rnaUMAP_')

Which returned:

Error: useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE.

To narrow down which command prompted this error, I started again and it occurs during the SCTransform call:

DefaultAssay(multiome) <- "RNA"
multiome <- SCTransform(multiome, verbose = FALSE)
Error: useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE.

I first ran the 10k PBMC Multiome WNN Vignette code (copied below) yesterday without issue, however strangely I now also get the same error when I run the same code:


setwd("10X_Test_Data_Seurat_Vignette/")
rm(list=ls())
library(Seurat)
library(Signac)
library(EnsDb.Hsapiens.v86)
library(dplyr)
library(ggplot2)

# the 10x hdf5 file contains both data types. 
inputdata.10x <- Read10X_h5("pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5")

# extract RNA and ATAC data
rna_counts <- inputdata.10x$`Gene Expression`
atac_counts <- inputdata.10x$Peaks

# Create Seurat object
pbmc <- CreateSeuratObject(counts = rna_counts)
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")

# Now add in the ATAC-seq data
# we'll only use peaks in standard chromosomes
length(rownames(atac_counts))
grange.counts <- StringToGRanges(rownames(atac_counts), sep = c(":", "-"))
grange.use <- seqnames(grange.counts) %in% standardChromosomes(grange.counts)
atac_counts <- atac_counts[as.vector(grange.use), ]
annotations <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86)
warnings()
seqlevelsStyle(annotations) <- 'UCSC'
genome(annotations) <- "hg38"

frag.file <- "pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz"
chrom_assay <- CreateChromatinAssay(
  counts = atac_counts,
  sep = c(":", "-"),
  genome = 'hg38',
  fragments = frag.file,
  min.cells = 10,
  annotation = annotations
)
pbmc[["ATAC"]] <- chrom_assay

#We perform basic QC based on the number of detected molecules for each modality as well as mitochondrial percentage.

VlnPlot(pbmc, features = c("nCount_ATAC", "nCount_RNA", "percent.mt"), ncol = 3,
        log = TRUE, pt.size = 0) + NoLegend()

#Vlnplots here

pbmc <- subset(
  x = pbmc,
  subset = nCount_ATAC < 7e4 &
    nCount_ATAC > 5e3 &
    nCount_RNA < 25000 &
    nCount_RNA > 1000 &
    percent.mt < 20
)

#We next perform pre-processing and dimensional reduction on both assays independently, using standard approaches for RNA and ATAC-seq data.

# RNA analysis
DefaultAssay(pbmc) <- "RNA"
pbmc <- SCTransform(pbmc, verbose = FALSE) %>% RunPCA() %>% RunUMAP(dims = 1:50, reduction.name = 'umap.rna', reduction.key = 'rnaUMAP_')

Error: useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE.

Attempting to specify useNames = TRUE or useNames = FALSE per the error message also does not work, and both attempts yield the same error:


`> pbmc <- SCTransform(pbmc, verbose = FALSE, useNames=TRUE)
Error in vst(useNames = TRUE, vst.flavor = "v2", umi = new("dgCMatrix",  : 
  unused argument (useNames = TRUE)
> pbmc <- SCTransform(pbmc, verbose = FALSE, useNames=FALSE)
Error in vst(useNames = FALSE, vst.flavor = "v2", umi = new("dgCMatrix",  : 
  unused argument (useNames = FALSE)`
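
The "unused argument" error is consistent with SCTransform() forwarding extra arguments to vst(), which has no useNames parameter (the error text above shows the argument arriving in the vst() call). The toy functions below are stand-ins, not Seurat or sctransform code. They only illustrate how forwarding through `...` produces this message in R.

```r
# Stand-in for vst(): it has no useNames argument
inner_fn <- function(x, flavor = "v2") x * 2

# Stand-in for SCTransform(): forwards anything it does not recognise
outer_fn <- function(x, ...) inner_fn(x, ...)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  outer_fn(1, useNames = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The useNames value appears to be chosen inside the packages that call matrixStats, so it cannot be set from the SCTransform() call itself.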

I do not recall updating or changing any packages since running this code yesterday that would explain this change.

Thank you for your help with this issue!


`> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggpubr_0.6.0              ggplot2_3.4.4             EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.22.0         
 [5] AnnotationFilter_1.22.0   GenomicFeatures_1.50.3    AnnotationDbi_1.60.2      Biobase_2.58.0           
 [9] GenomicRanges_1.50.2      GenomeInfoDb_1.34.9       IRanges_2.32.0            S4Vectors_0.36.2         
[13] BiocGenerics_0.44.0       Signac_1.12.0             dplyr_1.1.4               cowplot_1.1.1            
[17] bmcite.SeuratData_0.3.0   SeuratData_0.2.2.9001     Seurat_5.0.1              SeuratObject_5.0.1       
[21] sp_2.1-2                 

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3              rtracklayer_1.58.0          scattermore_1.2             R.methodsS3_1.8.2          
  [5] tidyr_1.3.0                 bit64_4.0.5                 knitr_1.45                  irlba_2.3.5.1              
  [9] DelayedArray_0.24.0         R.utils_2.12.2              data.table_1.14.10          rpart_4.1.19               
 [13] KEGGREST_1.38.0             RCurl_1.98-1.13             generics_0.1.3              callr_3.7.3                
 [17] usethis_2.1.6               RSQLite_2.3.4               RANN_2.6.1                  future_1.33.0              
 [21] bit_4.0.5                   spatstat.data_3.0-3         xml2_1.3.3                  httpuv_1.6.13              
 [25] SummarizedExperiment_1.28.0 assertthat_0.2.1            xfun_0.41                   hms_1.1.3                  
 [29] evaluate_0.23               promises_1.2.1              fansi_1.0.6                 restfulr_0.0.15            
 [33] progress_1.2.2              dbplyr_2.3.0                igraph_1.6.0                DBI_1.1.3                  
 [37] htmlwidgets_1.6.4           spatstat.geom_3.2-7         purrr_1.0.2                 ellipsis_0.3.2             
 [41] RSpectra_0.16-1             backports_1.4.1             sparseMatrixStats_1.10.0    biomaRt_2.54.1             
 [45] deldir_2.0-2                MatrixGenerics_1.10.0       vctrs_0.6.5                 remotes_2.4.2              
 [49] ROCR_1.0-11                 abind_1.4-5                 cachem_1.0.8                withr_2.5.2                
 [53] BSgenome_1.66.3             progressr_0.14.0            presto_1.0.0                checkmate_2.3.1            
 [57] sctransform_0.4.1           GenomicAlignments_1.34.1    prettyunits_1.2.0           goftest_1.2-3              
 [61] cluster_2.1.4               dotCall64_1.1-1             lazyeval_0.2.2              crayon_1.5.2               
 [65] hdf5r_1.3.8                 spatstat.explore_3.2-5      pkgconfig_2.0.3             vipor_0.4.5                
 [69] nlme_3.1-161                pkgload_1.3.2               ProtGenerics_1.30.0         nnet_7.3-18                
 [73] devtools_2.4.5              rlang_1.1.2                 globals_0.16.2              lifecycle_1.0.4            
 [77] miniUI_0.1.1.1              filelock_1.0.2              fastDummies_1.7.3           BiocFileCache_2.6.0        
 [81] dichromat_2.0-0.1           ggrastr_1.0.2               polyclip_1.10-6             RcppHNSW_0.5.0             
 [85] matrixStats_1.2.0           lmtest_0.9-40               Matrix_1.6-4                carData_3.0-5              
 [89] zoo_1.8-11                  beeswarm_0.4.0              base64enc_0.1-3             ggridges_0.5.4             
 [93] processx_3.8.0              png_0.1-8                   viridisLite_0.4.2           rjson_0.2.21               
 [97] bitops_1.0-7                R.oo_1.25.0                 KernSmooth_2.23-20          spam_2.10-0                
[101] Biostrings_2.66.0           DelayedMatrixStats_1.20.0   blob_1.2.4                  stringr_1.5.1              
[105] parallelly_1.36.0           spatstat.random_3.2-2       rstatix_0.7.2               ggsignif_0.6.4             
[109] scales_1.3.0                memoise_2.0.1               magrittr_2.0.3              plyr_1.8.9                 
[113] ica_1.0-3                   zlibbioc_1.44.0             compiler_4.2.2              BiocIO_1.8.0               
[117] RColorBrewer_1.1-3          fitdistrplus_1.1-11         Rsamtools_2.14.0            cli_3.6.2                  
[121] XVector_0.38.0              urlchecker_1.0.1            listenv_0.9.0               patchwork_1.1.3            
[125] pbapply_1.7-2               ps_1.7.2                    htmlTable_2.4.2             Formula_1.2-5              
[129] MASS_7.3-58.1               tidyselect_1.2.0            stringi_1.8.3               glmGamPoi_1.10.2           
[133] yaml_2.3.8                  ggrepel_0.9.4               grid_4.2.2                  VariantAnnotation_1.44.1   
[137] fastmatch_1.1-4             tools_4.2.2                 future.apply_1.11.0         parallel_4.2.2             
[141] rstudioapi_0.14             foreign_0.8-84              gridExtra_2.3               farver_2.1.1               
[145] Rtsne_0.17                  digest_0.6.33               BiocManager_1.30.19         shiny_1.8.0                
[149] Rcpp_1.0.11                 car_3.1-2                   broom_1.0.5                 later_1.3.2                
[153] RcppAnnoy_0.0.21            httr_1.4.7                  biovizBase_1.46.0           colorspace_2.1-0           
[157] XML_3.99-0.16               fs_1.6.3                    tensor_1.5                  reticulate_1.34.0          
[161] splines_4.2.2               uwot_0.1.16                 RcppRoll_0.3.0              spatstat.utils_3.0-4       
[165] plotly_4.10.3               sessioninfo_1.2.2           xtable_1.8-4                jsonlite_1.8.8             
[169] R6_2.5.1                    profvis_0.3.7               Hmisc_5.1-1                 pillar_1.9.0               
[173] htmltools_0.5.7             mime_0.12                   glue_1.6.2                  fastmap_1.1.1              
[177] BiocParallel_1.32.6         codetools_0.2-18            pkgbuild_1.4.0              utf8_1.2.4                 
[181] lattice_0.20-45             spatstat.sparse_3.0-3       tibble_3.2.1                ggbeeswarm_0.7.2           
[185] curl_5.2.0                  leiden_0.4.3.1              survival_3.5-0              rmarkdown_2.25             
[189] munsell_0.5.0               GenomeInfoDbData_1.2.9      reshape2_1.4.4              gtable_0.3.4    `

saketkc commented 10 months ago

The error "useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE." seems to come from a change in matrixStats 1.2, which was released a couple of days ago. Can you downgrade to 1.1 and try?

remotes::install_version("matrixStats", version="1.1.0") # restart your session and run previous scripts

AndrewHardigan commented 10 months ago

Hi Saket,

I can confirm that removing matrixStats and following your above fix to install matrixStats version 1.1.0 has fixed this issue.

Thank you very much for your help! (and especially for being so quick with the fix!)

Best regards

-Drew

saketkc commented 10 months ago

On digging deeper, this seems to be coming from glmGamPoi (note that you are using 1.10, while the latest is 1.14). My recommendation would be to update to the latest Bioconductor (say yes to all package updates):

BiocManager::install(version = "3.18")

which should fix everything (even if you have matrixStats 1.2)
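
Before downgrading or upgrading, it can help to check which versions are actually installed. The sketch below is an illustration only: the helper names are hypothetical, and the thresholds (matrixStats 1.2.0, glmGamPoi 1.14) are simply the versions named in this thread, not values taken from either package's changelog. Also note that Bioconductor 3.18 is, as far as I know, paired with R 4.3, so a session on R 4.2.x may need an R upgrade first.

```r
# Hypothetical helpers; thresholds are the versions named in this thread.
installed_version <- function(pkg) {
  # system.file() returns "" when the package is not installed
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

likely_affected <- function(matrixstats_version, glmgampoi_version) {
  # Cannot decide if either version is unknown
  if (is.na(matrixstats_version) || is.na(glmgampoi_version)) return(NA)
  numeric_version(matrixstats_version) >= "1.2.0" &&
    numeric_version(glmgampoi_version) < "1.14.0"
}

ms <- installed_version("matrixStats")
gg <- installed_version("glmGamPoi")
cat("matrixStats:", ms, " glmGamPoi:", gg, "\n")
cat("Matches the combination reported in this thread:", likely_affected(ms, gg), "\n")
```

With the versions from the sessionInfo() posted above (matrixStats 1.2.0, glmGamPoi 1.10.2), `likely_affected()` returns TRUE.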

nitinmahajan20 commented 9 months ago

I am getting a similar error. I have tried everything mentioned above, but I still get:

Error: useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE.

priyanka8590 commented 9 months ago

Downgrading the matrixStats package to 1.1.0 worked for me. Thank you so much!

hezuoxi commented 8 months ago

Downgrading matrixStats also works for me. Thank you so much!

bioinfotec commented 8 months ago

Error: useNames = NA is defunct. Instead, specify either useNames = TRUE or useNames = FALSE. seems to be coming from a change in matrixStats 1.2 that released a couple of days ago. Can you downgrade to 1.1 and try?

remotes::install_version("matrixStats", version="1.1.0") # restart your session and run previous scripts

This is helpful, thank you!

Victor-K27 commented 7 months ago

Just a question, but downgrading matrixStats converts the message from an error to a warning for me. Is that the same for you?

betaimmunologist commented 7 months ago

@Victor-K27 Same for me... Have you figured out the reason?
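
Judging only from the messages quoted in this thread, matrixStats 1.1.0 still treats useNames = NA as deprecated (a warning), while 1.2.0 treats it as defunct (an error). That would explain why downgrading turns the error back into a warning. If the warning is judged safe to ignore, as stated earlier in the thread, one option is to silence just that message. The wrapper below is a hypothetical base-R sketch, not part of Seurat or matrixStats.

```r
# Hypothetical wrapper: evaluate `expr`, hiding only the useNames deprecation warning.
muffle_usenames_warning <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      # Silence the specific deprecation message; let all other warnings through
      if (grepl("useNames = NA is deprecated", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Usage (not run here; `pbmc` would be your own object):
# pbmc <- muffle_usenames_warning(Seurat::SCTransform(pbmc, verbose = FALSE))
```

Matching on the message text is brittle, so this is best treated as a stopgap until the packages involved are updated.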