Closed amberbangma closed 3 years ago
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] nl_NL.UTF-8/nl_NL.UTF-8/nl_NL.UTF-8/C/nl_NL.UTF-8/nl_NL.UTF-8
attached base packages:
[1] grid stats4 parallel stats graphics grDevices utils datasets methods
[10] base
other attached packages:
[1] VennDiagram_1.6.20 futile.logger_1.4.3 ggalluvial_0.12.3
[4] circlize_0.4.11 ComplexHeatmap_2.7.1.1005 openxlsx_4.2.3
[7] readxl_1.3.1 MAST_1.16.0 SingleCellExperiment_1.12.0
[10] SummarizedExperiment_1.20.0 GenomicRanges_1.42.0 GenomeInfoDb_1.26.2
[13] IRanges_2.24.1 S4Vectors_0.28.1 MatrixGenerics_1.2.0
[16] matrixStats_0.57.0 tidyr_1.1.2 readr_1.4.0
[19] ggplot2_3.3.2 patchwork_1.1.0 Seurat_3.2.3
[22] dplyr_1.0.2 CellChat_0.0.2 Biobase_2.50.0
[25] BiocGenerics_0.36.0
loaded via a namespace (and not attached):
[1] reticulate_1.18 tidyselect_1.1.0 htmlwidgets_1.5.3 Rtsne_0.15
[5] munsell_0.5.0 codetools_0.2-18 ica_1.0-2 future_1.21.0
[9] miniUI_0.1.1.1 withr_2.3.0 colorspace_2.0-0 rstudioapi_0.13
[13] ROCR_1.0-11 tensor_1.5 listenv_0.8.0 NMF_0.23.0
[17] labeling_0.4.2 GenomeInfoDbData_1.2.4 polyclip_1.10-0 farver_2.0.3
[21] coda_0.19-4 parallelly_1.22.0 vctrs_0.3.5 generics_0.1.0
[25] lambda.r_1.2.4 xfun_0.19 R6_2.5.0 doParallel_1.0.16
[29] clue_0.3-58 rsvd_1.0.3 bitops_1.0-6 spatstat.utils_1.17-0
[33] DelayedArray_0.16.0 assertthat_0.2.1 promises_1.1.1 scales_1.1.1
[37] gtable_0.3.0 Cairo_1.5-12.2 globals_0.14.0 goftest_1.2-2
[41] rlang_0.4.9 systemfonts_0.3.2 GlobalOptions_0.1.2 splines_4.0.3
[45] lazyeval_0.2.2 reshape2_1.4.4 abind_1.4-5 httpuv_1.5.4
[49] tools_4.0.3 gridBase_0.4-7 statnet.common_4.4.1 ellipsis_0.3.1
[53] RColorBrewer_1.1-2 ggridges_0.5.2 Rcpp_1.0.5 plyr_1.8.6
[57] zlibbioc_1.36.0 purrr_0.3.4 RCurl_1.98-1.2 rpart_4.1-15
[61] deldir_0.2-3 pbapply_1.4-3 GetoptLong_1.0.4 cowplot_1.1.0
[65] zoo_1.8-8 ggrepel_0.8.2 cluster_2.1.0 tinytex_0.28
[69] magrittr_2.0.1 data.table_1.13.4 RSpectra_0.16-0 futile.options_1.0.1
[73] sna_2.6 lmtest_0.9-38 RANN_2.6.1 fitdistrplus_1.1-3
[77] hms_0.5.3 mime_0.9 xtable_1.8-4 gridExtra_2.3
[81] shape_1.4.5 compiler_4.0.3 tibble_3.0.4 KernSmooth_2.23-18
[85] crayon_1.3.4 htmltools_0.5.0 mgcv_1.8-33 later_1.1.0.1
[89] formatR_1.7 MASS_7.3-53 Matrix_1.2-18 cli_2.2.0
[93] igraph_1.2.6 pkgconfig_2.0.3 registry_0.5-1 plotly_4.9.2.1
[97] foreach_1.5.1 svglite_1.2.3.2 rngtools_1.5 pkgmaker_0.32.2
[101] XVector_0.30.0 stringr_1.4.0 digest_0.6.27 sctransform_0.3.1
[105] RcppAnnoy_0.0.17 rle_0.9.2 spatstat.data_1.5-2 cellranger_1.1.0
[109] leiden_0.3.6 uwot_0.1.9 gdtools_0.2.2 shiny_1.5.0
[113] rjson_0.2.20 lifecycle_0.2.0 nlme_3.1-151 jsonlite_1.7.2
[117] network_1.16.1 viridisLite_0.3.0 limma_3.46.0 fansi_0.4.1
[121] pillar_1.4.7 lattice_0.20-41 fastmap_1.0.1 httr_1.4.2
[125] survival_3.2-7 glue_1.4.2 zip_2.1.1 FNN_1.1.3
[129] spatstat_1.64-1 png_0.1-7 iterators_1.0.13 stringi_1.5.3
[133] irlba_2.3.3 future.apply_1.6.0
This is an inherent issue with CLR normalization for RNA, rather than a mistake in Seurat's implementation. We do not recommend performing CLR normalization on RNA datasets; one reason is that, as you say, the normalized data is no longer sparse. CLR is much more effective for CITE-seq data, which is typically non-sparse anyway and contains far fewer features.
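To illustrate the sparsity point, here is a minimal sketch on a toy matrix. It assumes a textbook CLR with a pseudocount of 1 (log1p of the counts minus the mean of those values), which may differ from the exact formula Seurat uses; `clr_pseudocount` and the simulated `counts` matrix are hypothetical names made up for this example, and only the Matrix package is required.

```r
library(Matrix)

# Toy count matrix with 5% non-zero entries, loosely mimicking sparse RNA counts.
set.seed(1)
counts <- rsparsematrix(nrow = 200, ncol = 50, density = 0.05,
                        rand.x = function(n) rpois(n, lambda = 3) + 1)

# Assumption: textbook CLR with a pseudocount of 1, not necessarily Seurat's formula.
# A zero count maps to -mean(log1p(x)), which is non-zero whenever x has any counts.
clr_pseudocount <- function(x) {
  log_x <- log1p(x)
  log_x - mean(log_x)
}

# Apply the transform to each column; the result is an ordinary dense matrix.
clr_dense <- apply(as.matrix(counts), 2, clr_pseudocount)

# Fraction of non-zero entries before and after the transform.
frac_nonzero_before <- nnzero(counts) / length(counts)
frac_nonzero_after  <- mean(clr_dense != 0)

# Memory footprint of the sparse input versus the dense output.
size_sparse <- object.size(counts)
size_dense  <- object.size(clr_dense)

print(c(before = frac_nonzero_before, after = frac_nonzero_after))
print(c(sparse_bytes = as.numeric(size_sparse), dense_bytes = as.numeric(size_dense)))
```

Under this assumption almost every entry becomes non-zero after the transform, so storing the result in a sparse format would not save memory.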
Dear all,
I noticed that when I apply the CLR transformation to my RNA counts, the sparse matrix is converted into a regular (dense) matrix, causing my object to triple in size.
This also causes an error when normalizing my larger datasets.
Would it be possible to change this in the Seurat NormalizeData function, so that CLR is more memory-friendly and also works for larger datasets? Or do you have a way to run CLR on large single-cell datasets and still analyze the result with Seurat afterwards?
Thanks! Amber