rezakj / iCellR

Single (i) Cell R package (iCellR) is an interactive R package to work with high-throughput single cell sequencing technologies (i.e scRNA-seq, scVDJ-seq, scATAC-seq, CITE-Seq and Spatial Transcriptomics (ST)).

Error: cannot allocate vector of size 6288.3 Gb #18

jajcobyang closed this issue 4 years ago

jajcobyang commented 4 years ago

Hi there,

I tried to analyze 10x scRNA-seq data with the iCellR workflow, but I could not load the data (barcodes.tsv, features.tsv, matrix.mtx) on an RStudio server. The error is 'Error: cannot allocate vector of size 6288.3 Gb'. I also tried on my Mac, which failed with 'Error: vector memory exhausted (limit reached?)'. I tried increasing the memory limit on my Mac, but that did not help.

I then used the fread function to inspect my data and compare it with the example data. The only differences are the dimensions of the data frame and my features.tsv: the example genes.tsv has only 2 columns, while mine has 3, so I deleted the extra column. Loading still fails with the same error.

  1. Session info of the RStudio server: `R version 3.6.3 (2020-02-29) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 18.04.4 LTS

Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1 LAPACK: /home/fany/.local/share/r-miniconda/envs/r-reticulate/lib/libmkl_rt.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] Matrix_1.2-18 readr_1.3.1 data.table_1.12.8 SeuratData_0.2.1 patchwork_1.0.1
[6] stringr_1.4.0 dplyr_1.0.0 Seurat_3.1.5 iCellR_1.5.1 plotly_4.9.2.1
[11] ggplot2_3.3.2

loaded via a namespace (and not attached): [1] Rtsne_0.15 colorspace_1.4-1 ggsignif_0.6.0 ellipsis_0.3.1 rio_0.5.16
[6] ggridges_0.5.2 htmlTable_2.0.0 base64enc_0.1-3 ggdendro_0.1-20 rstudioapi_0.11
[11] leiden_0.3.3 ggpubr_0.4.0 listenv_0.8.0 ggrepel_0.8.2 bit64_0.9-7
[16] fansi_0.4.1 codetools_0.2-16 splines_3.6.3 knitr_1.29 Formula_1.2-3
[21] jsonlite_1.7.0 ica_1.0-2 broom_0.5.6 cluster_2.1.0 png_0.1-7
[26] pheatmap_1.0.12 uwot_0.1.8 sctransform_0.2.1 shiny_1.5.0 compiler_3.6.3
[31] httr_1.4.1 backports_1.1.8 assertthat_0.2.1 fastmap_1.0.1 lazyeval_0.2.2
[36] cli_2.0.2 later_1.1.0.1 acepack_1.4.1 htmltools_0.5.0 prettyunits_1.1.1
[41] tools_3.6.3 rsvd_1.0.3 igraph_1.2.5 gtable_0.3.0 glue_1.4.1
[46] reshape2_1.4.4 RANN_2.6.1 rappdirs_0.3.1 Rcpp_1.0.4.6 carData_3.0-4
[51] cellranger_1.1.0 vctrs_0.3.1 ape_5.4 nlme_3.1-144 lmtest_0.9-37
[56] xfun_0.15 globals_0.12.5 openxlsx_4.1.5 irlba_2.3.3 mime_0.9
[61] lifecycle_0.2.0 rstatix_0.6.0 future_1.17.0 zoo_1.8-8 MASS_7.3-51.5
[66] scales_1.1.1 hms_0.5.3 promises_1.1.1 parallel_3.6.3 NbClust_3.0
[71] RColorBrewer_1.1-2 curl_4.3 pbapply_1.4-2 reticulate_1.16 gridExtra_2.3
[76] rpart_4.1-15 reshape_0.8.8 latticeExtra_0.6-29 stringi_1.4.6 checkmate_2.0.0
[81] zip_2.0.4 rlang_0.4.6 pkgconfig_2.0.3 lattice_0.20-40 ROCR_1.0-11
[86] purrr_0.3.4 htmlwidgets_1.5.1 cowplot_1.0.0 bit_1.1-15.2 tidyselect_1.1.0
[91] RcppAnnoy_0.0.16 plyr_1.8.6 magrittr_1.5 R6_2.4.1 generics_0.0.2
[96] Hmisc_4.4-0 pillar_1.4.4 haven_2.3.1 foreign_0.8-75 withr_2.2.0
[101] fitdistrplus_1.1-1 survival_3.1-11 scatterplot3d_0.3-41 abind_1.4-5 nnet_7.3-13
[106] tsne_0.1-3 tibble_3.0.1 future.apply_1.5.0 crayon_1.3.4 car_3.0-8
[111] hdf5r_1.3.2 KernSmooth_2.23-16 jpeg_0.1-8.1 progress_1.2.2 grid_3.6.3
[116] readxl_1.3.1 forcats_0.5.0 digest_0.6.25 xtable_1.8-4 tidyr_1.1.0
[121] httpuv_1.5.4 munsell_0.5.0 viridisLite_0.3.0 `

  2. Session info of my Mac: `R version 3.6.3 (2020-02-29) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Catalina 10.15.2

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] iCellR_1.5.4 plotly_4.9.2.1 ggplot2_3.3.2

loaded via a namespace (and not attached): [1] nlme_3.1-148 bit64_0.9-7 RColorBrewer_1.1-2 progress_1.2.2
[5] httr_1.4.1 tools_3.6.3 backports_1.1.8 R6_2.4.1
[9] rpart_4.1-15 Hmisc_4.4-0 uwot_0.1.8 lazyeval_0.2.2
[13] colorspace_1.4-1 nnet_7.3-14 withr_2.2.0 tidyselect_1.1.0
[17] gridExtra_2.3 prettyunits_1.1.1 bit_1.1-15.2 curl_4.3
[21] compiler_3.6.3 htmlTable_2.0.0 hdf5r_1.3.2 ggdendro_0.1-20
[25] scales_1.1.1 checkmate_2.0.0 stringr_1.4.0 digest_0.6.25
[29] foreign_0.8-75 rmarkdown_2.3 rio_0.5.16 base64enc_0.1-3
[33] jpeg_0.1-8.1 pkgconfig_2.0.3 htmltools_0.5.0 fastmap_1.0.1
[37] readxl_1.3.1 htmlwidgets_1.5.1 rlang_0.4.6 rstudioapi_0.11
[41] shiny_1.5.0 generics_0.0.2 jsonlite_1.7.0 acepack_1.4.1
[45] dplyr_1.0.0 zip_2.0.4 car_3.0-8 magrittr_1.5
[49] Formula_1.2-3 NbClust_3.0 Matrix_1.2-18 Rcpp_1.0.4.6
[53] munsell_0.5.0 abind_1.4-5 ape_5.4 lifecycle_0.2.0
[57] scatterplot3d_0.3-41 stringi_1.4.6 yaml_2.2.1 carData_3.0-4
[61] MASS_7.3-51.6 Rtsne_0.15 plyr_1.8.6 grid_3.6.3
[65] parallel_3.6.3 promises_1.1.1 ggrepel_0.8.2 forcats_0.5.0
[69] crayon_1.3.4 lattice_0.20-41 haven_2.3.1 splines_3.6.3
[73] hms_0.5.3 knitr_1.29 pillar_1.4.4 igraph_1.2.5
[77] ggpubr_0.4.0 ggsignif_0.6.0 glue_1.4.1 evaluate_0.14
[81] latticeExtra_0.6-29 data.table_1.12.8 png_0.1-7 vctrs_0.3.1
[85] httpuv_1.5.4 cellranger_1.1.0 gtable_0.3.0 RANN_2.6.1
[89] purrr_0.3.4 tidyr_1.1.0 reshape_0.8.8 xfun_0.15
[93] openxlsx_4.1.5 mime_0.9 xtable_1.8-4 broom_0.5.6
[97] rstatix_0.6.0 later_1.1.0.1 survival_3.2-3 viridisLite_0.3.0
[101] tibble_3.0.1 pheatmap_1.0.12 cluster_2.1.0 ellipsis_0.3.1 `

  3. Screenshots of the error

image

image

I'm really puzzled. Can you kindly help me?

Best.

Fan

rezakj commented 4 years ago

Your barcodes file is very large (about half a Gb). When we looked at your files, you had about 20 million cells, which is too many to work with comfortably. The best approach is to read the matrix with the readMM function from the Matrix R package and filter the data down to, for example, around 100 K cells. Large samples are hard to work with in R anyway, since R is fairly memory intensive.
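
A minimal sketch of that suggestion, under some assumptions: the helper names `dense_gb` and `read_and_filter_10x`, the `min_umis` and `max_cells` thresholds, and the tiny demo dataset are all hypothetical and only for illustration. It also assumes uncompressed files named matrix.mtx, features.tsv and barcodes.tsv, with gene names in the second column of features.tsv. The idea is to keep the counts sparse the whole time and drop low-count barcodes before anything is converted to a dense table.

```r
library(Matrix)

# Rough size (in Gb) of a dense numeric matrix: 8 bytes per value.
# Useful for judging whether a dense conversion can fit in memory at all.
dense_gb <- function(n_genes, n_cells) n_genes * n_cells * 8 / 2^30

# Hypothetical helper: read a 10x-style directory as a sparse matrix and keep
# only barcodes with at least `min_umis` total counts, capped at `max_cells`.
read_and_filter_10x <- function(dir, min_umis = 500, max_cells = 100000) {
  counts <- readMM(file.path(dir, "matrix.mtx"))   # stays sparse, never densified
  counts <- as(counts, "CsparseMatrix")            # column-oriented, fast column subsetting
  features <- read.delim(file.path(dir, "features.tsv"),
                         header = FALSE, stringsAsFactors = FALSE)
  barcodes <- readLines(file.path(dir, "barcodes.tsv"))
  rownames(counts) <- make.unique(features[[2]])   # assumption: names in column 2
  colnames(counts) <- barcodes

  umis <- Matrix::colSums(counts)                  # total counts per barcode
  keep <- which(umis >= min_umis)
  if (length(keep) > max_cells) {
    # Too many barcodes pass the threshold: keep the ones with the most counts.
    keep <- keep[order(umis[keep], decreasing = TRUE)[seq_len(max_cells)]]
  }
  counts[, sort(keep), drop = FALSE]
}

# Tiny synthetic dataset (3 genes x 4 barcodes) so the sketch runs end to end.
demo_dir <- file.path(tempdir(), "demo_10x")
dir.create(demo_dir, showWarnings = FALSE)
m <- sparseMatrix(i = c(1, 2, 3, 1), j = c(1, 1, 2, 3),
                  x = c(600, 10, 5, 700), dims = c(3, 4))
writeMM(m, file.path(demo_dir, "matrix.mtx"))
write.table(data.frame(id = paste0("ENSG", 1:3), name = c("A", "B", "C")),
            file.path(demo_dir, "features.tsv"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
writeLines(paste0("BC", 1:4), file.path(demo_dir, "barcodes.tsv"))

filtered <- read_and_filter_10x(demo_dir, min_umis = 500)
dim(filtered)   # genes x remaining barcodes
```

On real data, `dense_gb(nrow(filtered), ncol(filtered))` gives a quick estimate of how much memory a dense conversion (for example with `as.matrix()`) would need, so it is worth checking before handing the filtered counts to any function that expects a dense table. The thresholds are placeholders and would need tuning for the dataset.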