velocyto-team / velocyto.R

RNA velocity estimation in R
http://velocyto.org

RunVelocity() requested size is too large; suggest to enable ARMA_64BIT_WORD #188

Open Totoro-chen opened 2 years ago

Totoro-chen commented 2 years ago

Hello,

I'm having an issue with RunVelocity failing on a 2048GB highmem machine (I originally tried on 256GB, with the same error). This seems similar to https://github.com/satijalab/seurat-wrappers/issues/21 and #116. Any advice would be much appreciated! Thanks!

> gc()
          used  (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells 3956621 211.4    7592898  405.6   6302854  336.7
Vcells 7254915  55.4  394153882 3007.2 637698475 4865.3
> velo <- RunVelocity(object = velo, deltaT = 1, kCells = 25, fit.quantile = 0.02, spliced.average = 0.2, unspliced.average = 0.05)
Filtering genes in the spliced matrix
Filtering genes in the unspliced matrix
Calculating embedding distance matrix
Error in arma_mat_cor(mat) : 
  Mat::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD
> dim(velo)
[1] 22150 90119
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /public/software/R/lib64/R/lib/libRblas.so
LAPACK: /public/software/R/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] loomR_0.2.1.9000      hdf5r_1.3.5           R6_2.5.1              velocyto.R_0.6        Matrix_1.4-1         
 [6] forcats_0.5.1         stringr_1.4.0         dplyr_1.0.9           purrr_0.3.4           readr_2.1.2          
[11] tidyr_1.2.0           tibble_3.1.7          ggplot2_3.3.6         tidyverse_1.3.1       SeuratWrappers_0.3.0 
[16] sp_1.5-0              SeuratObject_4.1.0    Seurat_4.1.1          SeuratDisk_0.0.0.9020

loaded via a namespace (and not attached):
  [1] readxl_1.4.0          backports_1.4.1       plyr_1.8.7            igraph_1.3.2          lazyeval_0.2.2       
  [6] splines_4.1.3         listenv_0.8.0         scattermore_0.8       digest_0.6.29         htmltools_0.5.2      
 [11] fansi_1.0.3           magrittr_2.0.3        tensor_1.5            cluster_2.1.3         ROCR_1.0-11          
 [16] tzdb_0.3.0            remotes_2.4.2         globals_0.15.1        modelr_0.1.8          matrixStats_0.62.0   
 [21] R.utils_2.12.0        spatstat.sparse_2.1-1 colorspace_2.0-3      rvest_1.0.2           ggrepel_0.9.1        
 [26] haven_2.5.0           crayon_1.5.1          jsonlite_1.8.0        progressr_0.10.1      spatstat.data_2.2-0  
 [31] survival_3.3-1        zoo_1.8-10            glue_1.6.2            polyclip_1.10-0       gtable_0.3.0         
 [36] leiden_0.4.2          future.apply_1.9.0    BiocGenerics_0.40.0   abind_1.4-5           scales_1.2.0         
 [41] DBI_1.1.3             spatstat.random_2.2-0 miniUI_0.1.1.1        Rcpp_1.0.8.3          viridisLite_0.4.0    
 [46] xtable_1.8-4          reticulate_1.25       spatstat.core_2.4-4   bit_4.0.4             rsvd_1.0.5           
 [51] htmlwidgets_1.5.4     httr_1.4.3            RColorBrewer_1.1-3    ellipsis_0.3.2        ica_1.0-2            
 [56] farver_2.1.0          pkgconfig_2.0.3       R.methodsS3_1.8.2     uwot_0.1.11           dbplyr_2.2.1         
 [61] deldir_1.0-6          utf8_1.2.2            tidyselect_1.1.2      rlang_1.0.3           reshape2_1.4.4       
 [66] later_1.3.0           munsell_0.5.0         cellranger_1.1.0      tools_4.1.3           cli_3.3.0            
 [71] generics_0.1.3        broom_1.0.0           ggridges_0.5.3        fastmap_1.1.0         goftest_1.2-3        
 [76] bit64_4.0.5           fs_1.5.2              fitdistrplus_1.1-8    RANN_2.6.1            pbapply_1.5-0        
 [81] future_1.26.1         nlme_3.1-158          mime_0.12             R.oo_1.25.0           xml2_1.3.3           
 [86] compiler_4.1.3        rstudioapi_0.13       plotly_4.10.0         png_0.1-7             spatstat.utils_2.3-1 
 [91] reprex_2.0.1          stringi_1.7.6         rgeos_0.5-9           lattice_0.20-45       vctrs_0.4.1          
 [96] pillar_1.7.0          lifecycle_1.0.1       BiocManager_1.30.18   spatstat.geom_2.4-0   lmtest_0.9-40        
[101] RcppAnnoy_0.0.19      data.table_1.14.2     cowplot_1.1.1         irlba_2.3.5           httpuv_1.6.5         
[106] patchwork_1.1.1       pcaMethods_1.86.0     promises_1.2.0.1      KernSmooth_2.23-20    gridExtra_2.3        
[111] parallelly_1.32.0     codetools_0.2-18      MASS_7.3-57           assertthat_0.2.1      withr_2.5.0          
[116] sctransform_0.3.3     mgcv_1.8-40           parallel_4.1.3        hms_1.1.1             grid_4.1.3           
[121] rpart_4.1.16          Rtsne_0.16            Biobase_2.54.0        shiny_1.7.1           lubridate_1.8.0  

In addition, I checked the source code of the function and found where the error occurs (the snippet follows). Is it related to multi-threading in the armaCor function? The number of threads does not appear to be specified in the armaCor call. Any advice would be useful! Thanks!

  if (verbose) {
    message("Calculating embedding distance matrix")
  }
  cell.dist <- as.dist(
    m = 1 - velocyto.R::armaCor(
      mat = t(x = Embeddings(object = object, reduction = reduction))
    )
  )
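
For reference, here is a rough size check for the matrix that the armaCor call would produce. This is only a sketch under two assumptions that I have not verified in the package source: that armaCor returns a dense cells x cells correlation matrix (its input is the transposed embedding, so each column is a cell), and that the compiled code uses Armadillo's default 32-bit index type, as the suggestion to enable ARMA_64BIT_WORD in the error message implies. The variable names are just for illustration.

```r
# Back-of-the-envelope check; assumes armaCor() returns a dense
# cells x cells matrix and that Armadillo uses 32-bit indices
# (i.e. ARMA_64BIT_WORD is not defined in the compiled package).
n_cells <- 90119                       # second value of dim(velo) above
n_elements <- n_cells^2                # entries in a cells x cells matrix
limit_32bit <- 2^32 - 1                # largest count a 32-bit unsigned index can hold

exceeds_limit <- n_elements > limit_32bit   # does the request exceed the index limit?
size_gib <- n_elements * 8 / 1024^3         # dense size in GiB at 8 bytes per double
max_cells <- floor(sqrt(limit_32bit))       # largest square matrix under the limit

print(exceeds_limit)
print(size_gib)
print(max_cells)
```

If those assumptions hold, the request exceeds the element-count limit regardless of available RAM, which would explain why moving from 256GB to 2048GB made no difference, and it would point to an index-size limit rather than a threading problem.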