kharchenkolab / numbat

Haplotype-aware CNV analysis from single-cell RNA-seq
https://kharchenkolab.github.io/numbat/

memory usage error during run_numbat #171

Open sxf-ux opened 3 months ago

sxf-ux commented 3 months ago

Hi! I'm trying to run run_numbat, but my job keeps getting killed. Here is the log from one of my runs:

Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 20
min_cells = 50
init_k = 3
max_cost = 5642.4
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
multi_allelic = TRUE
min_LLR = 5
min_overlap = 0.45
max_entropy = 0.5
skip_nj = FALSE
diploid_chroms =
ncores = 4
ncores_nni = 4
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
18808 cells
INFO [2024-02-27 16:54:34] Mem used: 9.8Gb
INFO [2024-02-27 16:54:59] Approximating initial clusters using smoothed expression ..
INFO [2024-02-27 16:55:01] Mem used: 9.8Gb
INFO [2024-02-27 20:49:44] running hclust...
INFO [2024-02-27 21:03:19] Iteration 1
INFO [2024-02-27 21:03:24] Mem used: 26.4Gb
INFO [2024-02-27 21:03:45] Running HMMs on 5 cell groups..
INFO [2024-02-27 21:04:10] quadruploid state enabled
INFO [2024-02-27 21:04:10] diploid regions: 3a,6a,7a,11a,12a,12c,12e,14a,16a,17a,19a,19c,20a,21a,22a
INFO [2024-02-27 21:05:42] Expression noise level: medium (0.93).
INFO [2024-02-27 21:07:29] Running HMMs on 3 cell groups..
INFO [2024-02-27 21:07:42] quadruploid state enabled
INFO [2024-02-27 21:07:42] diploid regions: 3a,6a,11a,12a,12c,12e,14a,16a,17a,19a,20a,21a,22a
INFO [2024-02-27 21:09:16] Testing for multi-allelic CNVs ..
INFO [2024-02-27 21:09:16] 0 multi-allelic CNVs found:
INFO [2024-02-27 21:09:16] Evaluating CNV per cell ..
INFO [2024-02-27 21:09:18] Mem used: 13Gb

The job gets killed as soon as it starts evaluating CNVs per cell. I'm running this on a server that should, in theory, have enough memory for this. Do you have any recommendations for working around it? I called run_numbat as follows:

out = run_numbat(
  count_mat, # gene x cell integer UMI count matrix
  ref_hca, # reference expression profile, a gene x cell type normalized expression level matrix
  df_allele, # allele dataframe generated by pileup_and_phase script
  genome = "hg38",
  t = 1e-5,
  ncores = 4,
  skip_nj = FALSE,
  plot = TRUE,
  out_dir = './numbat/test_2'
)

Thank you in advance.
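
To narrow down a kill like this, it may help to check how large the inputs are and whether a smaller run completes under the same memory limit. The following is a minimal sketch in base R, not a numbat feature: `subsample_cells` is a hypothetical helper, the toy `count_mat` and `df_allele` are stand-ins so the snippet runs on its own, and it assumes that `df_allele` has a `cell` column whose values match the column names of `count_mat`.

```r
# Toy stand-ins so the sketch runs on its own; in a real session these would
# be the count_mat and df_allele objects passed to run_numbat.
set.seed(1)
count_mat <- matrix(
  rpois(200 * 50, lambda = 1), nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("cell", 1:50))
)
df_allele <- data.frame(
  cell = sample(colnames(count_mat), 500, replace = TRUE),  # assumed column name
  AD = rpois(500, 1),
  DP = rpois(500, 2) + 1
)

# Report how much memory each input occupies before the run starts.
input_sizes <- sapply(
  list(count_mat = count_mat, df_allele = df_allele),
  function(x) as.numeric(object.size(x))
)
print(round(input_sizes / 1024^2, 3))  # sizes in MiB

# Hypothetical helper (not part of numbat): keep a random subset of cells in
# both inputs so a smaller test run can be tried under the same memory limit.
subsample_cells <- function(count_mat, df_allele, n_cells) {
  keep <- sample(colnames(count_mat), min(n_cells, ncol(count_mat)))
  list(
    count_mat = count_mat[, keep, drop = FALSE],
    df_allele = df_allele[df_allele$cell %in% keep, , drop = FALSE]
  )
}

sub <- subsample_cells(count_mat, df_allele, n_cells = 20)
```

If the subsampled inputs run to completion, that would suggest the failure scales with the number of cells rather than with a malformed input. The subsampled objects would be passed to run_numbat in place of the originals.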