kharchenkolab / numbat

Haplotype-aware CNV analysis from single-cell RNA-seq
https://kharchenkolab.github.io/numbat/

Error Running HMMs running Numbat #121

Open lopezCascales opened 1 year ago

lopezCascales commented 1 year ago

I have scRNA-seq samples from the SORT-seq protocol (well-plate based). I get this error when I call run_numbat: my script only gets as far as the hierarchical clustering and then stops. I have tried different parameters, but I can't find the solution. I have Smart-seq sequencing data. I'm trying to study the phylogeny of my clones. I don't know what I'm doing wrong.

head(mat2)
6 x 362 sparse Matrix of class "dgCMatrix"
[[ suppressing 362 column names ‘s143.CB_AAAGCGGA.bam’, ‘s143.CB_AAAGGCTG.bam’, ‘s143.CB_AACATGGG.bam’ ... ]]

A1BG    .        .        .        . .        .        .        .        .
A2M-AS1 .        .        .        . .        .        .        .        .
A2ML1   1.000000 1.000000 1.000000 . 3.000733 .        1.000000 2.000244 1
A4GALT  1.000000 2.000244 .        . 5.002443 1.000000 8.006845 2.000244 .
AAAS    7.005133 1.000000 2.000244 . 1.000000 1.000000 1.000000 2.000244 .
AACS    2.000244 .        .        1 1.000000 2.000244 2.000244 3.000733 1

head(ref2)
                 BP0          TP1          TP2
A1BG    3.892958e-07 1.105179e-06 1.818631e-06
A2M-AS1 0.000000e+00 3.683929e-07 0.000000e+00
A2ML1   3.128015e-05 4.092058e-05 4.330387e-05
A4GALT  3.374699e-05 2.948070e-05 3.638907e-05
AAAS    5.296247e-05 4.902052e-05 2.655832e-05
AACS    3.257698e-05 2.247669e-05 3.019870e-05

head(df_allele)
                  cell       snp_id DP AD CHROM.x  POS.x REF.x ALT.x GT.x cM.x
1 s143.CB_AAAGGCTG.bam 1_629218_A_G  5  0       1 629218     A     G  1|0    0
2 s143.CB_AACTCTGG.bam 1_629218_A_G  2  0       1 629218     A     G  1|0    0
3 s143.CB_AAGCACAT.bam 1_629218_A_G  2  0       1 629218     A     G  1|0    0
4 s143.CB_AAGTGGCT.bam 1_629218_A_G  1  0       1 629218     A     G  1|0    0
5 s143.CB_AATCATGC.bam 1_629218_A_G  1  0       1 629218     A     G  1|0    0
6 s143.CB_ACACCGTG.bam 1_629218_A_G  4  0       1 629218     A     G  1|0    0
  CHROM.y  POS.y REF.y ALT.y GT.y cM.y CHROM    POS REF ALT  GT cM
1       1 629218     A     G  1|0    0     1 629218   A   G 1|0  0
2       1 629218     A     G  1|0    0     1 629218   A   G 1|0  0
3       1 629218     A     G  1|0    0     1 629218   A   G 1|0  0
4       1 629218     A     G  1|0    0     1 629218   A   G 1|0  0
5       1 629218     A     G  1|0    0     1 629218   A   G 1|0  0
6       1 629218     A     G  1|0    0     1 629218   A   G 1|0  0

out2= run_numbat(mat2, ref2, df_allele, genome="hg38", gamma= 1, t=1e-5, ncores=16, ncores_nni=16, plot=TRUE,min_LLR=30, skip_nj=T, nu=0, out_dir='./test2', n_cut = 3, max_entropy=0.8)

Numbat version: 1.3.1
Running under parameters:
t = 1e-05
alpha = 1e-04
gamma = 1
min_cells = 50
init_k = 3
max_cost = 108.6
n_cut = 3
max_iter = 2
max_nni = 100
min_depth = 0
use_loh = auto
segs_loh = None
call_clonal_loh = FALSE
segs_consensus_fix = None
multi_allelic = TRUE
min_LLR = 30
min_overlap = 0.45
max_entropy = 0.8
skip_nj = TRUE
diploid_chroms = None
ncores = 16
ncores_nni = 16
common_diploid = TRUE
tau = 0.3
check_convergence = FALSE
plot = TRUE
genome = hg38
Input metrics:
362 cells
Mem used: 1.12Gb
Approximating initial clusters using smoothed expression ..
Mem used: 1.12Gb
number of genes left: 9867
running hclust...
Iteration 1
Mem used: 1.12Gb
High SNP contamination detected (71.5%). Please make sure that cells from only one individual are included in genotyping step.
Expression noise level (MSE): low (0.094).
Running HMMs on 4 cell groups..
Error in mutate(., state = run_joint_hmm(pAD = pAD, DP = DP, p_s = p_s, :
ℹ In argument: state = run_joint_hmm(...).
ℹ In group 1: CHROM = 1.
Caused by error in h():
! error in evaluating the argument 'x' in selecting a method for function 'rowSums': dpoilog: all x must be integers

Error in group_by():
! Must group by variables found in .data.
Column seg is not found.
Column sample is not found.
Run rlang::last_trace() to see where the error occurred.
Warning message:
In mclapply(bulks %>% split(.$sample), mc.cores = ncores, function(bulk) { :
all scheduled cores encountered errors in user code

If you can think of a solution and an explanation, it would be of great help. Thank you very much in advance.

Mayte

teng-gao commented 1 year ago

Hello,

Same problem as #112 - your count matrix should be raw integer counts.
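
As a minimal sketch of how one might confirm this before calling run_numbat, the snippet below checks whether a sparse count matrix holds only whole numbers. The toy matrix stands in for mat2, and the helper is_integer_valued is a hypothetical name, not part of Numbat. The rounding step at the end is only a stopgap shown for illustration, on the assumption that the values are near-integers like the 2.000244 entries above; regenerating raw integer counts from the pipeline is the safer route.

```r
library(Matrix)

# Toy stand-in for mat2 (assumption): genes x cells, with non-integer entries
toy <- sparseMatrix(
  i = c(1, 2, 2, 3),
  j = c(1, 1, 2, 3),
  x = c(1, 2.000244, 5.002443, 7.005133),
  dims = c(3, 3),
  dimnames = list(c("A2ML1", "A4GALT", "AAAS"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical helper: TRUE only if every stored (non-zero) value is a whole number
is_integer_valued <- function(m) all(m@x == round(m@x))

# Reports whether the matrix would trip the "all x must be integers" check
print(is_integer_valued(toy))

# Stopgap only: round the stored values to the nearest integer,
# then drop any entries that became explicit zeros
toy_int <- toy
toy_int@x <- round(toy_int@x)
toy_int <- drop0(toy_int)
print(is_integer_valued(toy_int))
```

Running the same check on the real matrix, for example is_integer_valued(mat2), shows whether the input needs to be regenerated before the HMM step.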