raphael-group / chisel

CHISEL -- Copy-number Haplotype Inference in Single-cell by Evolutionary Links
BSD 3-Clause "New" or "Revised" License

another: AssertionError: There is a bin with a BAF shift > 0.5, likely BAF was not mirrored between 0 and 0.5 #20

Closed rLannes closed 3 years ago

rLannes commented 3 years ago

Hi, Thank you for having made such a package!

I am struggling to make it work properly. I have seen two posts related to the error I get, but they do not seem to apply to my case: I am launching it from the running directory, and I do not have a non-autosomal chromosome in my phased data.

Results of cat */log > chisel.log: chisel.log
Results of sort -rk10 combo/combo.tsv > combo_file_head.tsv: combo_file_head.txt
Results of head baf/baf.tsv > head_baf.txt: head_baf.txt

In the baf file there are only 3 chromosomes: chr6, chr11, and chr14.

Could you help me? My data are 10X data.

Best regards, Romain

simozacca commented 3 years ago

Thanks for your interest in CHISEL. I would be happy to help you with this issue.

First, CHISEL currently requires you to provide SNPs for every chromosome that you ask it to consider. If your SNPs cover only some chromosomes, you should run CHISEL restricted to those chromosomes. In your case, if SNPs are provided only for chr6, chr11, and chr14, you should add the flag -c "chr6 chr11 chr14" to the chisel command, as specified in the corresponding guide.
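As a rough illustration of how the value for -c could be derived from the data rather than typed by hand, here is a minimal Python sketch. It assumes that the first whitespace-separated column of each row in baf/baf.tsv is the chromosome name, which is an assumption about the file layout rather than something confirmed in this thread. The helper names chromosomes_in_baf and chromosome_flag are hypothetical and are not part of CHISEL.

```python
import re
from typing import Iterable, List


def chromosomes_in_baf(lines: Iterable[str]) -> List[str]:
    """Collect the distinct chromosome names found in BAF-style rows.

    Assumption: the chromosome name is the first whitespace-separated
    field of every non-empty, non-comment line.
    """
    seen = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        seen.add(line.split()[0])

    def natural_key(name: str):
        # Sort chr6 before chr11 by comparing the trailing number as an int;
        # names without a trailing number (e.g. chrX) go last.
        match = re.search(r"(\d+)$", name)
        return (0, int(match.group(1)), name) if match else (1, 0, name)

    return sorted(seen, key=natural_key)


def chromosome_flag(chromosomes: List[str]) -> str:
    """Format the chromosome list as the -c argument quoted in the reply."""
    return '-c "{}"'.format(" ".join(chromosomes))


# Made-up example rows; the columns after the chromosome are placeholders.
example_rows = [
    "chr11\t2000\tCELL1\t2\t2",
    "chr6\t1000\tCELL1\t3\t1",
    "chr14\t3000\tCELL2\t0\t4",
    "chr6\t1500\tCELL2\t1\t1",
]

print(chromosome_flag(chromosomes_in_baf(example_rows)))
# prints: -c "chr6 chr11 chr14"
```

To run it on a real file, the rows can be read with open("baf/baf.tsv") and passed to chromosomes_in_baf in place of example_rows.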

Second, your log message suggests that you provided only 4264 phased SNP positions, which is far fewer than expected for human whole-genome sequencing data (the expected minimum number of heterozygous SNPs is at least 350 times higher). This number suggests that something is not right. Could you please confirm that your data come from single-cell whole-genome DNA sequencing? Or are you running data from a different sequencing technology, such as scRNA?
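To make the scale of that gap concrete, the following Python sketch works through the arithmetic and shows one way a quick sanity check could look before launching a run. The figures 4264 and 350 come from the reply above. The function names and the idea of counting non-comment lines in a phased SNP file are illustrative assumptions, not CHISEL functionality.

```python
from typing import Iterable

OBSERVED_PHASED_SNPS = 4264   # number reported in the log, per the reply above
MIN_FOLD_INCREASE = 350       # "at least 350 times more than this"


def expected_minimum(observed: int, fold: int = MIN_FOLD_INCREASE) -> int:
    """Rough lower bound on heterozygous SNPs implied by the reply."""
    return observed * fold


def count_snp_records(lines: Iterable[str]) -> int:
    """Count SNP records, assuming one record per non-empty, non-comment line."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


def looks_like_whole_genome(n_snps: int, minimum: int) -> bool:
    """Flag inputs whose SNP count is far below the whole-genome expectation."""
    return n_snps >= minimum


minimum = expected_minimum(OBSERVED_PHASED_SNPS)
print(minimum)  # prints: 1492400
print(looks_like_whole_genome(OBSERVED_PHASED_SNPS, minimum))  # prints: False
```

In other words, the reply implies roughly 1.5 million heterozygous SNPs as a minimum, so a count in the low thousands points to targeted data rather than whole-genome data.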

rLannes commented 3 years ago

Okay,

You are quite right. The reference data are from amplicon sequencing, not whole-genome sequencing. I am closing this issue because it is data-related. Thank you for your quick answer.

Best regards,