Closed Griffan closed 5 years ago
Hi, it might be related to the warning: "RuntimeWarning: divide by zero encountered in log". Before that you can try a few diagnoses: 1) change to single CPU mode by setting -p 1, see if you see the same error. 2) try it in Python 3 environment, which seems not likely the case though
Alternatively, I wonder which species you are working on. If human, I would normally pile up a list candidate SNPs, e.g., from 1000 genome project, to avoid potential RNA editing variants contamination.
I tried all of these suggestions, none of them helped. Would you mind to explain a little bit what this part of code exactly do?
one possibility is that your reads have some bases with lowest quality, hence Q=0: https://support.illumina.com/help/BaseSpace_OLH_009008/Content/Source/Informatics/BS/QualityScoreEncoding_swBS.htm
I'm not sure if this is the case? can you provide more details on the logs? is this failed from the begining or it runs a while and failed in the middle?
The log I provided in the very beginning is all the information it outputs. If Q=0 is the case, is it possible to release a hot-fix to defend this situation?
OK, I have added a minQ
to avoid 0 value:
https://github.com/huangyh09/cellSNP/blob/master/cellSNP/utils/pileup_utils.py#L100
Could you re-install from this repo (not on pypi yet), and see if it works? You may need to pip uninstall cellSNP
first.
I double checked the code, after pip uninstall and pip install, the line you modified didn't sync with your branch. I manually added this line, I will try again.
After trying various ways to update and confirm that the code has included this patch, cellSNP still complains the very beginning message:
[cellSNP] mode 1: fetch given SNPs in 23111 single cells. [cellSNP] loading the VCF file for given SNPs ... [cellSNP] fetching 36312488 candidate variants ... Traceback (most recent call last): File "/net/1000g/fanzhang/bin/miniconda2/bin/cellSNP", line 9, in
load_entry_point('cellSNP==0.1.4', 'console_scripts', 'cellSNP')() File "/net/1000g/fanzhang/bin/miniconda2/lib/python2.7/site-packages/cellSNP-0.1.4-py2.7.egg/cellSNP/cellSNP.py", line 209, in main min_MAPQ, max_FLAG, min_LEN, doubletGL, True) File "/net/1000g/fanzhang/bin/miniconda2/lib/python2.7/site-packages/cellSNP-0.1.4-py2.7.egg/cellSNP/utils/pileup_utils.py", line 240, in fetch_positions qual_list, cell_list, UMIs_list, barcodes) File "/net/1000g/fanzhang/bin/miniconda2/lib/python2.7/site-packages/cellSNP-0.1.4-py2.7.egg/cellSNP/utils/pileup_utils.py", line 309, in map_barcodes qual_cells[_idx][BASE_IDX[_base]] += qual_vector(_qual) File "/net/1000g/fanzhang/bin/miniconda2/lib/python2.7/site-packages/cellSNP-0.1.4-py2.7.egg/cellSNP/utils/pileup_utils.py", line 101, in qual_vector RV = [np.log(1-Q), np.log(3/4 - 2/3Q), np.log(1/2 - 1/3Q), np.log(Q)] RuntimeWarning: divide by zero encountered in log
I think this means Q also should be banned from more values. is it possible to explain how do these variables map to your paper formula? Thanks!
That's very strange. I have given a lower bound 0.00001. This is a developing feature and has been used for the donor deconvolution.
In principle, you can turn off some of these codes for diagnosis. Alternatively, you can send me a few reads in your bam file, and I may unsterstand this issue better. Thanks.
I mean it is not necessarily because of the term np.log(Q), I think np.log(1-Q), np.log(3/4 - 2/3Q), np.log(1/2 - 1/3Q) also potentially could cause the problem, especially log(1-Q), because base quality 1 is also very common.
Hello, I encountered this error, what could be the reason?