isovic / graphmap

GraphMap - A highly sensitive and accurate mapper for long, error-prone reads http://www.nature.com/ncomms/2016/160415/ncomms11307/full/ncomms11307.html Note: This was the original repository which will no longer be officially maintained. Please use the new official repository here:
https://github.com/lbcb-sci/graphmap2
MIT License

CIGAR and MD field mismatch #34

Closed ParithiB closed 7 years ago

ParithiB commented 7 years ago

Hi Ivan,

I am running GraphMap on nanopore E. coli reads and have run into a problem: the CIGAR string and the MD field don't match.

The total length of the deletion (D), mismatch (X) and equal-match (=) operations in the CIGAR string should equal the sum of the integers plus the number of bases [A, C, G, T] in the MD field (ignoring the '^' and any 0).
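For readers who want to reproduce the check described above, here is a minimal Python sketch. It is not the attached CIGAR_MD.py script; the function names are hypothetical, and it assumes extended CIGAR output (`=`/`X`) with the MD tag stored as `MD:Z:` in the optional SAM fields. A plain `M` operation, if present, is counted too, since it also covers reference bases that MD describes.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# One MD token: a run of matches, a deletion (^BASES), or a single mismatch base.
MD_RE = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def cigar_ref_length(cigar):
    """Total length of the =, X, D (and M) operations in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "M=XD")


def md_ref_length(md):
    """Sum of the integers plus the number of bases in an MD string."""
    total = 0
    for matches, deleted, mismatch in MD_RE.findall(md):
        if matches:
            total += int(matches)      # run of matching bases (0 adds nothing)
        elif deleted:
            total += len(deleted)      # deleted reference bases after '^'
        else:
            total += 1                 # a single mismatched reference base
    return total


def sam_line_consistent(line):
    """Return True/False for an aligned SAM record, or None if it cannot be checked."""
    fields = line.rstrip("\n").split("\t")
    cigar = fields[5]                  # CIGAR is the 6th mandatory SAM column
    md = next((f[5:] for f in fields[11:] if f.startswith("MD:Z:")), None)
    if cigar == "*" or md is None:     # unmapped read or no MD tag
        return None
    return cigar_ref_length(cigar) == md_ref_length(md)
```

Running `sam_line_consistent` over every non-header line of output.sam and collecting the records that return False would list the reads affected by the discrepancy.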

However, for a few files the CIGAR and MD totals disagree (most work fine). Two sample files are in this zip file: nanopore_reads.zip

I used poretools (0.5.1) to extract the FASTQ files from them.

Reference: ecoli_k12.fa (https://github.com/isovic/graphmap/files/448448/ecoli_k12.fa.txt)

Query: ./bin/Linux-x64/graphmap align --extcigar -r ecoli_k12.fa -d fastq_file -o output.sam

Here is the code I used to check this: CIGAR_MD.py.zip

Thanks,
Parithi

isovic commented 7 years ago

Hi Parithi! Thank you so much for reporting this. It was a very important bug that went unnoticed until your issue, and the data was very helpful. I reimplemented this part of the code and tested it, so it should be fine now. Try git pull; make modules; make. Sorry for the long delay, it was an extremely busy period for me. Best regards, Ivan.