lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

how to get the number of mismatches per read with the help of sam2tsv ? #63

Closed WinterLi1993 closed 8 years ago

WinterLi1993 commented 8 years ago

Hi, I am using your tool and I get output like this:

#READ_NAME      FLAG  CHROM  READ_POS  BASE  QUAL  REF_POS  REF  OP
NJv8086101:4:0  163   1      0         T     ?     10001    T    M
NJv8086101:4:0  163   1      1         A     <     10002    A    M
NJv8086101:4:0  163   1      2         A     <     10003    A    M
NJv8086101:4:0  163   1      3         C     ?     10004    C    M
NJv8086101:4:0  163   1      4         C     @     10005    C    M
NJv8086101:4:0  163   1      5         C     A     10006    C    M
NJv8086101:4:0  163   1      6         T     F     10007    T    M
NJv8086101:4:0  163   1      7         A     ?     10008    A    M
NJv8086101:4:0  163   1      8         A     @     10009    A    M
NJv8086101:4:0  163   1      9         C     C     10010    C    M
NJv8086101:4:0  163   1      10        C     A     10011    C    M
NJv8086101:4:0  163   1      11        C     A     10012    C    M
NJv8086101:4:0  163   1      12        T     D     10013    T    M
NJv8086101:4:0  163   1      13        A     >     10014    A    M
NJv8086101:4:0  163   1      14        A     >     10015    A    M
NJv8086101:4:0  163   1      15        C     A     10016    C    M

How can I count the number of mismatches for each read from this output?

lindenb commented 8 years ago

Use awk to find the rows where BASE differs from REF, then count those rows per read.
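
For readers who prefer a script to an awk one-liner, the same comparison can be sketched in Python. This is only an illustration, not part of jvarkit: the function name `count_mismatches` is hypothetical, and it assumes the header and column names shown in the question (other sam2tsv versions may name or order the columns differently), whitespace-separated fields, and a non-empty placeholder in every column. It also assumes that only aligned positions (OP of `M`, `=` or `X`) should count as mismatches, and it keys on read name plus FLAG so that the two mates of a pair are tallied separately.

```python
# Default layout, taken from the header line shown in the question (an assumption
# for any file that lacks its own header).
DEFAULT_COLUMNS = ["READ_NAME", "FLAG", "CHROM", "READ_POS", "BASE",
                   "QUAL", "REF_POS", "REF", "OP"]

# CIGAR operators that place a read base opposite a reference base.
ALIGNED_OPS = {"M", "=", "X"}


def count_mismatches(lines):
    """Return {(read_name, flag): mismatch_count} for sam2tsv-style rows."""
    columns = DEFAULT_COLUMNS
    counts = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0].startswith("#"):
            # Header line: use it to locate the columns by name.
            columns = [fields[0].lstrip("#")] + fields[1:]
            continue
        row = dict(zip(columns, fields))
        key = (row["READ_NAME"], row["FLAG"])
        # Register the read even if it turns out to have no mismatches.
        counts.setdefault(key, 0)
        if row["OP"] in ALIGNED_OPS and row["BASE"].upper() != row["REF"].upper():
            counts[key] += 1
    return counts
```

Passing an open file handle of saved sam2tsv output, or `sys.stdin` when piping, is enough, since the function only iterates over lines. For the 16 rows quoted in the question every BASE equals its REF, so the read would be reported with a count of 0.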