Closed Pitithat-pu closed 6 years ago
Thanks for the bug report
I'm currently looking at this. To speed up debugging, can you please give me the name of ONE read that shouldn't appear here? I'm away from my lab and I don't have a REF genome here. Thanks.
Ah! Unless I did not understand the original question https://www.biostars.org/p/322664/:
"I want to extract only reads with their mate that support variant allele in the vcf"
What is your query?
1) if the read AND its mate have the variant, dump the pair
2) if the read OR its mate have the variant, dump the pair (what happens if one read overlaps two SNPs while the mate overlaps only one?)
I want both.
If the read AND its mate have the variant, dump the pair. This is possible because some DNA fragments are short, so both reads of a pair can overlap the variant position.
If the read OR its mate have the variant, dump the pair. Yes, dump the pair.
What happens if one read overlaps two SNPs while the mate overlaps only one? Also dump the pair.
Sorry, I'm also away from my lab, so I can't give you the name of a read that shouldn't appear. The reads that should not appear are the reads with the REF base at the variant position. Thanks
I've quickly worked on this without being able to identify a read failing the criteria. I've added a --pair option where the read and the mate must both carry the mutation.
I'm waiting for your input to give me the name of a read. Thanks.
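For reference, the two selection rules discussed above (read OR mate carries the variant, versus read AND mate as with --pair) can be sketched in a few lines of Python. This is only an illustration of the logic, not the code used in jvarkit; the function name `select_fragments` and the fragment names are made up for the example.

```python
from typing import Dict, List


def select_fragments(carries_alt: Dict[str, List[bool]],
                     require_both: bool = False) -> List[str]:
    """Return the names of the read pairs to dump.

    carries_alt maps a read name to one flag per read of the pair:
    True if that read shows the ALT allele at a variant position.
    require_both=False is the OR rule, require_both=True the AND rule.
    """
    selected = []
    for name, flags in carries_alt.items():
        keep = all(flags) if require_both else any(flags)
        # 'flags and ...' guards against an empty list, for which all() is True
        if flags and keep:
            selected.append(name)
    return sorted(selected)


# Hypothetical fragments, for illustration only
pairs = {
    "fragA": [True, True],    # both reads overlap the SNP and show ALT
    "fragB": [True, False],   # only one read shows ALT
    "fragC": [False, False],  # REF only
}
print(select_fragments(pairs))                     # OR rule
print(select_fragments(pairs, require_both=True))  # AND rule
```

With the OR rule a short fragment whose two reads both cover the SNP is kept as soon as one of them shows ALT; the AND rule keeps only fragments where every read shows it.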
I played with your data this morning and my latest version:
This is the output of 'samtools tview' before filtering:
43354131 43354141 43354151 43354161 43354171 43354181 43354191 43354201
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
g gtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
gc TGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
gct TGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCCATGGAGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCC GGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGA cgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGA CACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
gctgtgccgggaggagc accaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
gctgtgccgggaggagcgca gtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
gctgtgccgggaggagcgcacc ctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
gctgtgccgggaggagcgcacc CGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAA CGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAG CGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAG CGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGC agtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGC GGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGC GGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
gctgtgccgggaggagcgcac AGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
gctgtgccgggaggagcgcaccaagtctgcg ggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAG CGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
gctgtgccgggaggagcgcaccaagtctgcgagc cggccaatggtgaggctgggaatgctggccaggacgcagagtg
gctgtgccgggaggagcgcaccaagtctgcgagc ggccaatggtgaggctgggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGG ggccaatggtgaggctgggaatgctggccaggacgcagagtg
gctgtgccgggaggagcgcaccaagtctgcgagcaggg GCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCA tgaggctgggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAG gccaatggtgaggctgggaatgctggccaggacgcagagtg
gctg GCTGGGAGAAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
gctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatgg gggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGT GAATGCTGGCCAGGACGCAGAGTG
GCTGTGCC ggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGT GAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGT GCTGGCCAGGACGCAGAGGG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGG GGCCAGGACGCAGAGTG
gctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgagg GGCCAGGACGCAGAGTG
gctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggc GGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCT GCCAGGACGCAGAGTG
GCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCT gccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGG CCAGGACGCAGAGTG
GCTGTGCCGGGA gagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagtg
gctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctggg aggacgcagagtg
gctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtg aggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAA aggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGC GGACGCAGAGTG
TCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGAC CAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGAC cagagtg
cctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaattctggccaggacg g
gctgtgccgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagag g
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAG g
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGAC cagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCA gaggctgggaatgctggccaggacgcagagtg
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
GCTGTGCCGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTG
then
$ java -jar dist/biostar322664.jar -V minivcf.vcf.gz minisam.query.bam | samtools sort -T tmp -O bam -o minisam.biostar.bam - && samtools index minisam.biostar.bam
tview
43354131 43354141 43354151 43354161 43354171 43354181 43354191 43354201
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
GKCTGTGCTGGGAGRAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGT
GGCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCT
GTCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGAC
GGCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGT
ggctgtgctgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagt
GCTGGGAGAAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGT
ctgggaggagcgcaccaagtctgcgagcaggggccggccaatggtgaggctgggaatgctggccaggacgcagagt
As far as I can see below, I expected 6 rows of reads + the consensus line for the allele 'T':
$ java -jar dist/sam2tsv.jar minisam.coord.bam | grep 43354136 | cut -f 5 | sort | uniq -c
58 C
6 T
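The count above asks which base each read carries at reference position 43354136. As a rough sketch of that lookup (not how sam2tsv is implemented), one can walk the CIGAR string of an alignment to find the read base aligned to a given position. The helper name `base_at` is hypothetical; the example values are taken from the reverse read of SRR5229653.1238155 in this thread.

```python
import re
from typing import Optional

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(pos: int, cigar: str, seq: str, ref_pos: int) -> Optional[str]:
    """Return the read base aligned to the 1-based reference position ref_pos.

    pos, cigar and seq are the POS, CIGAR and SEQ columns of a SAM record.
    Returns None if the read does not cover ref_pos, or covers it only with
    a deletion or a skipped region.
    """
    ref = pos  # 1-based reference coordinate of the next aligned base
    idx = 0    # 0-based index of the next base in seq
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= ref_pos < ref + n:
                return seq[idx + (ref_pos - ref)]
            ref += n
            idx += n
        elif op in "IS":           # consumes the read only
            idx += n
        elif op in "DN":           # consumes the reference only
            if ref <= ref_pos < ref + n:
                return None
            ref += n
        # H and P consume neither the reference nor the read
    return None


seq = ("AGGCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGG"
       "TGAGGCTGGGAATGCTGGCCAGGACGCAGAGTGTGGCCGTGCTGTGGG")
print(base_at(43354127, "100M", seq, 43354136))  # T
```

Counting the returned bases over all reads with `collections.Counter` would give the same kind of tally as the `sort | uniq -c` pipeline.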
Thank you, your program can now find reads that support the variant allele, but it still doesn't report the read's mate. For example, for the read name SRR5229653.1238179 the forward read supports the variant allele, so the reverse read SRR5229653.1238179 should be dumped as well. So in total there are 5 read pairs to report from this minisam: SRR5229653.1238236, SRR5229653.1238161, SRR5229653.1238179, SRR5229653.1238155, SRR5229653.1238185.
PS: the actual objective is to find the DNA fragments that support the variant allele.
Thanks a lot
Ah I see, it was a problem with the way I was comparing the read names. I hope it's fixed in https://github.com/lindenb/jvarkit/commit/8e93bba47ac192eb96e3c11dd661ac82138e3147
A test :
$ make biostar322664 && java -jar dist/biostar322664.jar -V minivcf.vcf.gz minisam.query.bam -X c | samtools sort -T tmp -O bam -o minisam.biostar.bam - && samtools view minisam.biostar.bam | grep -E 'SRR5229653\.(1238236|1238161|1238179|1238155|1238185)' | sort -t $'\t' -k1,1
SRR5229653.1238155 163 1 43353943 60 100M = 43354127 284 CTTCCCGCTCTGGGTTCGGCTCTTCTCTCGCAGGCCGCGTTTCTCAGCCAGGCTTAGGGGAATCCCTCGAAGCACGTGGTCCCGCTGCGCCACAGCCAGG @B=BBB:CDACCBA>AA;ABBBB@CBBBA;AABBBA5C;>AACCC@DCAAC@CB@@DCCCA?@CBBCE=>BEEBC=@BD@DDD=EDEF>FECDCEEB??@ MC:Z:100M BD:Z:JJKLQMMPPLLNMJMLLMLNOKKKLKKKKMNOONNNKNLNLCLKKMOPNNONNOKKNNJJLLKOMJMKMOLLPOJLNJMNPNKLOPOOMOPPLMRTNNON MD:Z:100 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:NNNNPLNPRMOPNKPMMNOPQLNMMNLNLNOPPNPNMONPMEMNLOPQNOPNPQMNONKKOMNONKPLNOMNQPNOQNOQQOLNPRQRPQPRPPTUOPQO NM:i:0 MQ:i:60 AS:i:100 XS:i:37
SRR5229653.1238155 83 1 43354127 60 100M = 43353943 -284 AGGCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTGTGGCCGTGCTGTGGG CBADADCFFCEEFDEFDF>CA@CDAECBEBE=EDEDDDDABB<ADBB@@B@ABBC@BCAAAB@@BCBB@BBBBAB?;ACBBBABABABB;CCDDCBA?>@ MC:Z:100M BD:Z:MNPONNRSQPLOLNNLPOMOKNMLKMNLOONMKONNMJJNNLKNNMLKMNMJMKMNPONJMLKMOPONNNMNMMONLNNKKMJJJNNOMMKPRQMNNJJJ MD:Z:9C90 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:QORQQQTTSRMPNROMRPOQNQOONOONPPONLQPPPKKNPOMNPNNLOONMOLPNQPOKNLLOOQPOMONOONPPNPPNLOMMMONPONMPQQNOPLNN NM:i:1 MQ:i:60 AS:i:95 XS:i:20 Xc:Z:1|43354136|C|T
SRR5229653.1238161 163 1 43354104 60 100M = 43354135 131 CCAGCTTCAGGTCGTACAGACGCAGTCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGAC >A=ABCACCAB>C;@@ABCAA;BBC?BBB@BBBBAAACA@CB;BBA@BAC?CCCB;ADBADDCDED=DEBBAADD?DBEDEDDDEBBBEFEDEFDACA@B MC:Z:100M BD:Z:JJNOTRMNNPOMOMNMMJOLNLNOONOKNJJNONMJLMNLMPLNOJMNKLNOKNNLOMPOONJJNNKLNNNKKNMMJNMOOPONKMMLOOQPOQQRONLN MD:Z:25G6C67 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:NNPQSRNNPQOPPNPONMPNONOPPPPNPMMPQPNKONNONQNOPMNOLNPPNPPNONQPPNKKPNMOPNOLNPOQNQOOQRQOLPNPRRSSQSRSQOPP NM:i:2 MQ:i:60 AS:i:90 XS:i:19 Xc:Z:1|43354136|C|T
SRR5229653.1238161 83 1 43354135 60 100M = 43354104 -131 CTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTGTGGCCCTGCTGTGGGTGCCCTCG D@ACDEEFEF>FEAEEBEDCFCF=EEEDDDDDED=BCCDBABC@CCDABCB@@B@@BBCB@BABBAC?;BBBBBABABAB@@*BBCBBBABCCDCBB>8@ MC:Z:100M BD:Z:ONJMOQPNQPNPKNNMLNOMPPNMKONNMJJNNLKNNMLKMNMJMKMNPONJMLKMOPONNNMNMMONLNNKKMJJJNNNJNOOPOJKOKNKQPMRMOJJ MD:Z:1C80G17 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:QPLOPTROTRPROROONPPOQQPOMQPPPKKNPOMNPNNLOONMOLPNQPOKNMLOPQPONONPONPPMOPMKOMMMONPKNPPQPMMOKNMPQLPOPNN NM:i:2 MQ:i:60 AS:i:93 XS:i:20 Xc:Z:1|43354136|C|T
SRR5229653.1238179 163 1 43354119 60 100M = 43354180 161 ACAGACGCAGGCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTGTGGCCGT @@=BAB:CCABCCB?CBBBAA?CBACB;BBA@7@C?CCCB;@BCBCBBBDC<CDCA@@CC=AA@DEDBDD@A@CBDDDEDBDDBD=ECFCF@D?ACC?;= MC:Z:100M BD:Z:JJJOPQNPPPONONJJNONMJLMNLMPLNOJMNKLNOKNNLOMPOONJJNNKLNNNKKNMMJNMNNONMJLLKNNONMNOOPOMOMOPPMOPLMMQNNKN MD:Z:17C82 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:NNNQPPOPQQOPQPMMPQPNKONNONQNOPMNOLNPPNPPNONQPPNKKPNMOPNOLNPNPMPNNPQPNKOMNPQRQOQOPQOPPOPRRPPSPPQRQONQ NM:i:1 MQ:i:60 AS:i:95 XS:i:20 Xc:Z:1|43354136|C|T
SRR5229653.1238179 83 1 43354180 60 100M = 43354119 -161 GAGGCTGGGAATGCTGGCCAGGACGCAGAGTGTGGCCGTGCTGTGGGTGCCCTCGGGCCCCAAGAGTGTCTGCTGGCCACTCGCTGTGGCCACCACCCCT DBADDCEEDBBCFFCEFEEEEFA=EEDEDCBCBDEC5CADEAC?CCB@DCCC@:AACAAAB@BBBABAACBCA?@BBA?CA;BCBABABBC>BC=BA?B@ MC:Z:100M BD:Z:KMNPSRMPNMOQQPOOONONNPNLNNKKMJJJNNNLLJOPOJJNJMJONJNMOKJNNJJMLKKKMJJNLOOPONNNMJNMOLPOJJNOONKNOLPNJNJJ MD:Z:100 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:MQORTSOQPOQRSRPOQOQQOQQOQPNLOMMMONPONMPQPMMOKNMPPKNNOMKNPKKNNLNLNMMOMOPPOONPNMPNONQPMMONPNMPNNQMLONN NM:i:0 MQ:i:60 AS:i:100 XS:i:0
SRR5229653.1238185 147 1 43354199 60 100M = 43354134 -165 AGGACGCAGAGTGTGGCCGTGCTGTGGGTGCCCTCGGGCCCCAAGAGTGTCTGCTGGCCACTCGCTGTGGCCACCACCCCTCTCCCTCTGCTACAGCTCC CBB>=EEEFEDCC@CBD=CBEEBCBDDBBDDDEA=DDDDCBC@CDCBBBACBCCBACBC>CA;BCBAB@BAB=BC>AABA@C@BACAABCC??CCCB=A@ MC:Z:100M BD:Z:MMONPQQMMOKKKOOOMMKPQOJJNJMJONJNMOKJNNJJMLKKKMJJNLOOPONNNMJNMOLPOJJNNNMJMMJMJJNMLMLJNMLOOQONLPRTMLJJ MD:Z:100 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:QOQQRTSQOQOOOPOQPONQRQNNPLNMPPKNNOMKNPKKNNMNLOMMONPPQPONPNMPNONQPMMONPNMPNMPKKNNNNOKNNNPPRPPNQRSOPNN NM:i:0 MQ:i:60 AS:i:100 XS:i:0
SRR5229653.1238185 99 1 43354134 60 100M = 43354199 165 GCTGGGAGAAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCTGGGAATGCTGGCCAGGACGCAGAGTGTGGCCGTGCTGTGGGTGCCCTC @A@BBCBD3@CC;BBA@BAC@BBBB;ACBBCBABB@;BCABA@CB>CADCDCCCCA>ACEDDDEDBEDBC=EAECC>D@DCEE>@EFEEAEEE?B>A?BD MC:Z:100M BD:Z:JJONQMNOMMMQMNOJMNKLNOKNNLOMPOONJJNNKLNNNKKNMMJNMNNONMJLLKNNONMNNNONLNLNOOLMNJKKNOOLOKOPPLLOMPNRNJMK MD:Z:2C5G91 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:NNRQPLPNNMNQNOPMNOLNPPNPPNOMPPOMKKPMMOONOLNPNPMPNNPQPNKOMNPPQPNPNOPNOONOPPNOQNNNOQONQNRSROPQNTQTOLQM NM:i:2 MQ:i:60 AS:i:92 XS:i:20 Xc:Z:1|43354136|C|T
SRR5229653.1238236 145 1 43354308 60 100M = 43354086 -322 TGGGCCTCCCAGCGGCTCTGCTCTTGGATGAGCAAGTGGAAGGAGTAGTGCATTTCAGTCTCATCGTAGGGGCTGGGCTCCTGGCTGGGAGGCGCCAAGG ?BACDECEEEEF>EEFBEBEEBEABDEACDDDCBDCCCDADADBB@BABBB@AAABBAACAB@A;A@B@@@BCB@@BCAACB@BCB@ACCAA:CBC@?A@ MC:Z:100M BD:Z:NJNNRPOLOPPOLOQNMPPQNLLKNMOMMKONLKMJNMLKMMKMMMMJONNKBLNNMNLMNNNOLMMMJJNPONJNPMLNONNPONJMKNOONPPPKMJJ MD:Z:100 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:PLOQRRRNQRSQOOROOQQROOOMPOOOOLQPNMOMONMMPNLONOOMPPPNEMPPOONNPPPONNOPKKNQPOKNQNONPONQPOKNLQOPOQOPNQNN NM:i:0 MQ:i:60 AS:i:100 XS:i:0
SRR5229653.1238236 97 1 43354086 60 100M = 43354308 322 CCTCCCAGGACGTGGGGTCCAGCTTCAGGTCGTACAGACGCAGGCTGTGCTGGGAGGAGCGCACCAAGTCTGCGAGCAGGGGCCGGCCAATGGTGAGGCT @A@BBCCDCBB;?CBAA>AABCBB@BBCA?C;?@ABCAA;BBCABBC@CCBCBBADABDE=EBCDBAE@EDDE=CEECFEEEFE>EFC@@AEC?D?DAAC MC:Z:100M BD:Z:JJMKQMPQOMOMOJMJJMOMNOPOKLMONMOMNMMJOLNLNOONNONJJNONMJLMNLMPLNOJMNKLNOKNNLOMPOPOKKOOLMOOPMMPPPNRMNNO MD:Z:50C49 PG:Z:MarkDuplicates RG:Z:run_CLL004-P6 BI:Z:NNQMPLPPNOONPMNKKPPNOPQQMMOOMPOMPONLPNNNOPPNPQPMMPQPNKONNONQNOPMNOLNPPNPPNOORQQOLLQONPRPQNQSQTQTOOQR NM:i:1 MQ:i:60 AS:i:95 XS:i:20 Xc:Z:1|43354136|C|T
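In the output above, the reads carrying the variant have an extra attribute such as Xc:Z:1|43354136|C|T, while their mates do not. A small Python sketch for pulling that attribute out of a SAM line could look like the following. It assumes the value always has the four '|'-separated fields seen here (contig, position, REF, ALT); other layouts are simply skipped. The names `VariantTag` and `parse_variant_tags` are made up for the example, and SEQ/QUAL are replaced by '*' to keep the line short.

```python
from typing import List, NamedTuple


class VariantTag(NamedTuple):
    contig: str
    pos: int
    ref: str
    alt: str


def parse_variant_tags(sam_line: str, tag: str = "Xc") -> List[VariantTag]:
    """Extract contig|pos|ref|alt values stored under the given SAM tag."""
    fields = sam_line.rstrip("\n").split("\t")
    found = []
    for field in fields[11:]:  # optional fields start at column 12
        parts = field.split(":", 2)
        if len(parts) != 3:
            continue
        name, typ, value = parts
        if name != tag or typ != "Z":
            continue
        items = value.split("|")
        if len(items) == 4:    # assumed layout: contig|pos|ref|alt
            found.append(VariantTag(items[0], int(items[1]), items[2], items[3]))
    return found


# Shortened record based on SRR5229653.1238155 (SEQ and QUAL replaced by '*')
line = "\t".join([
    "SRR5229653.1238155", "83", "1", "43354127", "60", "100M",
    "=", "43353943", "-284", "*", "*", "NM:i:1", "Xc:Z:1|43354136|C|T",
])
print(parse_variant_tags(line))
```

Grouping records by read name and checking that at least one read of each pair yields a non-empty list is one way to verify that every dumped fragment supports the variant.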
Yes, that is what I want. Now I can find the DNA fragments that support the variant allele. It would be great if the program also supported indel variants, but I am quite happy now.
Thanks a lot
Great, furthermore it was a tool I needed. Please mark https://www.biostars.org/p/322664/ as answered.
Verify
Subject of the issue
Regarding the Biostars question https://www.biostars.org/p/322664/, I still find that your program also reports reads that don't support the variant allele.
Your environment
java -jar dist/biostar322664.jar --version 5f6b66bc05201d2d543e1b1214640dd5c84051f8
java version "1.8.0_40"
/cluster_name/13.1/x86_64/jdk/jdk1.8.0_40
openSUSE 13.1 (Bottle) (x86_64)
Steps to reproduce
$ java -jar dist/biostar322664.jar -V minivcf.vcf minisam.sorted.bam
minisam.sorted.bam.zip minivcf.vcf.zip
Expected behaviour
The program should report only reads that support the variant allele.
Actual behaviour
The program reports all reads aligning at variant location.