barricklab / breseq

breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA resequencing data. It is intended for haploid microbial genomes (<20 Mb). breseq is a command line tool implemented in C++ and R.
http://barricklab.org/breseq
GNU General Public License v2.0

VCF Output for Insertions does not follow VCFTools standard #76

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When outputting insertions, gdtools GD2VCF writes the position of the 
previous base. VCFTools seems to expect the position after it (where the 
insertion goes). Specifically, when running vcf-consensus on the VCF produced 
by gdtools, every insertion causes an error like this:

The fasta sequence does not match the REF at 3610:166052. T(ACCTAA) in .fa, A 
in .vcf, frz=166028
 at /usr/local/bin/vcf-consensus line 18, <__ANONIO__> line 11.
    main::error("The fasta sequence does not match the REF at 3610:166"...) called at /usr/local/bin/vcf-consensus line 190
    main::apply_variant(HASH(0x1008040e8), ARRAY(0x1020022d0)) called at /usr/local/bin/vcf-consensus line 109
    main::do_consensus(HASH(0x1008040e8)) called at /usr/local/bin/vcf-consensus line 9

This can be worked around by adding 1 to the POS field of every VCF record 
that represents an insertion.
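
The workaround described above can be sketched in Python. This is only an illustration for VCF files written by the affected breseq version, not part of gdtools or VCFTools. The function name `shift_insertion_pos` is hypothetical, and the rule used to recognise an insertion (every ALT allele longer than REF) is an assumption.

```python
def shift_insertion_pos(vcf_lines, offset=1):
    """Yield VCF lines, adding `offset` to POS on records that look like insertions."""
    for line in vcf_lines:
        # Header lines and blank lines pass through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        newline = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]
        alts = fields[4].split(",")
        # Assumption: a record is an insertion when every ALT allele is longer than REF.
        if all(len(alt) > len(ref) for alt in alts):
            fields[1] = str(int(fields[1]) + offset)
        yield "\t".join(fields) + newline


record = "CM000488\t166052\t.\tA\tAT\t26.5\t.\tAF=1.0000\n"
fixed = list(shift_insertion_pos([record]))[0]
```

Applied to the record from this report, only the POS column changes (166052 becomes 166053). The shift should not be applied to VCF files from a release in which the bug has been corrected.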

What is the expected output? What do you see instead?

When outputting insertions, gdtools GD2VCF outputs:
CM000488    166052  .   A   AT  26.5    .   AF=1.0000
when the output that VCFTools expects is:
CM000488    166053  .   A   AT  26.5    .   AF=1.0000
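
The error message indicates that vcf-consensus compares the REF column with the FASTA sequence at POS. A minimal sketch of that kind of check follows. The helper `ref_matches` and the toy sequence are hypothetical and are not taken from VCFTools or from the CM000488 reference.

```python
def ref_matches(sequence, pos, ref):
    """Return True if `ref` equals the reference bases starting at 1-based `pos`."""
    start = pos - 1  # VCF POS is 1-based; Python string indices are 0-based
    return sequence[start:start + len(ref)].upper() == ref.upper()


# Toy reference (invented for illustration): T at position 3, A at position 4.
toy_reference = "GGTACC"

off_by_one = ref_matches(toy_reference, 3, "A")  # POS points one base too early
corrected = ref_matches(toy_reference, 4, "A")   # POS points at the REF base
```

In this toy case the first call fails in the same way as the reported record, where the FASTA has T at the given position and the VCF has A, and the second call succeeds after POS is increased by 1.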

What version of the product are you using? On what operating system?

breseq 0.24rc7

Please provide any additional information below.

Original issue reported on code.google.com by jonat...@allthestairs.com on 8 Aug 2014 at 3:28

GoogleCodeExporter commented 9 years ago
Thanks for catching this bug! I've corrected it for the next release. I don't 
regularly use the VCF output, so let me know if you encounter any other 
problems with it.

Original comment by jeffrey....@gmail.com on 8 Aug 2014 at 9:33