Pindel can detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data. It uses a pattern growth approach to identify the breakpoints of these variants from paired-end short reads.
In version 0.6.2, pindel2vcf switched from getline to reading the fasta file char by char. The changelog entry: "Version 0.6.2 [December 12th, 2014] Now robust against fasta files that have non-standard line lengths (C++'s getline does not work well on lines of over a million characters)"
Only the istream member getline has that issue, since it reads into a caller-supplied buffer of fixed size. std::getline grows the destination string as needed and has no problem with lines of any length. This patch restores the previous line-based code from the svn repo, but calls std::getline instead of the implicit istream getline.
For a ~400 Mb plant assembly with 1300 contigs, processing time dropped from 2043 seconds to 582 seconds.
For a ~400 Mb plant assembly with 14 contigs, processing time dropped from 154 seconds to 103 seconds.
For a 5 Mb bacterial assembly with 1 contig, the difference wasn't reliably detectable.