Closed tolot27 closed 7 years ago
It's hard to diagnose without digging into the data, but let me at least explain how this works. When Pilon detects something that may be a local misassembly, it attempts to find reads from the region (along with their mates, if paired) and does a mini-reassembly. So it's not really counting read evidence; it's building a k-mer graph from the reads and pruning small-minority branches, which are likely caused by sequencing errors. If it can't re-assemble the suspicious region and get continuity across it, it will either leave it alone or open a gap (if "--fix breaks" is specified).
I have "breaks" turned off by default for assembly improvement because it can be fooled into falsely opening gaps, though it is on by default for variant calling because it's an important way of detecting large insertions or deletions.
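To make the k-mer graph idea above a bit more concrete, here is a minimal Python sketch of building a graph from reads and pruning small-minority branches. It is not Pilon's code (Pilon is written in Scala), and everything in it is an assumption made for illustration: the function names, the 25% cutoff, and the simple walk that stops at any remaining fork.

```python
from collections import Counter, defaultdict


def build_kmer_graph(reads, k):
    """Count every (k-mer -> next k-mer) edge observed in the reads."""
    edges = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k):
            edges[read[i:i + k]][read[i + 1:i + k + 1]] += 1
    return edges


def prune_minority_branches(edges, min_fraction=0.25):
    """Drop outgoing edges carrying less than min_fraction of a node's weight.

    min_fraction is a hypothetical cutoff, not a value taken from Pilon.
    """
    pruned = {}
    for node, successors in edges.items():
        total = sum(successors.values())
        pruned[node] = {
            nxt: n for nxt, n in successors.items() if n / total >= min_fraction
        }
    return pruned


def walk(graph, start, max_steps=1000):
    """Extend from start while exactly one successor remains."""
    contig, node = start, start
    seen = {start}
    for _ in range(max_steps):
        successors = graph.get(node, {})
        if len(successors) != 1:
            break  # dead end or unresolved fork: continuity is lost here
        node = next(iter(successors))
        if node in seen:
            break  # avoid looping forever on a repeat
        seen.add(node)
        contig += node[-1]
    return contig


# Nine reads agree; one carries a single-base error (A -> T).
reads = ["ATGCGTACCTGA"] * 9 + ["ATGCGTTCCTGA"]
graph = build_kmer_graph(reads, k=4)
print(walk(graph, "ATGC"))                           # ATGCGT (stops at the fork)
print(walk(prune_minority_branches(graph), "ATGC"))  # ATGCGTACCTGA
```

In this toy case the error read creates a fork with 1 of 10 reads on the minority branch; once that branch is pruned, the walk spans the whole region. A fork where both branches survive pruning is the kind of situation in which continuity across the region would not be recovered.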
In my assembly there is a genomic region with a coverage of around 48X, and there are fifteen chimeric reads which do not map completely to that region. Unfortunately, Pilon breaks this region and introduces a gap there. As a result, the remaining two-thirds of the reads are no longer aligned correctly.
I suggest that breaks only be fixed if the majority of reads support the suggested fix.
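As a rough illustration of the rule proposed here, the check below uses the numbers from this report. The function name and the 50% threshold are hypothetical, and it assumes that the fifteen chimeric reads are the only ones supporting the break; Pilon does not expose such a check as far as this thread shows.

```python
def majority_supports_break(supporting_reads, total_reads, min_fraction=0.5):
    """Return True only if more than min_fraction of the reads support the break."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return supporting_reads / total_reads > min_fraction


# 15 chimeric reads out of roughly 48 covering the region.
print(15 / 48)                          # 0.3125
print(majority_supports_break(15, 48))  # False
```

Under this rule the region would be left intact, because fewer than a third of the reads point to a break.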