kevlar-dev / kevlar

Reference-free variant discovery in large eukaryotic genomes
https://kevlar.readthedocs.io
MIT License

Filter for contigs with ambiguous calls #360

Closed standage closed 5 years ago

standage commented 5 years ago

Occasionally a contig yields multiple distinct, equally optimal variant calls. Such contigs are rare, and each is usually associated with only 2 or 3 ambiguous calls. In some problematic cases, however, a single contig yields dozens or even hundreds of calls. With the other filters already in place, this doesn't appear to be a big issue with real data, but it does appear to be a bigger problem with simulated data.

Suggestion: filter out contigs that result in more than 10 ambiguous calls.
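
A rough sketch of what such a filter might look like, not kevlar's actual implementation. It assumes calls arrive as `(contig_id, call)` pairs; the function name `filter_ambiguous_contigs` and the `max_calls` parameter are hypothetical, and every call from a contig is counted toward the threshold.

```python
from collections import defaultdict


def filter_ambiguous_contigs(calls, max_calls=10):
    """Drop all calls from any contig that produced more than `max_calls` calls.

    `calls` is an iterable of (contig_id, call) pairs. This layout is an
    assumption made for illustration, not kevlar's internal representation.
    """
    # Group the calls by the contig they came from.
    by_contig = defaultdict(list)
    for contig_id, call in calls:
        by_contig[contig_id].append(call)

    kept = []
    for contig_id, contig_calls in by_contig.items():
        # A contig with too many calls is treated as unreliable and skipped
        # entirely, rather than keeping an arbitrary subset of its calls.
        if len(contig_calls) > max_calls:
            continue
        kept.extend((contig_id, call) for call in contig_calls)
    return kept
```

The comparison is strict, so a contig with exactly 10 calls is kept, matching the "> 10" threshold suggested above.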