mcveanlab / mccortex

De novo genome assembly and multisample variant calling
https://github.com/mcveanlab/mccortex/wiki
MIT License

Breakpoint caller should use novel kmers #13

Open noporpoise opened 9 years ago

noporpoise commented 9 years ago

The breakpoint caller should mark each kmer that it uses in a call. Unused kmers that are in a sample but not in the reference ("novel kmers") should then be used to seed a second breakpoint caller.
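As a rough illustration of the seeding step, here is a minimal Python sketch. It is not McCortex code: the helper names `kmers` and `novel_kmers`, the toy sequences and the use of plain string sets are all assumptions made for the example, and it ignores reverse complements / canonical kmers.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq (single strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(sample_kmers, ref_kmers, used_kmers):
    """Kmers seen in the sample, absent from the reference, and not
    already marked as used by the first breakpoint caller."""
    return sample_kmers - ref_kmers - used_kmers


# Toy data (hypothetical): the sample differs from the reference by one SNP.
ref = kmers("AACCGGTTACGTCA", 4)
sample = kmers("AACCGGATACGTCA", 4)
used = set()  # would be filled in by the first breakpoint caller

seeds = novel_kmers(sample, ref, used)
print(sorted(seeds))  # ['ATAC', 'CGGA', 'GATA', 'GGAT']
```

Any kmer the first caller marks as used would simply be added to `used` and so drop out of the seed set.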

The novel kmer breakpoint caller should perform a breadth-first search (BFS) from each novel kmer to find the nearest kmer that occurs in the reference on either side. The shortest path between these two reference kmers should be reported as a putative breakpoint call. This caller will have a high false positive rate, but may catch a few reasonable SNPs / indels.
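The search could look something like the sketch below. This is only an illustration under stated assumptions, not the McCortex implementation: the graph is a plain set of sample kmers on one strand, edges are (k-1)-overlaps, and the names `bfs_to_ref`, `call_breakpoint` and the `max_depth` cut-off are hypothetical.

```python
from collections import deque

BASES = "ACGT"


def neighbours(kmer, graph, forward):
    """Yield kmers in `graph` that overlap `kmer` by k-1 bases in one direction."""
    for b in BASES:
        nxt = kmer[1:] + b if forward else b + kmer[:-1]
        if nxt in graph:
            yield nxt


def bfs_to_ref(seed, graph, ref_kmers, forward, max_depth=100):
    """Shortest path (seed first) from `seed` to the nearest reference kmer
    in one direction, or None if none is found within `max_depth` steps."""
    parent = {seed: None}
    queue = deque([(seed, 0)])
    while queue:
        kmer, depth = queue.popleft()
        if kmer in ref_kmers:
            path = []
            while kmer is not None:  # walk parent links back to the seed
                path.append(kmer)
                kmer = parent[kmer]
            return path[::-1]
        if depth == max_depth:
            continue
        for nxt in neighbours(kmer, graph, forward):
            if nxt not in parent:
                parent[nxt] = kmer
                queue.append((nxt, depth + 1))
    return None


def call_breakpoint(seed, graph, ref_kmers, max_depth=100):
    """Join the left and right searches into one sequence, or return None."""
    left = bfs_to_ref(seed, graph, ref_kmers, False, max_depth)
    right = bfs_to_ref(seed, graph, ref_kmers, True, max_depth)
    if left is None or right is None:
        return None
    path = left[::-1] + right[1:]  # ref kmer ... seed ... ref kmer
    return path[0] + "".join(k[-1] for k in path[1:])


# Toy data (hypothetical): one SNP between reference and sample, k = 4.
k = 4
ref_seq, sample_seq = "AACCGGTTACGTCA", "AACCGGATACGTCA"
ref_kmers = {ref_seq[i:i + k] for i in range(len(ref_seq) - k + 1)}
sample_kmers = {sample_seq[i:i + k] for i in range(len(sample_seq) - k + 1)}

print(call_breakpoint("GGAT", sample_kmers, ref_kmers))  # CCGGATACG
```

In this toy case the reported path spans the SNP and is flanked by one reference kmer on each side. With short kmers or repeats, BFS can take a shortcut through an unrelated reference kmer, which is one source of the false positives mentioned above.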