Adamtaranto / frisk

Screen genomic scaffolds for regions of unusual k-mer composition.
http://adamtaranto.github.io/frisk/
GNU General Public License v3.0

Add Self-genome Masking #24

Closed Adamtaranto closed 8 years ago

Adamtaranto commented 8 years ago

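Given a repeat-masked genome, learn k-mer abundances only from the unmasked regions.

Intended for training the 'self' model on genomes that are rich in non-self-like sequence, i.e. fungal genomes with high RIP (repeat-induced point mutation) or TE (transposable element) abundance.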

Adamtaranto commented 8 years ago

Added a maskHost argument. If set, lowercase (soft-masked) sequence is ignored when calculating the genome-wide total k-mer counts from the 'Host Sequence'.

K-mers containing masked characters are still assessed in querySeq windows, even if the query sequence is the same as the host sequence.
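
For illustration only, here is a minimal Python sketch of the behaviour described above. It is not frisk's actual code: the function name `count_kmers` and the `skip_masked` parameter are hypothetical, and it assumes that a k-mer is skipped if any of its bases is lowercase (the thread does not say whether partially masked k-mers are skipped or kept). Ambiguous bases such as N are not handled.

```python
from collections import Counter


def count_kmers(seq, k, skip_masked=False):
    """Count k-mers in seq (hypothetical helper, not part of frisk).

    If skip_masked is True, any k-mer containing a lowercase
    (soft-masked) base is ignored. This is an assumed interpretation
    of the maskHost behaviour.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if skip_masked and any(base.islower() for base in kmer):
            continue
        # Normalise case so masked and unmasked copies share one key.
        counts[kmer.upper()] += 1
    return counts


# Toy sequence with a soft-masked block in the middle.
host = "ACGTacgtACGT"

# Host ("self") counts learned from unmasked sequence only.
host_counts = count_kmers(host, 2, skip_masked=True)

# Query counts still include k-mers that overlap masked bases,
# even though the query here is the same sequence as the host.
query_counts = count_kmers(host, 2, skip_masked=False)
```

In this sketch the host counts omit every 2-mer that touches the lowercase block, while the query counts cover the whole sequence, which mirrors the split between host training and query assessment described in the comment.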

Need to test before closing.