Closed Adamtaranto closed 8 years ago
Added the maskHost argument. If set, lowercase (soft-masked) sequence is ignored when calculating genome-wide total k-mer counts from the 'Host Sequence'.
K-mers in querySeq windows that contain masked characters are still assessed, even if the query sequence is the same as the host sequence.
Need to test before closing.
Given a repeat-masked genome, learn k-mer abundance only from the unmasked regions.
Intended for training 'self' on genomes rich in non-self sequence, i.e. fungal genomes with high RIP/TE abundance.
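
A minimal sketch of the idea, not the tool's actual implementation: the function name `count_kmers` and the `mask_host` parameter are hypothetical, and the sketch assumes soft-masking (repeats in lowercase) and that any k-mer overlapping a masked base is skipped entirely.

```python
from collections import Counter


def count_kmers(seq, k, mask_host=False):
    """Count k-mers in seq.

    Hypothetical helper: with mask_host=True, any k-mer that contains a
    lowercase (soft-masked) base is skipped, so abundance is learned only
    from unmasked sequence. Otherwise case is ignored.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # str.isupper() is False as soon as one cased character is lowercase
        if mask_host and not kmer.isupper():
            continue
        counts[kmer.upper()] += 1
    return counts


host = "ACGTacgtACGT"  # toy host sequence with a masked repeat in the middle
masked_counts = count_kmers(host, 2, mask_host=True)
all_counts = count_kmers(host, 2)
```

In this toy example the masked run and the two k-mers spanning its boundaries drop out of `masked_counts`, while `all_counts` keeps every window. Query windows would be scored without the filter (here, by calling the helper with `mask_host=False`), matching the behaviour described for querySeq.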