OpenGene / UniqueKMER

Generate unique KMERs for every contig in a FASTA file
MIT License

[Question] Clarification of uniqueness #5

Closed b2jia closed 2 years ago

b2jia commented 2 years ago

Thank you so much for providing this wonderful tool!

If I search a region of, e.g., the human genome (chr1:1:50000) against the entire human reference genome, will UniqueKMER return the list of k-mers that map once and only once in the reference genome (that is, only within chr1:1:50000)?

or

does UniqueKMER keep only k-mers that map zero times against the reference genome?

I raise this issue to highlight the setting where the input FASTA and the reference genome are derived from the same species. I think this type of tool would be useful; as far as I know, Bloom-filter-based k-mer counters lack the sensitivity to reliably identify unique k-mers.

b2jia commented 2 years ago

I see, sorry I misread:

"but not presented in any other contigs (for both forward and reverse strands)."

This is clear.
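
For anyone landing here with the same question, the quoted definition can be illustrated with a small Python sketch. This is not UniqueKMER's actual implementation; it is a toy illustration under one reading of that sentence: a k-mer counts as unique to a contig if neither it nor its reverse complement occurs in any other contig. The function names (`reverse_complement`, `canonical`, `contig_specific_kmers`) and the toy contigs are hypothetical, and how the real tool treats k-mers repeated within a single contig or containing `N` is not stated in this thread.

```python
from collections import defaultdict

# Hypothetical helper names; this is an illustration, not UniqueKMER's code.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer):
    """Collapse a k-mer and its reverse complement into one representative,
    so that forward and reverse strands are treated as the same k-mer."""
    return min(kmer, reverse_complement(kmer))


def contig_specific_kmers(contigs, k):
    """Map each contig name to the canonical k-mers found in that contig
    and in no other contig (on either strand).

    Assumption: a k-mer repeated inside one contig still counts as unique
    to it, and k-mers containing non-ACGT characters are skipped.
    """
    owners = defaultdict(set)  # canonical k-mer -> names of contigs containing it
    for name, seq in contigs.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                owners[canonical(kmer)].add(name)

    result = {name: set() for name in contigs}
    for kmer, names in owners.items():
        if len(names) == 1:
            result[next(iter(names))].add(kmer)
    return result


# Toy input: the two contigs share TTGC and TGCA.
toy = {"contigA": "ACGTTGCA", "contigB": "TTGCAGGG"}
unique = contig_specific_kmers(toy, 4)
for name in toy:
    print(name, sorted(unique[name]))
```

Under this reading, a region such as chr1:1:50000 would have to be supplied as its own contig, with the rest of the genome as separate contigs, to get k-mers that occur only within that region. K-mers with zero hits elsewhere are kept and k-mers shared with any other contig are dropped.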