Add Count Thresholding

It is often useful to exclude low-abundance (likely erroneous) or high-abundance (repeat-associated) k-mers from a count table.

As a user, I'd expect a method called .min() to return all the k-mers with the minimum observed count, and .max() to return all the k-mers with the maximum observed count.

For thresholding at some cutoff value, maybe something like .mincut() and .maxcut()?

Suggested use:

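import oxli

table = oxli.KmerCountTable(3)
kmers = ["AAA", "GGG", "GGG"]

# Count each k-mer: AAA is seen once, GGG twice.
for kmer in kmers:
    table.count(kmer)

# Proposed: drop every k-mer counted fewer than 2 times.
table.mincut(2)
>> "Dropped 1 hash with fewer than 2 counts."

# AAA fell below the cutoff, so it now reads as absent.
table.get("AAA")
>> 0

# GGG met the cutoff and keeps its count.
table.get("GGG")
>> 2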
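
To pin down the semantics I have in mind, here is a rough pure-Python sketch. `ToyCountTable` is a hypothetical stand-in built on `collections.Counter`, not the oxli API, and the choices it makes are assumptions up for discussion: `mincut(t)` drops counts strictly below `t`, `maxcut(t)` drops counts strictly above `t`, both return the number of k-mers dropped instead of printing a message, and `.min()` / `.max()` return sorted lists.

```python
from collections import Counter


class ToyCountTable:
    """Hypothetical stand-in for a k-mer count table; NOT the oxli API."""

    def __init__(self, ksize):
        self.ksize = ksize
        self.counts = Counter()

    def count(self, kmer):
        """Increment the count for one k-mer and return its new count."""
        if len(kmer) != self.ksize:
            raise ValueError(f"expected a k-mer of length {self.ksize}")
        self.counts[kmer] += 1
        return self.counts[kmer]

    def get(self, kmer):
        """Return the stored count, or 0 if the k-mer is absent."""
        return self.counts.get(kmer, 0)

    def _drop(self, should_drop):
        # Collect keys first so the dict is not mutated while iterating.
        doomed = [k for k, c in self.counts.items() if should_drop(c)]
        for k in doomed:
            del self.counts[k]
        return len(doomed)

    def mincut(self, threshold):
        """Drop k-mers with count < threshold (assumed: strict)."""
        return self._drop(lambda c: c < threshold)

    def maxcut(self, threshold):
        """Drop k-mers with count > threshold (assumed: strict)."""
        return self._drop(lambda c: c > threshold)

    def min(self):
        """Return all k-mers sharing the minimum observed count."""
        if not self.counts:
            return []
        lowest = min(self.counts.values())
        return sorted(k for k, c in self.counts.items() if c == lowest)

    def max(self):
        """Return all k-mers sharing the maximum observed count."""
        if not self.counts:
            return []
        highest = max(self.counts.values())
        return sorted(k for k, c in self.counts.items() if c == highest)


table = ToyCountTable(3)
for kmer in ["AAA", "GGG", "GGG"]:
    table.count(kmer)

lowest_kmers = table.min()    # k-mers seen the fewest times
highest_kmers = table.max()   # k-mers seen the most times
dropped = table.mincut(2)     # number of k-mers removed by the cutoff
```

Returning the number of dropped k-mers keeps the method easy to test; the "Dropped 1 hash ..." message from the suggested use could be built from that return value.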

@ctb?

oxli-bio / oxli

Add Count Thresholding #18