oxli-bio / oxli

k-mers and the like
BSD 3-Clause "New" or "Revised" License

Rename allow_bad_kmers #51

Closed by Adamtaranto 3 weeks ago

Adamtaranto commented 3 weeks ago

Should we rename the allow_bad_kmers option in consume to make it more intuitive?

This option is passed to SeqToHashes as the force argument.

Behaviour seems to be:

True: skip over bad k-mers; do not hash them and do not raise an error.

False: hash/count all k-mers up to the first non-DNA character, then raise an error and stop processing the input sequence.
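To make the two behaviours concrete, here is a minimal pure-Python sketch of what I am describing. It is not oxli's implementation and does not call SeqToHashes: the function `consume_sketch`, its signature, the use of a `Counter` as the table, and the rule that only uppercase A, C, G, T are valid are all assumptions for illustration. The flag is named `skip_bad_kmers` to show how the behaviour reads under the proposed name.

```python
from collections import Counter

# Assumption: only uppercase A, C, G, T count as valid DNA characters.
VALID_BASES = frozenset("ACGT")


def consume_sketch(counts, seq, k, skip_bad_kmers=False):
    """Count the k-mers of seq into counts (hypothetical model, not oxli's API).

    skip_bad_kmers=True:  k-mers containing a non-DNA character are skipped
                          silently and processing continues.
    skip_bad_kmers=False: valid k-mers are counted until the first k-mer that
                          contains a non-DNA character, then ValueError is
                          raised; the counts recorded so far stay in counts.
    Returns the number of k-mers counted.
    """
    n_consumed = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if not VALID_BASES.issuperset(kmer):
            if skip_bad_kmers:
                continue  # skip this k-mer, keep going
            raise ValueError(f"bad k-mer {kmer!r} at position {i}")
        counts[kmer] += 1
        n_consumed += 1
    return n_consumed


# Skipping: the three k-mers overlapping 'N' are ignored, the rest are counted.
skipped = Counter()
n_skipped = consume_sketch(skipped, "ACGTNACGT", 3, skip_bad_kmers=True)

# Not skipping: counting stops with an error at the first k-mer containing 'N'.
strict = Counter()
try:
    consume_sketch(strict, "ACGTNACGT", 3, skip_bad_kmers=False)
    raised = False
except ValueError:
    raised = True
```

In this model the strict call leaves only the k-mers before the first non-DNA character in the table, which matches the False behaviour described above.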

I would expect allow_bad_kmers = True to hash all k-mers, including those with non-DNA characters, and False to skip the bad k-mers and carry on with processing.

I suggest renaming the option to skip_bad_kmers.

Adamtaranto commented 3 weeks ago

The current behaviour is at odds with the definition in the API notes.

@ctb can you clarify?

ctb commented 3 weeks ago

I like the suggestion to use skip_bad_kmers!