dnbaker / dashing2

Dashing 2 is a fast toolkit for k-mer and minimizer encoding, sketching, comparison, and indexing.
MIT License

Dev13 #17

Closed dnbaker closed 3 years ago

dnbaker commented 3 years ago
  1. Added a seed option (--seed), which changes the hash function used.
  2. Added filterset (--filterset seq.fa), which filters out k-mers found in the given file from all parsed reads. Use it to remove contamination or common sequences.
  3. Added support for compressed amino acid alphabets.
    • --protein20, --protein14, --protein8, and --protein6 enable protein parsing using the reduced alphabet of the given size.
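
To make the three changes above concrete, here is a minimal conceptual sketch in Python. It is not Dashing 2's implementation: the function names, the BLAKE2b-based hash, the exact-string (non-canonical) k-mer matching, and the 6-group amino acid reduction are all assumptions chosen for illustration, and the groupings Dashing 2 actually uses may differ.

```python
import hashlib

# Hypothetical 6-group reduction of the 20 standard amino acids.
# Illustrative only; Dashing 2's actual groupings may differ.
GROUPS6 = ["AGPST", "C", "DENQ", "FWY", "HKR", "ILMV"]
REDUCE6 = {aa: str(i) for i, group in enumerate(GROUPS6) for aa in group}


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def seeded_hash(kmer, seed):
    """Hash a k-mer to 64 bits; mixing in the seed selects a different hash function."""
    data = seed.to_bytes(8, "little") + kmer.encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def filtered_hashes(read, filter_seq, k, seed):
    """Hash the k-mers of read, skipping any k-mer that also occurs in filter_seq."""
    banned = set(kmers(filter_seq, k))
    return [seeded_hash(km, seed) for km in kmers(read, k) if km not in banned]


def reduce_protein(seq):
    """Map each residue to its group label under the hypothetical 6-letter alphabet."""
    return "".join(REDUCE6[aa] for aa in seq)
```

In this sketch, changing `seed` changes every hash value while leaving the set of retained k-mers the same, and reducing a protein sequence before k-mer extraction makes residues in the same group indistinguishable to the sketch.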