dnbaker / dashing2

Dashing 2 is a fast toolkit for k-mer and minimizer encoding, sketching, comparison, and indexing.
MIT License

D10 #48

Closed dnbaker closed 2 years ago

dnbaker commented 2 years ago
  1. Containment fixes
    1. --containment set the distance to SYMMETRIC_CONTAINMENT instead of CONTAINMENT.
    2. Union/intersection cardinality estimation was wrong due to a missing parenthesis. Only sketch-based methods were affected.
  2. Minimizer sequence transduction
    1. Add the alphabet to the serialized form of minimizer sequence transduction.
    2. Add parse_minimizer_sequence_set to the Python parsing code for reading these databases.
    3. Add a printmin subcommand, which formats minimizer sequence sets as human-readable output (FASTA or tabular).
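
To make the containment fixes in item 1 concrete, here is a minimal Python sketch of the quantities involved. It is not Dashing 2's code: the function names are hypothetical, the definitions are the commonly used ones (containment as intersection over the size of the first set, symmetric containment as intersection over the smaller set), and the exact expression that was missing a parenthesis is not shown in this PR description, so the grouping below only illustrates why it matters.

```python
import math


def intersection_union_from_jaccard(jaccard, card_a, card_b):
    """Recover |A ∩ B| and |A ∪ B| from a Jaccard estimate and set cardinalities.

    Uses |A ∪ B| = (|A| + |B|) / (1 + J) and |A ∩ B| = J * |A ∪ B|.
    The parentheses around (|A| + |B|) and (1 + J) are essential; dropping
    either grouping silently changes the result.
    """
    union = (card_a + card_b) / (1.0 + jaccard)
    intersection = jaccard * union
    return intersection, union


def containment(intersection, card_a):
    """Fraction of A that is also in B (asymmetric)."""
    return intersection / card_a if card_a else 0.0


def symmetric_containment(intersection, card_a, card_b):
    """Intersection divided by the smaller of the two sets (symmetric)."""
    smaller = min(card_a, card_b)
    return intersection / smaller if smaller else 0.0


# Exact toy sets stand in for sketches here (assumption for illustration).
a = set(range(0, 200))    # |A| = 200
b = set(range(150, 250))  # |B| = 100
true_jaccard = len(a & b) / len(a | b)

inter, union = intersection_union_from_jaccard(true_jaccard, len(a), len(b))
c_ab = containment(inter, len(a))
sym = symmetric_containment(inter, len(a), len(b))

print(f"intersection={inter:.1f} union={union:.1f}")
print(f"containment(A in B)={c_ab:.3f} symmetric containment={sym:.3f}")
```

When the two sets differ in size, the two measures disagree, which is why reporting the symmetric value under a containment flag changes the output.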