dahak-metagenomics / dahak

benchmarking and containerization of tools for analysis of complex non-clinical metagenomes.
https://dahak-metagenomics.github.io/dahak
BSD 3-Clause "New" or "Revised" License

Considerations for taxonomic classification - kmer sizes #30

Open brooksph opened 7 years ago

brooksph commented 7 years ago

Large k-mers help distinguish strain variants, so using k = 51 with sourmash or kraken may be the best way to go.
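To illustrate the reasoning, here is a minimal sketch of why a larger k separates closely related sequences more sharply. It is not how sourmash or kraken work internally: it compares exact sets of forward-strand k-mers rather than MinHash sketches or a classification database, and the random reference, the `make_strain` helper, and the one-substitution-every-200-bases variant model are all assumptions made up for the example. Each substitution changes every k-mer that overlaps it, so a larger k loses more shared k-mers per variant.

```python
import random


def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq (forward strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity between two k-mer sets."""
    return len(a & b) / len(a | b)


def make_strain(seq, spacing=200, offset=100):
    """Hypothetical variant model: one substitution every `spacing` bases."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}  # always changes the base
    bases = list(seq)
    for pos in range(offset, len(bases), spacing):
        bases[pos] = swap[bases[pos]]
    return "".join(bases)


# Synthetic 2 kb "reference" and a closely related "strain" (assumed data).
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(2000))
variant = make_strain(reference)

# Compare the two sequences at the three k values discussed in this thread.
similarity = {
    k: jaccard(kmers(reference, k), kmers(variant, k)) for k in (21, 31, 51)
}
for k, value in similarity.items():
    print(f"k={k}: Jaccard = {value:.3f}")
```

In this toy setup the similarity between the two "strains" drops as k grows, which is the sense in which larger k-mers are more strain-specific. The trade-off is that larger k is also more sensitive to sequencing errors and to real divergence, so matches to more distant relatives become harder to find.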

kternus commented 6 years ago

@brooksph It's good to provide a default recommendation for k, but I like that you included sourmash databases with k of 21, 31, and 51 in the dahak taxonomic classification workflow, because it's interesting to see how the choice of k alters the results.