hcdenbakker / sepia

taxonomic classifier based on the kraken2 algorithms and more
GNU General Public License v3.0

[feature request] Option to save (un)classified reads #30

Open telatin opened 2 years ago

telatin commented 2 years ago

Kraken2 offers --unclassified-out FILE and --classified-out FILE, which are handy in contamination removal steps. The same options would fit Sepia well, since its architecture could allow filtering multiple samples while loading the database only once.

hcdenbakker commented 2 years ago

Yes, that would be a nice feature! I have a read_filter subcommand that I am going to add to sepia that does something similar after a classification run. It is currently a standalone Rust application living on my server. It takes the sepia outputs and the original read files, lets you include or exclude taxa (or unclassified reads) from a dataset, and lets you set a threshold for the k-mer similarity of the included or excluded reads.
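For readers who want to see roughly what such a post-classification filter involves, here is a minimal Python sketch. It is not the read_filter code mentioned above. The per-read table layout (read ID, taxon, k-mer similarity, tab-separated), the "unclassified" label, and the function names are all assumptions made for illustration; Sepia's actual output format may differ.

```python
import io


def load_assignments(tsv_lines):
    """Parse hypothetical per-read lines: read_id <TAB> taxon <TAB> kmer_similarity."""
    assignments = {}
    for line in tsv_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        read_id, taxon, similarity = line.split("\t")[:3]
        assignments[read_id] = (taxon, float(similarity))
    return assignments


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def filter_reads(records, assignments, taxa, mode="exclude", min_similarity=0.0):
    """Include or exclude reads assigned to `taxa`.

    A read matches when its taxon is in `taxa` and its k-mer similarity is at
    least `min_similarity`. Reads missing from the table are treated as
    "unclassified" (an assumed label) and are not subject to the threshold.
    """
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    for header, seq, plus, qual in records:
        read_id = header[1:].split()[0]  # drop '@' and any description text
        taxon, similarity = assignments.get(read_id, ("unclassified", 0.0))
        matched = taxon in taxa and (
            taxon == "unclassified" or similarity >= min_similarity
        )
        keep = matched if mode == "include" else not matched
        if keep:
            yield header, seq, plus, qual


if __name__ == "__main__":
    # Made-up example data, not real Sepia output.
    table = [
        "r1\tEscherichia coli\t0.95",
        "r2\tHomo sapiens\t0.80",
        "r3\tHomo sapiens\t0.20",
    ]
    reads = io.StringIO(
        "@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n"
        "@r3\nTTAA\n+\nIIII\n@r4\nCCCC\n+\nIIII\n"
    )
    # Remove confidently assigned human reads (decontamination).
    for record in filter_reads(
        read_fastq(reads), load_assignments(table),
        {"Homo sapiens"}, mode="exclude", min_similarity=0.5,
    ):
        print("\n".join(record))
```

In this sketch the threshold applies only to the taxa being matched, so a low-similarity human read (r3 above) is kept when excluding human reads at a 0.5 cutoff. Whether a real tool should behave that way is a design choice, not something stated in this thread.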