Open telatin opened 2 years ago
Yes, that would be a nice feature! I have a `read_filter` subcommand that I am going to add to sepia; it does something similar after a classification run. It is currently a standalone Rust application living on my server. It takes the sepia outputs and the original read files and lets you include or exclude taxa (or unclassified reads) from a dataset and set a threshold on the k-mer similarity of the included or excluded reads.
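To make the include/exclude logic concrete, here is a minimal sketch of how such a filter could decide which reads to keep. It is not the actual `read_filter` code: the three-column tab-separated input (read ID, taxon, k-mer similarity), the `unclassified` label, and every type and function name below are assumptions made up for illustration.

```rust
use std::collections::HashSet;

/// One classified read. These fields are assumptions for this sketch,
/// not sepia's real output schema.
#[derive(Debug, Clone, PartialEq)]
pub struct Classification {
    pub read_id: String,
    /// `None` marks an unclassified read.
    pub taxon: Option<String>,
    /// Fraction of k-mers supporting the assignment, from 0.0 to 1.0.
    pub kmer_similarity: f64,
}

/// Whether selected reads are kept (`Include`) or dropped (`Exclude`).
pub enum Mode {
    Include,
    Exclude,
}

/// Hypothetical filter settings: a taxa list, an unclassified switch,
/// and a minimum k-mer similarity.
pub struct ReadFilter {
    pub mode: Mode,
    pub taxa: HashSet<String>,
    pub match_unclassified: bool,
    pub min_similarity: f64,
}

impl ReadFilter {
    /// A classified read is selected when its taxon is listed and its
    /// similarity reaches the threshold; an unclassified read is selected
    /// only when `match_unclassified` is set.
    fn matches(&self, c: &Classification) -> bool {
        match &c.taxon {
            Some(t) => self.taxa.contains(t) && c.kmer_similarity >= self.min_similarity,
            None => self.match_unclassified,
        }
    }

    /// Include mode keeps selected reads; exclude mode keeps the rest.
    pub fn keep(&self, c: &Classification) -> bool {
        match self.mode {
            Mode::Include => self.matches(c),
            Mode::Exclude => !self.matches(c),
        }
    }
}

/// Parse one line of the assumed `read_id<TAB>taxon<TAB>similarity` format.
/// Malformed or empty lines yield `None`.
pub fn parse_line(line: &str) -> Option<Classification> {
    let mut fields = line.trim_end().split('\t');
    let read_id = fields.next()?.to_string();
    let taxon = match fields.next()? {
        "unclassified" => None,
        t => Some(t.to_string()),
    };
    let kmer_similarity = fields.next()?.parse().ok()?;
    Some(Classification { read_id, taxon, kmer_similarity })
}

/// Collect the IDs of reads that survive the filter. These IDs could then
/// be used to select records from the original read files.
pub fn kept_read_ids(filter: &ReadFilter, report: &str) -> HashSet<String> {
    report
        .lines()
        .filter_map(parse_line)
        .filter(|c| filter.keep(c))
        .map(|c| c.read_id)
        .collect()
}
```

In this sketch, excluding a taxon with `min_similarity` set to 0.5 drops only reads assigned to that taxon with at least 50% k-mer similarity; weaker assignments and unclassified reads are kept. Whether the threshold should behave this way in exclude mode is a design choice the real subcommand may make differently.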
Kraken2 allows for `--unclassified-out FILE` (and `--classified-out FILE`), which can be handy in contamination removal steps. A similar option could benefit from the architecture of Sepia, which could allow filtering multiple samples while loading the database only once.