Open dcm9123 opened 4 years ago
As far as I know, that's not possible with a specific Kraken command, but you can remove contaminants either downstream or upstream of your analysis. You could look at decontam, an R package that you can incorporate into your pipeline: https://github.com/benjjneb/decontam
Currently, no, we do not have a script for that. Kraken-related scripts can be found at https://github.com/jenniferlu717/KrakenTools.
The extract_kraken_reads.py script lets you modify the samples by removing sequences (--exclude) that match a given set of taxonomy IDs, optionally together with their children (--include-children) and/or parents (--include-parents).
If you know the taxonomy IDs classified in the water control, you can provide them to that script.
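To illustrate how the taxonomy IDs could be collected from the water control, here is a minimal Python sketch. It assumes the standard six-column, tab-separated Kraken2 report (percentage, clade reads, directly assigned reads, rank code, taxonomy ID, name). The function name `control_taxids`, the `min_reads` threshold, and the toy report are hypothetical and not part of KrakenTools.

```python
def control_taxids(report_text, min_reads=1, skip=("0", "1")):
    """Return taxonomy IDs with at least `min_reads` directly assigned reads.

    Assumes the standard 6-column Kraken2 report layout. Taxid 0
    (unclassified) and 1 (root) are skipped by default.
    """
    taxids = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # not a standard report line
        direct_reads = int(fields[2])  # reads assigned directly to this taxon
        taxid = fields[4].strip()
        if direct_reads >= min_reads and taxid not in skip:
            taxids.append(taxid)
    return taxids


# Toy water-control report (made-up numbers, for illustration only).
toy_report = "\n".join([
    " 50.00\t50\t50\tU\t0\tunclassified",
    " 50.00\t50\t0\tR\t1\troot",
    " 40.00\t40\t5\tS\t9606\t      Homo sapiens",
    " 10.00\t10\t10\tS\t562\t      Escherichia coli",
])

ids = control_taxids(toy_report)
print(" ".join(ids))  # space-separated list to pass to -t
```

The printed IDs could then be passed to the script, for example `extract_kraken_reads.py -k sample.kraken -s sample.fastq -o filtered.fastq -t 9606 562 --exclude --fastq-output`, where the file names are placeholders. Keep in mind that this removes every read assigned to those taxa in the sample, including reads that are not contamination.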
Otherwise, Pavian (a visualization tool, https://github.com/fbreitwieser/pavian) can be used to compare the samples, and you can then subtract the water control reads from the other sample reads. Let me know if you have any questions.
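As a rough illustration of what subtracting the water control could look like at the level of per-taxon read counts, here is a naive Python sketch. It is not Pavian's method. The function name `subtract_control` and all counts are hypothetical, and the sketch assumes the counts have already been extracted from the reports and does not normalize for sequencing depth.

```python
from collections import Counter


def subtract_control(sample_counts, control_counts):
    """Subtract control read counts from sample read counts, per taxonomy ID.

    Counter subtraction keeps only taxa whose count stays positive, so taxa
    that are at least as abundant in the control drop out entirely.
    """
    return dict(Counter(sample_counts) - Counter(control_counts))


# Made-up counts keyed by taxonomy ID, for illustration only.
sample = {"562": 100, "9606": 30, "1280": 5}
control = {"9606": 50, "1280": 2}

print(subtract_control(sample, control))
```

Because the water control usually has far fewer reads than a clinical sample, raw subtraction like this is only a first look; a statistical approach such as decontam is likely more appropriate for real analyses.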
Is there a plan to also output the k-mer index (or a unique ID) in the Kraken results (*.out file), so that k-mers present in the water control can be excluded from the real samples?
Hello!
I am relatively new to metagenomics and I've been using Kraken2 for my analysis. I was wondering whether Kraken2 has some way of removing DNA reads that do not belong to the sample being analyzed. For instance, I've been analyzing pulmonary tract samples from patients (n=6) and a water control. In my water control I found a relatively low number of bacterial, archaeal, and viral reads, and a lot of human reads. Is there any way for Kraken2 to eliminate the water control reads from the clinical samples? I am guessing that the water control (WC) reads come from lab contamination and from handling of the samples and materials.
Thanks in advance,
Daniel