DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

kraken2 assigning exactly the same taxonomy at low magnitudes to very different samples #806

Open ctuni opened 4 months ago

ctuni commented 4 months ago

We've been running kraken2 on our RNA-Seq samples to check for contamination for about a year.

At first we were surprised to find some contamination that we considered "exotic" for plasma RNA samples. This is a screengrab of a Krona plot made from kraken2 results last year, at LAB A:

[image]

As you can see, the magnitude is very low, but we were still surprised to find these exotic contaminants in a sample that comes from plasma.

We thought this was contamination because we were sharing the lab with a zoology group.

We've since moved labs and processed newer samples this year in LAB B, which is not shared with other groups, and we found this:

[image]

The proportions are not the same, but the distribution of the species found is.

This repeats across all of our samples, which were processed at different times and in different places, but with the same kraken2 version and the same database, in a Docker container.

Could this be an artifact? If it is, we'd like some insight into why this happens so we know what to check and look for in the reports. Thank you!
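One way to quantify "the same taxa show up everywhere" is to list the low-abundance species that occur in every report. The following is a minimal sketch, not part of kraken2 itself. It assumes the standard six-column tab-separated `--report` layout (percentage, clade reads, direct reads, rank code, NCBI taxid, indented name) without the extra minimizer columns. The function names and the 1% cutoff are hypothetical choices made for illustration.

```python
from collections import Counter


def low_abundance_taxa(report_text, max_percent=1.0, rank="S"):
    """Return {taxid: (name, percent)} for taxa of the given rank code
    whose clade percentage is above 0 and at most max_percent.

    Assumes the standard six-column kraken2 report layout.
    """
    taxa = {}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent = float(fields[0])
        rank_code, taxid, name = fields[3], fields[4], fields[5].strip()
        if rank_code == rank and 0 < percent <= max_percent:
            taxa[taxid] = (name, percent)
    return taxa


def recurring_taxa(reports, **kwargs):
    """Sorted names of low-abundance taxa that appear in every report."""
    per_sample = [low_abundance_taxa(text, **kwargs) for text in reports]
    counts = Counter(taxid for sample in per_sample for taxid in sample)
    return sorted(
        per_sample[0][taxid][0]
        for taxid, n in counts.items()
        if n == len(reports)
    )
```

To use it, read each report file into a string and pass the list of strings to `recurring_taxa`. A long list of species shared by samples prepared in different labs would point toward something common to the analysis (for example the database) rather than toward the bench, though this check alone cannot tell the two apart.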

jenniferlu717 commented 3 months ago

What database are you using? And what read lengths?