Open luzhang321 opened 12 months ago
The Curtobacterium flaccumfaciens genome is contaminated. The plasmid showing up in the BLAST results is actually pure mouse DNA. It should not be in the database, but I'm checking the standard database.
Hi, thanks very much for the quick reply. How can we know this is a contamination?
I checked my downloaded plasmid database: the sequence is in the downloaded .fna file of the plasmid library.
kraken2-build --download-library plasmid --db refseq --threads 8
grep -A 3 'CP045290.2' library.fna
NZ_CP045290.2 Curtobacterium flaccumfaciens pv. flaccumfaciens strain P990 plasmid pCff3, complete sequence
TTTTCCGCGGATTTCAAGCCTTTATTGCTATATTACAGGTCCCAGTGTGCATTTGACTTC
GCATGCTTTTAGTGATTTCATGATTAAGTCAGTTATGATGTTTATTTATGACATTTTCAG
I also checked my previously downloaded standard database (Sep 2022); it is included there as well.
grep CP045290.2 seqid2taxid.map
kraken:taxid|138532|NZ_CP045290.2 138532
I also checked the standard database (Sep 2023); it is included there too.
grep CP045290.2 seqid2taxid.map
kraken:taxid|138532|NZ_CP045290.2 138532
NZ_CP045290.2 138532
It must have made it through NCBI's checks and ended up in the database. We only use complete genomes, assuming that they are not contaminated. It would need to be reported to NCBI.
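Until the record is corrected upstream, one possible workaround is to drop the contaminated sequence from the library FASTA before building. The following is only a minimal sketch: `drop_fasta_record` is a hypothetical helper (not part of kraken2), the file paths in the usage comment are assumptions about the local database layout, and it matches the accession as a plain substring of the header line.

```shell
# Hypothetical helper (not part of kraken2): print a FASTA file without
# the records whose header line contains the given accession string.
drop_fasta_record() {
    accession="$1"
    fasta="$2"
    awk -v acc="$accession" '
        /^>/ { skip = (index($0, acc) > 0) }  # re-decide at every header line
        !skip                                 # print lines of records we keep
    ' "$fasta"
}

# Example usage (paths are assumptions, adjust to your database directory):
# drop_fasta_record 'CP045290.2' refseq/library/plasmid/library.fna > library.filtered.fna
```

The database would then need to be rebuilt from the filtered library for the change to take effect.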
Hi :) I ran kraken2 2.1.2 for shotgun metagenomic analysis of my mouse gut microbiome. I built a custom db by using the following command
I ran the classification and found that a large percentage of my reads (>80%) were classified as either mouse or a single bacterial species, Curtobacterium flaccumfaciens. (I had removed host reads with bowtie2 before running kraken2.) Since kraken2 outputs the classified reads, I randomly selected some reads and ran BLAST against the nr database. This is a read classified by kraken2 as mouse. This is a read classified by kraken2 as Curtobacterium flaccumfaciens.
From the BLAST results, there seems to be some overlap between the two. Does this mean that the reads assigned to the bacterium actually come from the mouse genome, or that kraken2 can't distinguish these two genomes because of sequence similarity? Do you have any suggestions for interpreting this result?
Example reads used for BLAST:
@A00551:529:H5JGVDSX5:4:1101:9372:1125 kraken:taxid|10090
AATATGGCGAAGAAAACTGAAAAAGGTGGAATATTTAGAAATGTCCACTGTAGGACGTGGAATATGGCAAGAAAACTGAAAATCATGGAAAATGAGAAACATCCACATGACGACTTGAAAAATGACGAAATCACTAAAATACGTGAAAAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFF
@A00551:529:H5JGVDSX5:4:1101:23547:1125 kraken:taxid|138532
TGAGAAACATCCACTTGACGACTTGAAAAATGACGAAATCACTAAAAAACCTGAAAAATGAGAAATGCACACTGAAGGACCTGGAATATGGCGAGAAAACTGAAAATCACGGAAAATGAGAAAAGATCGGAAGAGCACACGTCTGAACTC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFF:FFFFFFFFFFFFFFFFFFFF
Thanks in advance!
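For anyone who wants to repeat the read selection described above, here is a minimal sketch of pulling the reads assigned to one taxid out of the classified-reads FASTQ. `reads_for_taxid` is a hypothetical helper, the file name `classified.fq` is an assumption, and the sketch assumes plain 4-line FASTQ records whose header ends in a `kraken:taxid|N` tag, as in the example reads in this post.

```shell
# Hypothetical helper: print the 4-line FASTQ records whose header ends
# with the tag kraken:taxid|<taxid>. Assumes unwrapped 4-line records.
reads_for_taxid() {
    taxid="$1"
    fastq="$2"
    awk -v tag="kraken:taxid|$taxid" '
        NR % 4 == 1 {                # header line of each record
            n = split($0, f, " ")
            keep = (f[n] == tag)     # exact match on the last header field
        }
        keep                         # print all four lines of kept records
    ' "$fastq"
}

# Example usage (file name is an assumption): first 10 reads assigned to 138532
# reads_for_taxid 138532 classified.fq | head -n 40
```

Matching the whole last field, rather than a substring, avoids picking up other taxids that merely start with the same digits.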