nalandaatmi closed this 9 years ago
Hi, generally when Kraken leaves a read without a classification, that's because the database contains no sequence with close enough homology to produce a match. It would not surprise me if the Naive Bayes method used by FCP (I believe that's what metAMOS would use here) is slightly more sensitive than Kraken. However, such a large jump in classification percentage between Kraken and FCP makes me suspect that FCP is overpredicting (i.e., lacking precision) on this dataset. Unfortunately, I can't state anything conclusively without looking through the data in detail.
Dear Derrick,
Query regarding annotation: My metagenomic forward and reverse fastq files have 20 million reads. After removing plant-like reads from my input fastq files using the fastq_screen pipeline, I had 4 million reads left. I then provided this fastq file (4 million reads) as input to the metAMOS pipeline. The FCP option annotated those reads, but neither the custom Kraken databases nor minikraken annotated them as expected. What could be the reason?
However, for the initial fastq files (with 20 million reads), the custom Kraken DB based on the nt database annotated correctly.
I tried four different databases with the metAMOS pipeline:
1) Using the minikraken database (DB size 4.5GB), for these 4 million reads. I received an output with no hits in annotation.
2) Using a custom Kraken database (Bacterial, Viral, Archaeal, Fungal) (DB size 105GB), for these 4 million reads.
3) Using a custom Kraken database (nt database from NCBI) (DB size 604GB), for these 4 million reads.
4) Using the FCP database, for these 4 million reads.
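To put a number on how each Kraken database performed on the 4 million reads, one option is to tally the classified and unclassified reads in each raw Kraken output file. The sketch below is only an illustration: it assumes the standard tab-delimited Kraken output, where the first column is `C` (classified) or `U` (unclassified), and the function name `classified_fraction` and the sample records are hypothetical, not part of Kraken or metAMOS.

```python
import io
from collections import Counter


def classified_fraction(lines):
    """Return the fraction of reads marked 'C' in Kraken-style output lines.

    Assumes each non-empty line is tab-delimited with 'C' or 'U' in the
    first column (hypothetical helper, not part of Kraken or metAMOS).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        status = line.split("\t", 1)[0]
        counts[status] += 1
    total = counts["C"] + counts["U"]
    return counts["C"] / total if total else 0.0


# Made-up records for illustration only; a real run would pass an open file.
sample = io.StringIO(
    "C\tread1\t562\t100\t562:70\n"
    "U\tread2\t0\t100\t0:70\n"
    "U\tread3\t0\t100\t0:70\n"
    "C\tread4\t1280\t100\t1280:70\n"
)
print(f"{classified_fraction(sample):.1%} classified")
```

Running this on the output of each database (minikraken, the two custom databases) would give per-database percentages that can be compared directly against the fraction FCP annotated.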