Closed: nalandaatmi closed this issue 8 years ago
Dear Sergey/Treangen,
Have you had a chance to look into this issue?
If you're not already, I'd suggest using the -u (annotate unassembled reads) option for runPipeline. This will significantly slow down FCP but Kraken should be OK. By default, only contigs are annotated but in your case more than 50% of the reads cannot be mapped to the assembled contigs (1.8M raw reads in the plots vs 3.5M input).
Kraken has a post-filtering step which only keeps sequences it could assign with sufficient confidence. This step could be filtering out your hits. You can turn off the filtering by editing Utilities/config/kraken.spec and setting the 0.05 filter to 0.00. However, this will increase the false-positive rate of the classifications. FCP does not have a similar filter, I believe, which is why it provides more classifications. If you have more specific questions about the classifiers, I'd suggest contacting the developers for more info.
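To illustrate the trade-off, here is a minimal sketch of what a confidence cutoff does to a set of classifications. It is not Kraken's actual filtering code and does not reflect the kraken.spec format; the function name, read IDs, taxa, and scores are all made up for illustration. Only the 0.05 and 0.00 thresholds come from the discussion above.

```python
def filter_by_confidence(assignments, threshold):
    """Keep only (read_id, taxon, score) entries whose score is >= threshold.

    Simplified stand-in for a confidence filter; the real classifier
    computes and applies its scores differently.
    """
    return [a for a in assignments if a[2] >= threshold]


# Hypothetical classifications: (read ID, assigned taxon, confidence score).
assignments = [
    ("read1", "Escherichia coli", 0.40),
    ("read2", "Bacillus subtilis", 0.03),
    ("read3", "Pseudomonas putida", 0.05),
    ("read4", "Zea mays", 0.01),
]

kept_default = filter_by_confidence(assignments, 0.05)     # filter on
kept_unfiltered = filter_by_confidence(assignments, 0.00)  # filter off

print(len(kept_default), "reads kept at threshold 0.05")
print(len(kept_unfiltered), "reads kept at threshold 0.00")
```

With the cutoff at 0.00 every assignment survives, including the low-scoring ones, which is where the extra false positives would come from.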
I had similar results to those above. I tried using the -u option and my results did not change. I am using metAMOS on a local supercomputer and am unable to change any of the metAMOS files. Are there any other suggestions for this problem?
Unfortunately, no. You have to either edit the Utilities/config/kraken.spec file to change the confidence setting for Kraken, or use a custom database.
If you have other Kraken-specific questions, I'd suggest checking the Kraken support group.
Dear Sergey/Treangen,
Query regarding annotation: My metagenomics forward and reverse fastq files have 20 million reads. After removing plant-like reads from my input fastq files using the fastq_screen pipeline, I had 4 million reads. I then provided this fastq file (4 million reads) as input to the metAMOS pipeline. The FCP option annotated those reads, but neither the custom Kraken databases nor minikraken annotated them as expected. Can you comment on this issue?
I tried four different databases with the metAMOS pipeline:
1) Using the minikraken database (DB size 4.5GB), for these 4 million reads I received an output with no hits in annotation.
2) Using a custom Kraken database (Bacterial, Viral, Archaeal, Fungal) (DB size 105GB), for the same input fastq file as above.
3) Using a custom Kraken database (nt database from NCBI) (DB size 604GB), for the same input fastq file as above.
4) Using the FCP database, for the same input fastq file as above.