DaehwanKimLab / centrifuge

Classifier for metagenomic sequences
GNU General Public License v3.0

Issues with Centrifuge Output for Functional Annotation #282

Open PoojaThummar opened 1 month ago

PoojaThummar commented 1 month ago

Hello,

I've run Centrifuge on my data, and it generated two output files: classification.txt and report.tsv. I then created a kreport.tsv file from the classification.txt file. However, I've noticed a couple of issues:

Data Discrepancy: The kreport.tsv file contains fewer reads than my raw input, which raises concerns about the accuracy and completeness of the taxonomic assignments.

Missing Sequence IDs: The taxon information in the output files does not include sequence IDs, which prevents me from generating a repseq.fa file necessary for further functional annotation.

Could you please provide guidance on how to include sequence IDs in the output and how to proceed with generating the necessary files for functional annotation?

Thank you for your assistance.

mourisl commented 1 month ago

I think if you need sequence-ID-level classification results, you can extract them directly from the classification file. However, many reads hit multiple sequence IDs, so Centrifuge promotes those results to the LCA (lowest common ancestor) level. If you need more sequence-level classifications, you can increase the value of the "-k" option.
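As a rough illustration of extracting sequence-level results from the classification file, the sketch below groups read IDs by the sequence ID they were assigned to. It assumes the default tab-separated Centrifuge output with a header row containing readID, seqID and taxID columns, and it assumes unclassified reads carry taxID 0. The function name `hits_by_sequence` is hypothetical and is not part of Centrifuge.

```python
import csv
from collections import defaultdict


def hits_by_sequence(lines):
    """Map each seqID to the set of read IDs reported against it.

    `lines` is any iterable of text lines, for example an open
    classification.txt file handle.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    hits = defaultdict(set)
    for row in reader:
        # Assumption: unclassified reads are reported with taxID 0.
        if row["taxID"] == "0":
            continue
        # A read can appear on several rows (one per reported hit),
        # so a set keeps each read once per sequence.
        hits[row["seqID"]].add(row["readID"])
    return dict(hits)
```

With a real file this could be called as `hits_by_sequence(open("classification.txt"))`. Rows whose seqID is not a reference accession (for instance, reads that were promoted to a higher rank) would probably need to be filtered out as well; how those rows look is worth checking in your own output.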

What do you mean by functional annotation?

PoojaThummar commented 1 month ago

Thank you for your response. I am currently working on generating a repseq.fasta file and a .biom file for functional annotation, specifically to represent pathways with EC numbers using PICRUSt2. However, I'm encountering difficulties in creating these files.

Could you please provide guidance on how to generate a representative sequence file (repseq.fasta) from my data and a .biom file that is compatible with PICRUSt2, which requires both of these inputs? Any advice or step-by-step instructions on how to proceed would be incredibly helpful.

Thank you for your assistance!
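One possible starting point for the table side of this request is a per-sequence read count for a single sample, written as a tab-delimited table that could then be converted with the biom-format tools. This is only a sketch under stated assumptions: it assumes the default Centrifuge classification columns (readID, seqID, taxID), assumes unclassified reads have taxID 0, and uses the conventional `#OTU ID` header of classic tab-delimited feature tables. The function names and the sample name are hypothetical. Whether sequence IDs from a shotgun classification are suitable input for PICRUSt2, which is built around marker-gene sequences, is a separate question this sketch does not answer.

```python
import csv
from collections import Counter


def count_reads_per_sequence(lines):
    """Count distinct reads reported against each seqID."""
    reader = csv.DictReader(lines, delimiter="\t")
    seen = set()
    counts = Counter()
    for row in reader:
        key = (row["readID"], row["seqID"])
        # Skip unclassified reads (assumed taxID 0) and repeated rows.
        if row["taxID"] == "0" or key in seen:
            continue
        seen.add(key)
        counts[row["seqID"]] += 1
    return counts


def write_feature_table(counts, sample_name, handle):
    """Write counts as a one-sample tab-delimited feature table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["#OTU ID", sample_name])
    for seq_id in sorted(counts):
        writer.writerow([seq_id, counts[seq_id]])
```

A read that hits several sequences is counted once for each of them here, so column totals can exceed the number of reads; weighting each hit by the number of matches would be one alternative. The IDs in the first column would also need to match the FASTA headers of whatever representative sequence file is used.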