Create_Sample_TaxID_Directories creates directories only for potential pathogens: it reads the taxID.pathogens file instead of a file listing every taxID in krakenuniq.output.filtered.
Explanation of the problem:
aMeta already runs all the heavy steps on all organisms, so I think it is suboptimal to generate the final graphs only for potential pathogens. The Malt step is what takes most of the memory and time, not the steps that follow it. We decided to keep all organisms for the Malt alignment to reduce misalignments, which makes sense, but we should then keep using all organisms for the final authentication steps as well. That would allow the pipeline to be used for any metagenomic analysis, not just pathogen analysis. There is a lot of demand for general metagenomics, and I told people that aMeta could be used for all organisms because that is how I had used it with bash scripts before. Only when I looked closely at the rules did I notice this problem. It needs to be fixed quickly because people are already testing the pipeline for general metagenomics.
Suggestion for fixing it:
It's okay to continue creating the file krakenuniq.output.pathogens.
However, the Filter_KrakenUniq_output step should also create a file called taxID.organisms containing the taxIDs of all organisms present in krakenuniq.output.filtered.
The checkpoint Create_Sample_TaxID_Directories should then create one directory per organism for each sample, using this taxID.organisms file instead of the taxID.pathogens file.
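To make the suggestion concrete, here is a rough Python sketch of the two pieces: writing taxID.organisms from krakenuniq.output.filtered, and creating one directory per taxID for a sample. It is only an illustration, not the current aMeta code. The function names write_taxid_list and create_taxid_directories are hypothetical, and I am assuming that the filtered file is tab-separated with a header row containing a column named taxID, and that the directories sit directly under a per-sample folder. The real rule and checkpoint may be laid out differently.

```python
import csv
from pathlib import Path


def write_taxid_list(filtered_path, taxid_path, column="taxID"):
    """Write every unique taxID found in the filtered KrakenUniq table.

    Assumption: the table is tab-separated and has a header row with a
    column called `column`. Order of first appearance is preserved.
    """
    taxids = {}
    with open(filtered_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            taxid = row[column].strip()
            if taxid:
                taxids[taxid] = None  # dict keys keep insertion order
    Path(taxid_path).write_text("".join(f"{t}\n" for t in taxids))
    return list(taxids)


def create_taxid_directories(taxid_path, sample_dir):
    """Create one sub-directory per taxID listed in `taxid_path`.

    Assumption: directories go directly under `sample_dir`; this layout
    is hypothetical and only stands in for what the checkpoint does.
    """
    created = []
    for line in Path(taxid_path).read_text().splitlines():
        taxid = line.strip()
        if taxid:
            target = Path(sample_dir) / taxid
            target.mkdir(parents=True, exist_ok=True)
            created.append(target)
    return created
```

With something like this, Filter_KrakenUniq_output would call the first function on krakenuniq.output.filtered in addition to what it already does for the pathogens, and the checkpoint would read taxID.organisms where it currently reads taxID.pathogens. Nothing downstream of the checkpoint should need to change, since it only sees a longer list of directories.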