The Problem
The program currently used for pathway analysis, gprofiler, cannot handle bacteria in general and applies only to isolates, i.e. a single species, as opposed to metatranscriptomics.
Community composition and pathway analysis for bacterial communities, required: pre-processed reads
Required changes to workflow
Addition of software (trivial)
Achieve maximal flexibility: all parameters need to be optional; the only exception might be --metadata.
New inputs for metatranscriptome samples (feature/pathway abundance):
-- Pre-processed (ideally rRNA-depleted) reads, e.g. from nf-core/rnaseq v1.4+ (with parameters --remove_rRNA & --save_nonrRNA_reads), required for meta-pathway analysis
-- Optional: databases (nucleotide, protein & utilities), default: automated download
New inputs for paired metatranscriptome - metagenome samples (feature/pathway expression):
-- Pre-processed metagenomics reads e.g. from nf-core/rnaseq v1.4+
-- Either a manifest file to link samples, or the same sample names in different folders
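The two ways of linking metatranscriptome and metagenome samples mentioned in the list above could look roughly like the following Python sketch. It is only an illustration, not part of the workflow: the function names, the tab-separated manifest layout with columns `sample`, `rna_reads` and `dna_reads`, and the `.fastq.gz` suffix are all assumptions.

```python
import csv
from pathlib import Path


def pairs_from_manifest(manifest_path):
    """Link samples via a manifest file.

    Assumes a tab-separated file with the hypothetical columns
    'sample', 'rna_reads' and 'dna_reads'.
    """
    pairs = {}
    with open(manifest_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            pairs[row["sample"]] = (Path(row["rna_reads"]), Path(row["dna_reads"]))
    return pairs


def pairs_from_folders(rna_dir, dna_dir, suffix=".fastq.gz"):
    """Link samples that share the same name in two different folders."""

    def by_name(folder):
        # Map sample name (file name without the suffix) to its path.
        return {p.name[: -len(suffix)]: p for p in Path(folder).glob(f"*{suffix}")}

    rna, dna = by_name(rna_dir), by_name(dna_dir)
    # Samples present in only one of the two folders cannot be paired.
    unpaired = sorted(set(rna) ^ set(dna))
    if unpaired:
        raise ValueError(f"samples without a partner: {unpaired}")
    return {name: (rna[name], dna[name]) for name in sorted(rna)}
```

Failing loudly on unpaired samples is a design choice of this sketch; a real implementation might instead run unpaired metatranscriptome samples through the abundance-only analysis.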
Conclusion
This would be a major increase in code / parameters and output.
Pathway abundance (only metatranscriptome) would be the first step to implement, followed by addition of pathway expression analysis (RNA & DNA measures).
edit: added section "three independent analysis"
edit2: nf-core/rnaseq v1.4 pre-processing is only valid for environmental samples! For host-microbiome studies, the host sequences have to be removed too!
The Solution
All this software fits into the existing container without conflicts.
At least three independent analyses could be possible:
--rawcounts
--rawcounts, --species