Supervisor: Bérénice Batut
For degree: Master project
Status: Open
Keywords: Galaxy, Tool, Workflow, Metagenomics
Global Biological/Research context
A microbiome is the collection of all microbes, such as bacteria, fungi, and viruses, along with their genes, that live inside and on our bodies and in all the environments surrounding us [1]. To investigate microbiomes, researchers use sequencing data and microbiome analyses [2]. Such analyses rely on sophisticated computational approaches: assembly, binning, taxonomic classification, functional profiling, etc. Analysing microbiome data makes it possible to answer the two main questions of most microbiome analyses:
Who (which microorganisms) is there: by extracting the community from the microbiome reads
What are they doing (and how): by extracting the gene/pathway abundance profiles from the metagenomics reads and the transcript abundance profiles from the metatranscriptomics reads, and combining them
These analyses rely on bioinformatics tools and databases [3,4]. Few workflows [5,6,7] are available to process these data, and most are not openly available, not transparent, or not easy for researchers to use. To tackle this problem, the Freiburg Galaxy team, together with the microGalaxy community, uses Galaxy [8] to build workflows for analysing microbiome sequencing data.
Project context
MGnify offers an automated pipeline for the analysis and archiving of microbiome data to help determine the taxonomic diversity and functional & metabolic potential of environmental samples.
Although it is documented, the pipeline is not really usable outside MGnify's own resources. We would like to offer this pipeline to Galaxy users.
Objectives of the project
Run the MGnify pipeline
Integrate the tools in Galaxy and build the pipeline connecting the tools
Benchmark the Galaxy pipeline against the original one
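One way the benchmark objective could be approached is sketched below. It assumes that the taxonomic output of each pipeline has already been reduced to a mapping from taxon name to read count. The function names, the choice of Bray-Curtis dissimilarity as the comparison metric, and the example counts are illustrative assumptions and do not come from MGnify or Galaxy.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("profile has no counts")
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity: 0 for identical profiles, 1 for disjoint ones."""
    taxa = set(profile_a) | set(profile_b)
    shared = sum(min(profile_a.get(t, 0.0), profile_b.get(t, 0.0)) for t in taxa)
    total = sum(profile_a.values()) + sum(profile_b.values())
    return 1.0 - 2.0 * shared / total


def compare_profiles(original_counts, galaxy_counts):
    """Summarise how two taxonomic profiles differ."""
    a = relative_abundance(original_counts)
    b = relative_abundance(galaxy_counts)
    return {
        "only_in_original": sorted(set(a) - set(b)),
        "only_in_galaxy": sorted(set(b) - set(a)),
        "bray_curtis": bray_curtis(a, b),
    }


# Made-up counts, for illustration only.
original = {"Bacteroides": 50, "Escherichia": 30, "Prevotella": 20}
galaxy = {"Bacteroides": 50, "Escherichia": 30, "Lactobacillus": 20}
report = compare_profiles(original, galaxy)
print(report)
```

A dissimilarity close to 0, together with empty "only in" lists, would suggest that the two pipelines agree on the sample. A real benchmark would probably also compare functional profiles and run times.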
References
[1] Martin J. Blaser. "The microbiome revolution" The Journal of Clinical Investigation (2014): 124.
[2] Sharpton, Thomas J. "An introduction to the analysis of shotgun metagenomic data." Frontiers in plant science 5 (2014): 209.
[3] Oulas, Anastasis, et al. "Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies." Bioinformatics and biology insights 9 (2015): BBI-S12462.
[4] Escobar-Zepeda, Alejandra, Arturo Vera-Ponce de León, and Alejandro Sanchez-Flores. "The road to metagenomics: from microbiology to DNA sequencing technologies and bioinformatics." Frontiers in genetics 6 (2015): 348.
[5] Mehta, Subina, et al. "ASaiM-MT: a validated and optimized ASaiM workflow for metatranscriptomics analysis within Galaxy framework." F1000Research 10 (2021).
[6] Mitchell AL, et al. "MGnify: the microbiome analysis resource in 2020" Nucleic Acids Research (2019), doi:10.1093/nar/gkz1035.
[7] Wooley, John C., Adam Godzik, and Iddo Friedberg. "A primer on metagenomics." PLoS computational biology 6.2 (2010): e1000667.
[8] Enis Afgan, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update, Nucleic Acids Research, Volume 46, Issue W1, 2 July 2018, Pages W537–W544, doi:10.1093/nar/gky379