usegalaxy-eu / project-ideas

A collection of project ideas suitable for Master and Bachelor students
MIT License

Combining gene cluster detection with metatranscriptomics analysis #37

Open paulzierep opened 1 year ago

Supervisor: Paul Zierep
For degree: Project/Master
Status: Open
Keywords: Gene Cluster, metatranscriptomics

Note: This probably needs to be split into two projects: A) BiG-SCAPE and BiG-MAP wrappers and workflows; B) combination with metatranscriptomics and analysis of soil samples.

Global Biological/Research context

Gene clusters encode proteins that are involved in the production of secondary metabolites with valuable pharmaceutical activities, such as antibiotic and antifungal properties [1]. Secondary metabolites give the producing species an evolutionary advantage by mediating antagonistic interactions [2]. Most investigations of species that harbor gene clusters focus on isolated samples. With the advance of NGS and effective metagenomics and metatranscriptomics pipelines, it is now possible to investigate species in the context of their community. Metatranscriptomics in particular allows the transcription of gene clusters to be quantified, in contrast to (meta-)genomic approaches, which can only establish the presence of a gene cluster. To find correlations between gene cluster regulation and community composition, a combined analysis approach could be designed that uses gene cluster detection and quantification together with community profiling. The findings could be especially interesting for the search for novel pharmaceutical compounds. For example, the expression of potential antibiotic and antifungal compounds could be associated with specific community compositions.

Project context

The Freiburg Galaxy team, together with the microGalaxy community, developed workflows and a metatranscriptomics tutorial, and integrated tools (HUMAnN, MetaPhlAn) that allow for metatranscriptomics analysis of NGS data. This workflow could now be adapted to integrate gene cluster detection and quantification, in order to demonstrate the straightforward coupling of complex workflows using Galaxy. Gene cluster detection is already available in Galaxy through antiSMASH 5. The resulting workflow could be applied to investigate the research question described above for a given target environment.

Objectives of the project

Integrate BiG-SCAPE [3] and BiG-MAP [4] into Galaxy for Gene Cluster Families analysis and gene cluster meta'omics abundance analysis, respectively. Combine the analysis with metatranscriptomics community profiling. Investigate soil metatranscriptomics samples, since Streptomyces are among the biggest producers of gene clusters. Correlate gene clusters with community profiles.

Proposed agenda for the project

  1. Contact antiSMASH (M. Medema) to discuss the idea.
  2. Write a BiG-SCAPE wrapper (https://github.com/medema-group/BiG-SCAPE/wiki)
  3. Write BiG-MAP wrappers (https://github.com/medema-group/BiG-MAP)
  4. Run the BiG-MAP workflow for soil metatranscriptomics samples using only known gene clusters from MIBiG
  5. Run the metatranscriptomics workflow for the same samples
  6. Correlate gene clusters with community profiles and pathways (e.g. compare antibiotic and antifungal gene clusters with the community)
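
The correlation in the last step could be prototyped along the following lines. This is only a minimal sketch, not a prescribed method: it assumes that gene cluster expression and taxon relative abundances have already been exported as per-sample values in the same sample order, and it swaps in a plain Spearman rank correlation as one possible measure. All names (`cluster_A`, `taxon_X`, `correlate_profiles`) and all numbers are hypothetical toy values, not output of BiG-MAP or MetaPhlAn.

```python
from itertools import product
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences (nan if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlate_profiles(cluster_expr, taxon_abund):
    """Correlate every gene cluster with every taxon.

    Both arguments map a name to a list of per-sample values, with all
    lists sharing the same sample order (an assumption of this sketch).
    """
    return {
        (cluster, taxon): spearman(cluster_expr[cluster], taxon_abund[taxon])
        for cluster, taxon in product(cluster_expr, taxon_abund)
    }


# Hypothetical toy values for five samples (not real data).
cluster_expr = {
    "cluster_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "cluster_B": [5.0, 3.0, 4.0, 1.0, 2.0],
}
taxon_abund = {
    "taxon_X": [0.10, 0.15, 0.30, 0.35, 0.50],
    "taxon_Y": [0.40, 0.30, 0.35, 0.05, 0.10],
}

result = correlate_profiles(cluster_expr, taxon_abund)
for (cluster, taxon), rho in sorted(result.items()):
    print(f"{cluster} vs {taxon}: rho = {rho:+.2f}")
```

With real data, the number of cluster and taxon pairs grows quickly, so a correction for multiple testing and a treatment of the compositional nature of relative abundances would likely be needed before interpreting any single coefficient.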

Prerequisites

Further reading and useful links

[1] L. Donald, A. Pipite, R. Subramani, J. Owen, R. A. Keyzers, and T. Taufa, “Streptomyces: Still the Biggest Producer of New Natural Secondary Metabolites, a Current Perspective,” Microbiology Research, vol. 13, no. 3, Art. no. 3, Sep. 2022, doi: 10.3390/microbiolres13030031.
[2] Q. Yan et al., “Secondary Metabolism and Interspecific Competition Affect Accumulation of Spontaneous Mutants in the GacS-GacA Regulatory System in Pseudomonas protegens,” mBio, vol. 9, no. 1, pp. e01845-17, Jan. 2018, doi: 10.1128/mBio.01845-17.
[3] J. C. Navarro-Muñoz et al., “A computational framework to explore large-scale biosynthetic diversity,” Nat Chem Biol, vol. 16, no. 1, pp. 60–68, Jan. 2020, doi: 10.1038/s41589-019-0400-9.
[4] V. Pascal Andreu, H. E. Augustijn, K. van den Berg, J. J. J. van der Hooft, M. A. Fischbach, and M. H. Medema, “BiG-MAP: an Automated Pipeline To Profile Metabolic Gene Cluster Abundance and Expression in Microbiomes,” mSystems, vol. 6, no. 5, pp. e00937-21, Sep. 2021, doi: 10.1128/mSystems.00937-21.