transPACT - trans-AT PKS Annotation and Comparison Tool
GNU Affero General Public License v3.0

transPACT

trans-AT PKS Annotation and Comparison Tool

transPACT is a joint collaboration between the University of Wisconsin-Madison, ETH Zurich, and Wageningen University.

Reference:

EJN Helfrich, R Ueoka, MG Chevrette*, F Hemmerling, X Lu, S Leopold-Messer, AY Burch, SE Lindow, J Handelsman, J Piel†, MH Medema†. Evolution of combinatorial diversity in trans-acyltransferase polyketide synthase assembly lines across bacteria. 2021. Nature Communications 12, 1422. 10.1038/s41467-021-21163-x

* equal contributions

† to whom correspondence should be addressed; JP: jpiel (at) ethz.ch | MHM: marnix.medema (at) wur.nl

Brief description

Trans-acyltransferase polyketide synthases (trans-AT PKSs) are multimodular enzymes that biosynthesize diverse pharmaceutically and ecologically important natural products. Here, we developed and applied a phylogenomic algorithm, transPACT (trans-AT PKS Annotation and Comparison Tool), to perform a global computational analysis of trans-AT PKS gene clusters, identifying hundreds of evolutionarily conserved module blocks. Network analysis of their exchange patterns reveals a widespread diversification mechanism for these enzymes. This repository contains the transPACT implementation that assigns substrate specificity to the ketosynthase (KS) domains of trans-AT PKSs, as well as the helper scripts used to generate the global trans-AT PKS network. transPACT is typically run on its own, but it is built within the antiSMASH 4.x architecture [paper] [repo].

Set up environment

Dependencies are listed in conda_packages.txt. We strongly recommend creating a dedicated conda environment from this file, e.g.:

conda create --name transPACT --file conda_packages.txt

This creates a new environment called transPACT with all dependencies installed. Activate it with:

conda activate transPACT

Installation and setup on a typical desktop computer should take less than 5 minutes. In tests, setup completed in 26 seconds with:

date && git clone https://github.com/chevrm/transPACT.git && cd transPACT && conda create --name transPACTtest --file conda_packages.txt && conda activate transPACTtest && date

Running transPACT to assign KS substrate specificity

What's actually happening when I run transPACT?

The core transPACT algorithm is in antismash/specific_modules/nrpspks/nrpspksdomainalign/substrate_from_faa.py, which is symbolically linked as transPACT_substrate_from_faa.py for convenience. Each query ketosynthase (KS) domain, supplied in protein FASTA format, is aligned with MUSCLE to a reference alignment of a core set of 647 experimentally characterized KS domains (see align_ks_domains(); invoked on line 533). The alignment is then used to place the query sequence onto a reference phylogeny with pplacer (see run_pipeline_pplacer(); invoked on line 534). Finally, each query sequence is assigned to a clade and a functional classification based on monophyly (see parse_pplacer()).
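
To make the final step more concrete, here is a minimal, self-contained sketch of the general idea behind monophyly-based assignment. It is not the code in parse_pplacer() and does not read pplacer output. The nested-tuple tree, the function names leaves() and assign_by_monophyly(), the leaf names, and the labels spec_A and spec_B are all hypothetical and chosen only for illustration. The sketch assumes the query has already been placed as a leaf in the tree, and it assigns a label only when every reference leaf in the query's sister group carries the same label.

```python
# Illustrative sketch only: NOT transPACT's parse_pplacer().
# A tree is a nested tuple of subtrees; a leaf is a plain string name.


def leaves(tree):
    """Return all leaf names under a (sub)tree."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def assign_by_monophyly(tree, query, annotation):
    """Return the label shared by all sister leaves of the placed query.

    Returns None if the query is not in the tree or if its sister group
    carries more than one label (the sister group is not monophyletic
    with respect to the annotation).
    """
    if isinstance(tree, str):
        return None
    if query in tree:  # the query is a direct child of this node
        sisters = [name for child in tree if child != query
                   for name in leaves(child)]
        labels = {annotation[name] for name in sisters}
        return labels.pop() if len(labels) == 1 else None
    for child in tree:
        if query in leaves(child):
            return assign_by_monophyly(child, query, annotation)
    return None


# Hypothetical reference tree with one placed query sequence.
reference_tree = ((("KS_a", "KS_b"), "QUERY"), ("KS_c", "KS_d"))
# Hypothetical labels for the reference KS domains.
annotation = {
    "KS_a": "spec_A",
    "KS_b": "spec_A",
    "KS_c": "spec_B",
    "KS_d": "spec_B",
}

result = assign_by_monophyly(reference_tree, "QUERY", annotation)
print(result)  # spec_A
```

In this toy tree, QUERY sits next to KS_a and KS_b, which share one label, so it inherits that label. If the sister group mixed spec_A and spec_B leaves, the function would return None instead of guessing.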