Arcadia-Science / peptigate

Peptigate ("peptide" + "investigate") predicts bioactive peptides from transcriptome assemblies or sets of proteins.
MIT License

Remove NRPS modules in favor of using antismash for annotation #17

Closed taylorreiter closed 8 months ago

taylorreiter commented 8 months ago

Background

Nonribosomal peptide synthetases (NRPSs) are enzymes that synthesize peptides independently of messenger RNA and ribosomes. Each NRPS enzyme contributes a specific step to the synthesis of a specific peptide, and typically contains multiple catalytic domains that together accomplish that step. Multiple NRPS enzymes are usually required to synthesize a peptide, and these enzymes are usually co-located in the genome (and, in bacteria, co-expressed on polycistronic transcripts).

Example of how NRPSs accomplish peptide synthesis: pyoverdine synthesis in P. aeruginosa

Synthesis of pyoverdine in Pseudomonas aeruginosa provides a good example of NRPS synthesis (see image below, modified from https://www.nature.com/articles/s41467-020-18365-0). Each gene (PvdL, PvdI, PvdJ, and PvdD) encodes an NRPS with many catalytic domains (the bubbles below each gene; abbreviations: ACL, acyl-CoA ligase; C, condensation; A, adenylation; T, thiolation; E, epimerisation; Te, thioesterase). "Modules are comprised of one or more key domains, including adenylation (A) domains, which recognise and activate the monomer substrate; condensation (C) domains, which catalyse amide bond formation; and thiolation (T) domains, which shuttle reaction intermediates between catalytic domains" (source: https://www.nature.com/articles/s41467-020-18365-0).

image
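To make the module/domain vocabulary above concrete, here is a minimal Python sketch that groups an ordered list of domain abbreviations into modules. It is not part of peptigate or antiSMASH. The function name `split_into_modules`, the rule that a new module starts at each condensation (C) domain, and the example domain order are all assumptions made for illustration; the example is not the real architecture of any Pvd gene.

```python
# Domain abbreviations as listed in the paragraph above.
DOMAIN_NAMES = {
    "ACL": "acyl-CoA ligase",
    "C": "condensation",
    "A": "adenylation",
    "T": "thiolation",
    "E": "epimerisation",
    "Te": "thioesterase",
}


def split_into_modules(domains):
    """Group an ordered list of domain abbreviations into modules.

    Assumption (illustrative simplification): every condensation (C)
    domain opens a new module, and any domains before the first C
    form an initiation module.
    """
    modules = []
    current = []
    for domain in domains:
        if domain not in DOMAIN_NAMES:
            raise ValueError(f"unknown domain abbreviation: {domain}")
        # Close the module in progress when a new C domain appears.
        if domain == "C" and current:
            modules.append(current)
            current = []
        current.append(domain)
    if current:
        modules.append(current)
    return modules


# Hypothetical domain order, NOT taken from a real Pvd gene.
example = ["A", "T", "C", "A", "T", "E", "C", "A", "T", "Te"]

for index, module in enumerate(split_into_modules(example), start=1):
    print(index, "-".join(module))
```

Under this simplification the hypothetical enzyme splits into an initiation module (A-T), an elongation module with an epimerisation domain (C-A-T-E), and a termination module ending in a thioesterase (C-A-T-Te).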

This article provides a full overview of how pyoverdine synthesis works:

The first step in the PVDI synthesis starts with the enzyme PvdL that couples fatty acid (myristic or myristoleic) to a coenzyme A. PvdL is an atypical PVD-synthesis among NRPSs since it does not contain an initial C-terminal domain and includes an unusual domain which is related to acyl coenzyme A ligases [57]. The second step is the incorporation of the coenzyme A–fatty acid complex with an L-Glu moiety by PvdL. The main purpose of this fatty acid presence is to keep the precursor in the inner membrane [58]. The hydrolysis of fatty acid occurs prior to the siderophore excretion outside the cell. Then, PvdL integrates D-Tyr and L-Dab moieties that are condensed together to form a tetrahydropyrimidine ring that is the precursor to the dihydroxyquinoline chromophore [57]. In the final step, PvdL catalyzes the addition of the thus formed pyoverdine precursor to a D-Ser amino acid, which is the first amino acid of PVDI peptide moiety. The only NRPS present in all Pseudomonas genome is PvdL [57]. The PvdI and PvdJ enzymes further elongate the peptidic part through condensation and partial cyclization of eight amino acids. The enzyme PvdH catalyzes L-Dab synthesis, while PvdA and PvdF catalyze the formylhydroxyornithine synthesis [59,60]. In the end, the PvdD enzyme terminates the peptidic part via the activity of its thioester domain that enables ferribactin release into the cytoplasm. Subsequently, this molecule will be exported across the inner membrane by PvdE ABC-transporter [61].

Example of NRPS in animals: Nemamides in C. elegans

The first detection of an NRPS in a metazoan was nemamides in C. elegans (10.1038/nchembio.2144). The image below shows how nemamides are synthesized as well as the locations of the PKS, NRPS, and 5 other genes required to synthesize nemamides (10.1038/s41467-021-24682-9). As can be seen, genes required for synthesis are not co-located in the genome in this instance.

image

a The domain organization of PKS-1 and NRPS-1 is shown, along with five additional free-standing enzymes (NEMT-1, PKAL-1, C32E8.6, C24A3.4, and Y71H2B.1) that were demonstrated in this study to be required for nemamide biosynthesis. To facilitate annotation of the mutant worm strains generated in this study, the enzyme domains have been numbered according to the order of their appearance in PKS-1 and NRPS-1. The ACP7 domain was identified and its functional role was confirmed in this study. Domain abbreviations: acyl carrier protein (ACP), acyltransferase (AT), ketosynthase (KS), ketoreductase (KR), dehydratase (DH), peptidyl carrier protein (PCP), adenylation (A), condensation (C), thioesterase (TE). b The approximate chromosomal location in C. elegans of pks-1, nrps-1, and the five additional genes demonstrated to be required for nemamide biosynthesis in this study.

Subsequently, NRPSs have been found in other animals (10.3390/genes14091741); they are abundant in rotifers and nematodes.

Analysis plan

We are going to remove NRPS searches from the peptigate pipeline in favor of a separate NRPS screening strategy that will live in a different repository. Our reasons for this include: