Background
Nonribosomal peptide synthetases (NRPSs) are enzymes that synthesize peptides independently of messenger RNA and ribosomes. Each NRPS enzyme contributes a specific step to the synthesis of a specific peptide, and typically contains multiple catalytic domains that together accomplish that step. Multiple NRPS enzymes are usually required to synthesize a peptide, and the genes encoding them are usually co-located in the genome (and, in bacteria, co-expressed on polycistronic transcripts).
Example of how NRPSs accomplish peptide synthesis: pyoverdine synthesis in P. aeruginosa
Synthesis of pyoverdine in Pseudomonas aeruginosa provides a good example of NRPS synthesis (see image below, modified from https://www.nature.com/articles/s41467-020-18365-0). Each of PvdL, PvdI, PvdJ, and PvdD is an NRPS with many catalytic domains (the bubbles below each gene; abbreviations: ACL, acyl-CoA ligase; C, condensation; A, adenylation; T, thiolation; E, epimerisation; Te, thioesterase). "Modules are comprised of one or more key domains, including adenylation (A) domains, which recognise and activate the monomer substrate; condensation (C) domains, which catalyse amide bond formation; and thiolation (T) domains, which shuttle reaction intermediates between catalytic domains" (source: https://www.nature.com/articles/s41467-020-18365-0).
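To make the domain/module relationship concrete, here is a minimal Python sketch that groups an ordered list of domain abbreviations into modules. It rests on a simplifying assumption that is not taken from the cited paper: a new module begins at each condensation (C) domain, and the first module may lack one. The domain order in `example_domains` is hypothetical and is not the annotated architecture of any Pvd enzyme.

```python
# Domain abbreviations as listed in the text above.
DOMAIN_NAMES = {
    "ACL": "acyl-CoA ligase",
    "C": "condensation",
    "A": "adenylation",
    "T": "thiolation",
    "E": "epimerisation",
    "Te": "thioesterase",
}


def split_into_modules(domains):
    """Group an ordered list of domain abbreviations into modules.

    Simplifying assumption (ours): every C domain opens a new module,
    except when it is the very first domain in the list.
    """
    modules = []
    current = []
    for domain in domains:
        if domain not in DOMAIN_NAMES:
            raise ValueError(f"unknown domain abbreviation: {domain!r}")
        # A C domain closes the module collected so far and opens the next one.
        if domain == "C" and current:
            modules.append(current)
            current = []
        current.append(domain)
    if current:
        modules.append(current)
    return modules


def has_core_domains(module):
    """True if the module contains the three key domains: C, A, and T."""
    return {"C", "A", "T"}.issubset(module)


# Hypothetical domain order, for illustration only.
example_domains = ["ACL", "T", "C", "A", "T", "C", "A", "T", "E", "C", "A", "T", "Te"]
example_modules = split_into_modules(example_domains)

for index, module in enumerate(example_modules, start=1):
    names = ", ".join(DOMAIN_NAMES[d] for d in module)
    print(f"module {index}: {'-'.join(module)} ({names})")
```

Under this rule the hypothetical input splits into four modules: an initiating ACL-T module followed by three modules that each contain C, A, and T, the last of which carries the Te domain.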
This article provides a full overview of how pyoverdine synthesis works:
The first step in the PVDI synthesis starts with the enzyme PvdL that couples fatty acid (myristic or myristoleic) to a coenzyme A. PvdL is an atypical PVD-synthesis among NRPSs since it does not contain an initial C-terminal domain and includes an unusual domain which is related to acyl coenzyme A ligases [57]. The second step is the incorporation of the coenzyme A–fatty acid complex with an L-Glu moiety by PvdL. The main purpose of this fatty acid presence is to keep the precursor in the inner membrane [58]. The hydrolysis of fatty acid occurs prior to the siderophore excretion outside the cell. Then, PvdL integrates D-Tyr and L-Dab moieties that are condensed together to form a tetrahydropyrimidine ring that is the precursor to the dihydroxyquinoline chromophore [57]. In the final step, PvdL catalyzes the addition of the thus formed pyoverdine precursor to a D-Ser amino acid, which is the first amino acid of PVDI peptide moiety. The only NRPS present in all Pseudomonas genome is PvdL [57]. The PvdI and PvdJ enzymes further elongate the peptidic part through condensation and partial cyclization of eight amino acids. The enzyme PvdH catalyzes L-Dab synthesis, while PvdA and PvdF catalyze the formylhydroxyornithine synthesis [59,60]. In the end, the PvdD enzyme terminates the peptidic part via the activity of its thioester domain that enables ferribactin release into the cytoplasm. Subsequently, this molecule will be exported across the inner membrane by PvdE ABC-transporter [61].
Example of NRPS in animals: Nemamides in C. elegans
The first detection of an NRPS in a metazoan came with the discovery of nemamides in C. elegans (10.1038/nchembio.2144). The image below shows how nemamides are synthesized, as well as the locations of the PKS, the NRPS, and the five other genes required to synthesize nemamides (10.1038/s41467-021-24682-9). As can be seen, the genes required for synthesis are not co-located in the genome in this instance.
a The domain organization of PKS-1 and NRPS-1 is shown, along with five additional free-standing enzymes (NEMT-1, PKAL-1, C32E8.6, C24A3.4, and Y71H2B.1) that were demonstrated in this study to be required for nemamide biosynthesis. To facilitate annotation of the mutant worm strains generated in this study, the enzyme domains have been numbered according to the order of their appearance in PKS-1 and NRPS-1. The ACP7 domain was identified and its functional role was confirmed in this study. Domain abbreviations: acyl carrier protein (ACP), acyltransferase (AT), ketosynthase (KS), ketoreductase (KR), dehydratase (DH), peptidyl carrier protein (PCP), adenylation (A), condensation (C), thioesterase (TE). b The approximate chromosomal location in C. elegans of pks-1, nrps-1, and the five additional genes demonstrated to be required for nemamide biosynthesis in this study.
NRPSs have since been found in other animals (10.3390/genes14091741); they are abundant in rotifers and nematodes.
Analysis plan
We are going to remove NRPS searches from the peptigate pipeline in favor of a separate NRPS screening strategy that will live in a different repository. Our reasons include:
NRPSs seem relatively rare in animals, so including their annotation is not currently a priority in peptigate.
Following the approach taken in 10.3390/genes14091741, we think it makes more sense to screen genomes than transcriptomes for NRPSs, and peptigate focuses on transcriptomes.
antiSMASH seems to have complete NRPS detection and substrate detection built into its software (doc, paper). antiSMASH was also used in a previous screen to detect NRPSs in animals (10.3390/genes14091741). Our plan is to use antiSMASH to screen for NRPSs in the genomes where we're interested in them. Aside from the fact that it operates on genomes, another reason we don't wish to include an antiSMASH-based NRPS screen in peptigate at this time is that antiSMASH uses an AGPL license, and we don't want to license the entire repository under the AGPL.
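As a rough sketch of what the separate screen could look like, the Python snippet below only assembles an antiSMASH command line for one annotated genome; it does not run anything. The flags shown (`--taxon`, `--output-dir`, `--cpus`) reflect our understanding of the antiSMASH command-line interface and should be checked against `antismash --help` for the installed version. The file names are hypothetical, and which taxon setting suits animal genomes is an open question we have not settled here.

```python
import shlex
from pathlib import Path


def build_antismash_command(genome, out_dir, taxon, cpus=1):
    """Return an antiSMASH command line as a list of arguments.

    Assumptions (ours): the genome is an annotated GenBank file, and the
    flag names match the installed antiSMASH version.
    """
    # antiSMASH offers bacterial and fungal modes; there is no animal mode.
    if taxon not in ("bacteria", "fungi"):
        raise ValueError(f"unsupported taxon: {taxon!r}")
    return [
        "antismash",
        "--taxon", taxon,
        "--output-dir", str(out_dir),
        "--cpus", str(cpus),
        str(genome),
    ]


# Hypothetical input and output paths, for illustration only.
cmd = build_antismash_command(Path("genome.gbk"), Path("antismash_out"), taxon="fungi")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the separate repository, which keeps the AGPL-licensed tool out of peptigate itself.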