geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

NTR: mycofactocin biosynthetic process #20547

Closed danielhhaft closed 3 years ago

danielhhaft commented 3 years ago

I request a new GO term.

Name: mycofactocin biosynthetic process
Ontology: biological_process
Definition: The chemical reactions and pathways resulting in the formation of the coenzyme mycofactocin, a variably glycosylated small molecule electron pair carrier derived from the C-terminal valine-tyrosine dipeptide of the ribosomally translated precursor peptide MftA.
Is_a: heterocycle biosynthetic process (GO:0018130)
Is_a: peptidyl-tyrosine modification (GO:0018212)
Is_a: cellular nitrogen compound biosynthetic process (GO:0044271)
Is_a: organic cyclic compound biosynthetic process (GO:1901362)
Is_a: organonitrogen compound biosynthetic process (GO:1901566)

PMID: 33014324 Structure elucidation of the redox cofactor mycofactocin reveals oligo-glycosylation by MftF.

PMID: 31381312 MftD Catalyzes the Formation of a Biologically Active Redox Center in the Biosynthesis of the Ribosomally Synthesized and Post-translationally Modified Redox Cofactor Mycofactocin

PMID: 30778644 Occurrence, function, and biosynthesis of mycofactocin.

PMID: 30183269 Mycofactocin Biosynthesis Proceeds through 3-Amino-5-[( p-hydroxyphenyl)methyl]-4,4-dimethyl-2-pyrrolidinone (AHDP); Direct Observation of MftE Specificity toward MftA.

PMID: 27312813 Mycofactocin biosynthesis: modification of the peptide MftA by the radical S-adenosylmethionine protein MftC

PMID: 21223593 Bioinformatic evidence for a widely distributed, ribosomally produced electron carrier precursor, its maturation proteins, and its nicotinoprotein redox partners.

Mycofactocin biosynthesis occurs in bacteria (especially Mycobacterium tuberculosis and related species) and in archaea.

This New Term Request should create a GO term closely analogous to GO:0018189 pyrroloquinoline quinone biosynthetic process. Some parallels are:

In other ways, the cofactor is analogous to F420. The glycosylation state is variable, and "mycofactocin", like "F420", refers to the family of compounds with the redox-active core but changing sizes for the oligosaccharide attachment.

The term will be attached (eventually) to the following protein family-defining HMMs

mftA TIGR03969
mftB TIGR03967
mftC TIGR03962
mftD TIGR03966
mftE TIGR03964
mftF TIGR03965

from the TIGRFAMs collection, which is now maintained as part of the Protein Family Models collection at NCBI.

https://www.ncbi.nlm.nih.gov/genome/annotation_prok/evidence/TIGR03969
https://www.ncbi.nlm.nih.gov/genome/annotation_prok/evidence/TIGR03967
etc.
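For convenience, the six evidence links can be generated from the gene-to-accession pairs listed above. This is only a sketch: it assumes the four links hidden behind "etc." follow the same URL pattern as the two shown, and the names `MFT_HMMS`, `EVIDENCE_BASE`, and `evidence_url` are made up for illustration.

```python
# Gene-to-HMM accession pairs exactly as listed in the request.
MFT_HMMS = {
    "mftA": "TIGR03969",
    "mftB": "TIGR03967",
    "mftC": "TIGR03962",
    "mftD": "TIGR03966",
    "mftE": "TIGR03964",
    "mftF": "TIGR03965",
}

# Prefix copied from the two example links; assumed to hold for all six.
EVIDENCE_BASE = "https://www.ncbi.nlm.nih.gov/genome/annotation_prok/evidence/"


def evidence_url(accession: str) -> str:
    """Build the evidence page URL for one TIGRFAMs accession."""
    return EVIDENCE_BASE + accession


def all_evidence_urls() -> dict:
    """Map each mft gene name to its (assumed) evidence page URL."""
    return {gene: evidence_url(acc) for gene, acc in MFT_HMMS.items()}


if __name__ == "__main__":
    for gene, url in all_evidence_urls().items():
        print(gene, url)
```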

thomaspd commented 3 years ago

This looks straightforward to add, if we use the term "premycofactocin biosynthetic process", as there's a CHEBI ID for premycofactocin (CHEBI:150862), which looks to me from https://pubs.rsc.org/en/content/articlelanding/2020/sc/d0sc01172j#!divAbstract to be the small molecule in its unglycosylated form. From Dan's chemical classification above, I think that's what he means. @danielhhaft would that be OK?

danielhhaft commented 3 years ago

I don't really like the idea. The biological process creates the final form, mycofactocin, which is now understood to be a mix of forms having different numbers of sugars attached. There is decent precedent for this in GO. "gentamicin" (which GO has under the incorrect or at least much less preferred spelling "gentamycin") is a mix of forms because a couple of positions in the structure are naturally variable. Coenzyme F420 has a similar story, synthesized with a poly-gamma-glutamate tail that varies in length.

I appreciate there is a CHEBI ID for unglycosylated form, but if the penultimate compound in tryptophan biosynthesis had such a CHEBI ID and tryptophan itself happened not to, I would still say that the name of the biological process is "tryptophan biosynthesis".

The study that found multiple oligo-glycosylated forms of mycofactocin differed from previous studies in being a metabolomics study of molecules found in living cells. By contrast, previous studies did in vitro work characterizing chemical products from partially reconstructed pathways. The metabolomics study found some glycosylated AHDP, raising the possibility that physiological pathways might feature MftF acting before MftD, at least some of the time. This means the compound now called premycofactocin is not absolutely guaranteed to be an intermediate in the production of the mature form (with its mixture of glycosylation states).

Since exhaustive searches of bacterial genomes never find a (pre)mycofactocin biosynthesis cassette lacking MftF (the glycosyltransferase), I see no value in defining the biological process for GO as one that stops a step short of what is physiological - a mycofactocin pool that has some variability in glycosylation state and that interacts with a variety of enzymes. Mycofactocin is real, and proven, and has variable glycosylation, and all proteins MftA through MftF have been shown to be part of the process. Why not simply allow "mycofactocin biosynthesis" to be the name of the process in GO?

alanbridge commented 3 years ago

In that case we could request a new ChEBI class that groups glycosylated forms and define the pathway with reference to that? We now have a few enzymes in UniProt with Rheas, but there doesn't seem to be a relevant class for the final glyco product; I will follow up with curators.

https://www.uniprot.org/uniprot/?query=mycofactocin&fil=reviewed%3Ayes&sort=score

https://www.rhea-db.org/rhea?query=mycofactocin

pgaudet commented 3 years ago

Thanks @alanbridge, that would be much appreciated.

@danielhhaft we don't need a CHEBI term, but it's much preferable to have one if we can (a term can be requested here: https://www.ebi.ac.uk/chebi/submission)

Thanks, Pascale.

pgaudet commented 3 years ago

I will create the term - I will not add 'Is_a: peptidyl-tyrosine modification (GO:0018212)' - this is a conversion, not a modification. I also removed it from GO:0018189 pyrroloquinoline quinone biosynthetic process.

pgaudet commented 3 years ago

+[Term]
+id: GO:0140604
+name: mycofactocin biosynthetic process
+def: "The chemical reactions and pathways resulting in the formation of the coenzyme mycofactocin, a variably glycosylated small molecule electron pair carrier derived from the C-terminal valine-tyrosine dipeptide of the ribosomally translated precursor peptide MftA." [PMID:21223593, PMID:27312813, PMID:30183269, PMID:30778644, PMID:31381312, PMID:33014324]
+is_a: GO:0018130 ! heterocycle biosynthetic process
+is_a: GO:0044271 ! cellular nitrogen compound biosynthetic process
+is_a: GO:1901566 ! organonitrogen compound biosynthetic process
+property_value: term_tracker_item https://github.com/geneontology/go-ontology/issues/20547 xsd:anyURI
+created_by: pg
+creation_date: 2021-02-16T07:42:51Z
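For anyone who wants to inspect a stanza like the one above programmatically, here is a minimal sketch that reads the `tag: value` lines of a single OBO `[Term]` stanza into a dictionary. It is not an official GO tool or a full OBO parser: `parse_stanza` and `parent_ids` are hypothetical helper names, the `def` and `property_value` lines are left out of the sample for brevity, and leading `+` diff markers are simply stripped.

```python
# Abbreviated copy of the stanza from the comment above (def and
# property_value lines omitted to keep the example short).
STANZA = """\
+[Term]
+id: GO:0140604
+name: mycofactocin biosynthetic process
+is_a: GO:0018130 ! heterocycle biosynthetic process
+is_a: GO:0044271 ! cellular nitrogen compound biosynthetic process
+is_a: GO:1901566 ! organonitrogen compound biosynthetic process
+created_by: pg
+creation_date: 2021-02-16T07:42:51Z
"""


def parse_stanza(text: str) -> dict:
    """Collect 'tag: value' pairs from one stanza into a dict of lists."""
    tags = {}
    for raw in text.splitlines():
        line = raw.lstrip("+").strip()  # drop diff marker and whitespace
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        tags.setdefault(tag, []).append(value)
    return tags


def parent_ids(tags: dict) -> list:
    """Return the IDs from is_a lines, dropping the '! label' comment."""
    return [value.split(" ! ")[0] for value in tags.get("is_a", [])]


if __name__ == "__main__":
    parsed = parse_stanza(STANZA)
    print(parsed["id"][0], parsed["name"][0])
    print(parent_ids(parsed))
```

Run against the sample, the parent list contains the three `is_a` parents that were kept and, as discussed in the thread, not GO:0018212.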

pgaudet commented 3 years ago

Also added only_in_taxon: GO:0140604 mycofactocin biosynthetic process NCBITaxon:Union_0000004 Prokaryota

Need to add the logical definition when the CHEBI term is created.

pgaudet commented 3 years ago

CHEBI:167637, mycofactocin

danielhhaft commented 3 years ago

It looks like this term is on its way to approval, but when I search GO resources, I still don't find it. Please help me understand the process better (as I imagine contributing more terms). How can I tell when a decision is finalized that the term will appear, and then how long should it take before searching GO sites by text will retrieve something?

hdrabkin commented 3 years ago

Hi @danielhhaft,

The term:
id: GO:0140604
name: mycofactocin biosynthetic process
creation_date: 2021-02-16T07:42:51Z

Usually, a newly created GO term needs about a day to get into the official releases, and this one IS in the ontologies, both go-basic and go. These are available at the GO web site, but they also have a release date of 2021-02-01: http://geneontology.org/docs/download-ontology/

You could also look in the 'snapshot' files described on that page: http://purl.obolibrary.org/obo/go/snapshot/go.obo

Note, it is not appearing in AmiGO because the version it uses is the last file loaded, on 2021-02-02 (so missed by 1 day). The term IS available in QuickGO: https://www.ebi.ac.uk/QuickGO/search/GO:0140604
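One way to check a downloaded ontology file yourself is to scan it for the term's `id:` line and read the `data-version:` line from the file header. The sketch below assumes you have already saved go.obo (release or snapshot) locally. `has_term` and `data_version` are hypothetical helper names, and `SAMPLE` is a tiny invented fragment for illustration, not real file content.

```python
from typing import Optional

# Tiny invented OBO fragment standing in for a downloaded file.
SAMPLE = """\
format-version: 1.2
data-version: releases/2021-02-01

[Term]
id: GO:0018189
name: pyrroloquinoline quinone biosynthetic process
"""


def data_version(obo_text: str) -> Optional[str]:
    """Return the header's data-version value, or None if it is absent."""
    for line in obo_text.splitlines():
        if line.startswith("["):
            break  # the first stanza marks the end of the header
        if line.startswith("data-version:"):
            return line.split(":", 1)[1].strip()
    return None


def has_term(obo_text: str, term_id: str) -> bool:
    """True if some stanza in the text declares exactly this id."""
    wanted = "id: " + term_id
    return any(line.strip() == wanted for line in obo_text.splitlines())


if __name__ == "__main__":
    print(data_version(SAMPLE))
    print(has_term(SAMPLE, "GO:0140604"))
    # With a real download (the path is an assumption):
    # text = open("go.obo", encoding="utf-8").read()
    # print(data_version(text), has_term(text, "GO:0140604"))
```

Comparing the reported version with the term's creation date shows whether a given file could contain the term at all.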