mibig-secmet / mibig-json

Repository to track changes in MIBiG curation data stored in JSON format

Add metadata and cross-references for several BGC products #211

Closed althonos closed 2 years ago

althonos commented 2 years ago

Hi!

This PR fixes some annotations for the chemical products synthesized by several BGCs. I added cross-references to PubChem and extracted additional metadata from there when it was possible.
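
As a rough illustration of the kind of lookup involved, here is a minimal sketch that queries the PubChem PUG REST API by compound name. It is not the procedure actually used for this PR: the helper names, the chosen properties and the `pubchem:<CID>` cross-reference format are all assumptions made for the example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
# Properties requested from PubChem; this selection is an assumption.
PROPERTIES = ("MolecularFormula", "MolecularWeight", "InChIKey")


def pubchem_property_url(name, properties=PROPERTIES):
    """Build a PUG REST URL that looks a compound up by name."""
    encoded = quote(name, safe="")  # percent-encode spaces, Greek letters, etc.
    return f"{PUG_REST}/compound/name/{encoded}/property/{','.join(properties)}/JSON"


def first_property_record(response):
    """Return the first record of a PUG REST property response, or None."""
    records = response.get("PropertyTable", {}).get("Properties", [])
    return records[0] if records else None


def to_cross_reference(record):
    """Format a cross-reference string (hypothetical 'pubchem:<CID>' style)."""
    return f"pubchem:{record['CID']}"


def fetch_properties(name):
    """Query PubChem over the network; urlopen raises HTTPError on unknown names."""
    with urlopen(pubchem_property_url(name), timeout=30) as handle:
        return first_property_record(json.load(handle))
```

A name search can return several records, so each hit still needs to be checked by hand against the structure in the reference paper.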

BGC0000231

BGC0000231 does not produce a single molecule named griseusin, but two related compounds named griseusin A and griseusin B, as described in PMID:8169211.

BGC0000243 and BGC0000244

These BGCs were reported to produce several molecules of the macrotetrolide family, but the individual compounds were not listed. I added the metadata for the 5 naturally produced macrotetrolides described in PMID:10858335 (the reference paper).

BGC0000248

BGC0000248 was reported to produce naphthocyclinone, but the authors name it α-naphthocyclinone in the manuscript, and additional naphthocyclinones have been isolated since then (δ-naphthocyclinone, etc.).

BGC0000402

This BGC listed its product as paenilarvins, but the authors of the reference manuscript characterized three different molecules: paenilarvin A, paenilarvin B and paenilarvin C.
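
The kind of change this implies can be sketched as follows: one generic product entry is replaced by one entry per characterized molecule. The `compound` key and the helper name are assumptions for illustration and may not match the actual MIBiG JSON schema.

```python
from copy import deepcopy


def split_generic_product(entry, names, name_key="compound"):
    """Return one copy of `entry` per name, with the name field replaced.

    `name_key` is a hypothetical field name; adjust it to the real schema.
    """
    products = []
    for name in names:
        product = deepcopy(entry)  # leave the original entry untouched
        product[name_key] = name
        products.append(product)
    return products


generic = {"compound": "paenilarvins"}
specific = split_generic_product(
    generic, ["paenilarvin A", "paenilarvin B", "paenilarvin C"]
)
```

Per-compound metadata such as structures or cross-references would then be filled in separately for each entry.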

BGC0000662

This BGC listed its product as grixazone, but the compound is named grixazone A in PubChem. I also added a new reference for this cluster (PMID:17617696), which describes the biosynthetic pathway of grixazone A based on this cluster.

BGC0001167

This BGC listed piricyclamide as its product but it actually produces 4 different compounds according to PMID:22952627.

BGC0001268

According to the reference paper (PMID:23932525), the end product of the biosynthetic pathway encoded in this BGC is fusarin C.

BGC0001413

According to the reference paper (PMID:25510965), this BGC produces 3 cystobactamid products.

BGC0001465

The BGC product was listed as generic bromopyrroles/bromophenols, but the reference paper (PMID:24974229) gave structures for the three naturally occurring compounds without naming them explicitly.

I manually searched PubChem for these molecules by structure and found the corresponding compounds: bromophene, pentabromopseudilin and bistribromopyrrole.

BGC0001526

I fixed the name of the compounds (bartolosides A -> bartoloside A, etc.) and added cross-references to PubChem.
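
A renaming of this kind could be scripted roughly as below. This is only a sketch, not what was run for this PR; the function name and the regular expression are assumptions, and the pattern would wrongly shorten any name whose singular form genuinely ends in "s".

```python
import re

# Matches "<stem>s <suffix>", where the suffix is a capital letter
# optionally followed by digits (e.g. "A", "E2").
_PLURAL_FAMILY = re.compile(r"^(?P<stem>.+?)s (?P<suffix>[A-Z][0-9]*)$")


def singularize_compound_name(name):
    """Turn 'bartolosides A' into 'bartoloside A'; leave other names as-is."""
    match = _PLURAL_FAMILY.match(name)
    if match is None:
        return name
    return f"{match.group('stem')} {match.group('suffix')}"


print(singularize_compound_name("bartolosides A"))
```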

BGC0001620

According to the reference paper (PMID:28855504), this BGC leads to the production of 6 naturally occurring compounds (ilamycin E2 and ilamycin F were obtained by cluster engineering).

BGC0001644

According to the reference paper (PMID:30025185), this BGC produces two related compounds, lacunalide A and its desmethyl derivative lacunalide B.

BGC0001716

I added the compounds from PMID:29625040. It's a bit unclear how they should be named: the article calls them either odilorhabdin NOSO-95A or just NOSO-95A, with no consistency. In the end I used the longer name from PubChem.

BGC0001983

The paper reports 4 different triacsin compounds, but only triacsin C (referred to as compound 3) was found:

UV traces (300 nm) confirming the production of 3 in S. tsukubaensis. In both strains, the congener 3 was produced as the major product and other congeners were not identifiable by UV in these traces.

BGC0002019

The reference paper (PMID:29806086) describes 11 tiancilactone molecules.

However, based on the metabolite profile of the strain's fermentation extract, only 8 of them are produced by Streptomyces sp. CB03234, so I only added these 8 compounds to the BGC products.

BGC0002021

I renamed fogacin A to fogacin, which is the name used in PubChem and in the reference paper (PMID:30556239), even though the derivatives are called fogacin B and fogacin C.

BGC0001216

The reference paper (PMID:25763681) describes 4 different splenocin molecules, but in the manuscript text the authors say they only observed production of splenocin C:

In our hands, we only observe SPN-C in the fermentation of CNQ431 [...]

althonos commented 2 years ago

(@SJShaw: I've fixed the JSON violations, it would be great if you could approve Actions once more :pray:)

kblin commented 2 years ago

Replaced by #232