MassBank / MassBank-web

The web server application and directly connected components for a MassBank web server

proposal for BioSchemas `MolecularEntity` annotation of compounds #139

Closed egonw closed 3 years ago

egonw commented 5 years ago

Last week the MolecularEntity got further extended, see https://github.com/elixir-europe/BioHackathon/tree/master/interoperability/Bioschemas

(please assign to me)

egonw commented 5 years ago

I'll use this page as example for MolecularEntity: https://msbi.ipb-halle.de/MassBank/RecordDisplay.jsp?id=NU000001&dsn=Nihon_Univ

tsufz commented 5 years ago

Do we need to change something in the Record Format (https://github.com/MassBank/MassBank-web/blob/master/Documentation/MassBankRecordFormat.md) in order to get started? That would be the starting point for the implementation.

egonw commented 5 years ago

No, I don't think so. I took NU000001 as an example, and this JSON is what can be added to the HTML <head> (using a properly crafted <script> element):

{
    "@context": "http://schema.org",
    "@type": "MolecularEntity",
    "name": "3alpha-Hydroxy-5alpha-cholan-24-oic acid Methyl ester",
    "alternateName": [
        "Allolithocholic Acid Methyl ester"
    ],
    "identifier": "NU000001",
    "url": "https://msbi.ipb-halle.de/MassBank/RecordDisplay.jsp?id=NU000001&dsn=Nihon_Univ",
    "molecularFormula": "C25H42O3",
    "monoisotopicMolecularWeight": 390.31340,
    "inChI": "InChI=1S/C25H42O3/c1-16(5-10-23(27)28-4)20-8-9-21-19-7-6-17-15-18(26)11-13-24(17,2)22(19)12-14-25(20,21)3/h16-22,26H,5-15H2,1-4H3/t16-,17+,18-,19+,20-,21+,22+,24+,25-/m1/s1",
    "smiles": ["C(C1(C)4)([H])(C(C)CCC(=O)OC)CCC1(C(C3([H])CC4)([H])CCC(C3(C)2)([H])CC(O)CC2)[H]"],
    "biologicalRole": [
        {
            "@type": "DefinedTerm",
            "@id": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C66892",
            "inDefinedTermSet":
                {
                    "@type":"DefinedTermSet",
                    "@id":"http://data.bioontology.org/ontologies/NCIT/submissions/69/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb",
                    "name": "National Cancer Institute Thesaurus"
                },
            "termCode": "C66892",
            "name": "natural product",
            "url": "http://bioportal.bioontology.org/ontologies/NCIT?p=classes&conceptid=http%3A%2F%2Fncicb.nci.nih.gov%2Fxml%2Fowl%2FEVS%2FThesaurus.owl%23C66892"
        }
    ]
}
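
A minimal sketch of how such a dictionary could be wrapped into the `<script>` element mentioned above. The helper name `build_jsonld_script` is hypothetical and not part of MassBank-web; the `application/ld+json` media type is the standard one for JSON-LD. Only a subset of the NU000001 fields is repeated here.

```python
import json


def build_jsonld_script(entity: dict) -> str:
    """Serialize a MolecularEntity dict into a JSON-LD <script> element.

    Hypothetical helper for illustration; not taken from the MassBank code base.
    """
    payload = json.dumps(entity, indent=2, ensure_ascii=False)
    # Escape "</" so that a stray "</script>" inside a string value cannot
    # terminate the element early. "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Subset of the NU000001 example from this comment.
entity = {
    "@context": "http://schema.org",
    "@type": "MolecularEntity",
    "name": "3alpha-Hydroxy-5alpha-cholan-24-oic acid Methyl ester",
    "identifier": "NU000001",
    "molecularFormula": "C25H42O3",
}

snippet = build_jsonld_script(entity)
print(snippet)
```

The resulting string would be placed inside the page's `<head>`; how that is wired into the actual page generation depends on the server code.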

egonw commented 5 years ago

BTW, last week we updated http://www.macs.hw.ac.uk/~ajg33/BioschemasGenerator/#2 to have the MolecularEntity generator, where you can fill out content and get the JSON-LD with <script> element that needs to end up in the HEAD.

How is the HTML generated?

tsufz commented 5 years ago

On the fly with JS.

tsufz commented 5 years ago

The major task is to classify each record according to a bio-ontology. We have the compound class field, but it is quite general and has no underlying ontology (so far). The first step would be to evaluate the content of this field, and then to think about how to standardise it with a minimal ontology or a list of obligatory terms (e.g. Natural Product, Biotransformation Product, Environmental Standard).
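
A rough sketch of what that first evaluation step could look like. It assumes the compound class field is a semicolon-separated free-text string, and it uses only the three example terms named above as a hypothetical vocabulary; `CONTROLLED_TERMS`, `classify` and `survey` are illustrative names, not existing MassBank code.

```python
from collections import Counter

# Hypothetical minimal vocabulary, built from the example terms in this comment.
CONTROLLED_TERMS = {
    "natural product": "Natural Product",
    "biotransformation product": "Biotransformation Product",
    "environmental standard": "Environmental Standard",
}


def classify(compound_class: str):
    """Split a compound class string into controlled terms and leftovers.

    Assumes entries are separated by ';' (an assumption about the field).
    """
    matched, unmatched = [], []
    for part in compound_class.split(";"):
        # Collapse whitespace and ignore case before the lookup.
        key = " ".join(part.split()).lower()
        if not key:
            continue
        if key in CONTROLLED_TERMS:
            matched.append(CONTROLLED_TERMS[key])
        else:
            unmatched.append(part.strip())
    return matched, unmatched


def survey(values):
    """Count how often each unmapped entry occurs across many records."""
    counts = Counter()
    for value in values:
        _, unmatched = classify(value)
        counts.update(unmatched)
    return counts
```

Running `survey` over all records would show which free-text entries are common enough to deserve a place in the obligatory term list.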

meier-rene commented 5 years ago

Thank you @egonw , I will integrate this in the next release of the record result page. The html is generated from database content with a tomcat servlet.

egonw commented 5 years ago

@meier-rene, you have probably seen this already, but you can test your HTML here: https://search.google.com/structured-data/testing-tool It may complain that it does not know the Bioschemas entities, but it will at least check that the syntax is right...
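
For a quick local syntax check before using the online tool, something along these lines could work. This is only a sketch using the Python standard library; `JsonLdExtractor` and `check_jsonld` are hypothetical names, and it verifies only that the embedded JSON parses, not that the Bioschemas properties are correct.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def check_jsonld(html: str):
    """Parse all JSON-LD blocks in a page; raises ValueError on bad JSON."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [json.loads(block) for block in parser.blocks]
```

Feeding the generated record page to `check_jsonld` returns the parsed annotations, or raises `json.JSONDecodeError` (a `ValueError`) if the embedded JSON is malformed.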

tsufz commented 5 years ago

I would like to move this project forward. Who is taking care of it, and in what timeframe?

egonw commented 5 years ago

I will come back on this shortly. For now, plz assign the issue to me: I'll take ownership of it.

egonw commented 4 years ago

I think annotation has been deployed now broadly, right?

Plz review this PR to get MassBank Europe listed as "live deployment": https://github.com/BioSchemas/bioschemas.github.io/pull/274

ping @tsufz @meier-rene @sneumann @schymane

tsufz commented 4 years ago

Related to #94, #132, and #206. Could be closed soon!

egonw commented 4 years ago

Thanks, awesome work, all!

tsufz commented 3 years ago

Everything that needed to be done so far is done.