Status: Closed (cmungall closed this 3 years ago)
@cmungall: Would the metabolism column need to be run through OGER to get tagged with ecocore CUIs?
I would not use OGER here. There are 8 distinct values, so it should take only a few minutes to manually curate a mapping table. You may need to request any terms missing from ecocore.
@realmarcin noted that there is an ontology in bioportal that has the terms we need:
This ontology is an (abandoned?) translation of MIxS to OWL.
Note that in NMDC @wdduncan is working on a semantic translation of MIxS. We use biolinkML as the representation. See https://github.com/microbiomedata/nmdc-metadata/blob/master/schema/mixs.yaml
E.g.
However, in MIxS the list of possible values is just an enumeration of strings.
With @kaiam we were originally going to work on mapping each value in that enum. I would still like to prioritize this for key fields.
We rely on this feature to bring it into our YAML (and Kai's mapping): https://github.com/biolink/biolinkml/issues/170
For kg-microbe, however, we can proceed independently of MIxS. We still need the mapping of strings to ecocore terms.
This is what I have found thus far:
- anaerobe: http://purl.obolibrary.org/obo/ECOCORE_00000172
- anaerobic respiration: http://purl.obolibrary.org/obo/GO_0009061
- facultative anaerobe: http://purl.obolibrary.org/obo/OMP_0000087
- obligately anaerobic: http://purl.obolibrary.org/obo/MICRO_0000504
- aerobe: http://purl.obolibrary.org/obo/ECOCORE_00000173
- aerobic respiration: http://purl.obolibrary.org/obo/GO_0009060
- obligately aerobic: http://purl.obolibrary.org/obo/MICRO_0000516
Could not find 'microaerophilic'. Would 'strictly anaerobic' and 'anaerobic' be clubbed together?
This is a really good example of why we need more harmonization in OBO! Ideally there would be one ontology to use here, not 4!
Here's what the mapping table looks like:
ID | ActualTerm | PreferredTerm |
---|---|---|
ECOCORE:00000172 | anaerobic | anaerobe |
MICRO:0000504 | obligate anaerobic | obligately anaerobic |
OMP:0000087 | facultative | facultative anaerobe |
MICRO:0000516 | obligate aerobic | obligately aerobic |
ECOCORE:00000173 | aerobic | aerobe |
MICRO:0000515 | microaerophilic | microaerophilic |
'ActualTerm' is what exists in the data and 'PreferredTerm' is what exists in the corresponding ontology.
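As a rough sketch of how the table above could be applied to the metabolism column, assuming the values are plain strings: the names `METABOLISM_TO_CURIE` and `map_metabolism` are made up for illustration and are not part of kg-microbe, and all IDs are written in `PREFIX:ID` form. Values with no row in the table (for example 'strictly anaerobic' or 'NA') come back as `None` rather than being guessed.

```python
from typing import Optional

# ActualTerm (the string found in the data) -> ontology term ID,
# copied from the mapping table above. Hypothetical name, for illustration only.
METABOLISM_TO_CURIE = {
    "anaerobic": "ECOCORE:00000172",
    "obligate anaerobic": "MICRO:0000504",
    "facultative": "OMP:0000087",
    "obligate aerobic": "MICRO:0000516",
    "aerobic": "ECOCORE:00000173",
    "microaerophilic": "MICRO:0000515",
}


def map_metabolism(value: Optional[str]) -> Optional[str]:
    """Return the term ID for a metabolism string, or None if it is unmapped."""
    if value is None:
        return None
    # Normalise whitespace and case before the lookup (an assumption about the data).
    return METABOLISM_TO_CURIE.get(value.strip().lower())


if __name__ == "__main__":
    for raw in ["aerobic", "facultative", "strictly anaerobic", "NA"]:
        print(raw, "->", map_metabolism(raw))
```

Returning `None` for unmapped strings keeps the open question about 'strictly anaerobic' visible in the output instead of silently merging it with 'anaerobic'.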
I found microaerophilic here: https://www.ebi.ac.uk/ols/ontologies/micro/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMICRO_0000515
microaerophilic -> http://purl.obolibrary.org/obo/MICRO_0000515
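The thread uses both short IDs (MICRO:0000515) and full OBO PURLs (http://purl.obolibrary.org/obo/MICRO_0000515) for the same term. A small sketch of converting between the two, assuming the standard OBO PURL pattern; the helper names `curie_to_purl` and `purl_to_curie` are hypothetical:

```python
OBO_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie: str) -> str:
    """Expand e.g. 'MICRO:0000515' to its OBO PURL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a PREFIX:ID string: {curie!r}")
    return f"{OBO_BASE}{prefix}_{local_id}"


def purl_to_curie(purl: str) -> str:
    """Contract an OBO PURL back to PREFIX:ID form."""
    if not purl.startswith(OBO_BASE):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    # Split on the last underscore so the numeric local ID stays intact.
    prefix, sep, local_id = purl[len(OBO_BASE):].rpartition("_")
    if not sep or not prefix or not local_id:
        raise ValueError(f"unexpected OBO PURL shape: {purl!r}")
    return f"{prefix}:{local_id}"


if __name__ == "__main__":
    print(curie_to_purl("MICRO:0000515"))
    print(purl_to_curie("http://purl.obolibrary.org/obo/ECOCORE_00000172"))
```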
Updated the table. Thanks @wdduncan !
Split from #2
Count | Value |
---|---|
1 | strictly anaerobic |
342 | obligate anaerobic |
544 | microaerophilic |
1035 | obligate aerobic |
2328 | anaerobic |
2655 | facultative |
3108 | aerobic |
4250 | NA |
I think these should map to ecocore, cc @diatomsRcool