linkml / schema-automator

Automated assistance for the schema development lifecycle
https://linkml.io/schema-automator/
BSD 3-Clause "New" or "Revised" License

prototype the OLS term search approach with the FAO soil classes in the MIxS (5?) model #6

Closed turbomam closed 1 year ago

turbomam commented 3 years ago

@mslarae13 I placed the results of this mapping exercise in a Google Sheet. There's a data dictionary in a separate tab.

In this case, the mappings are very straightforward. I asked for 5 mappings from ENVO for each soil class. In most cases, there was a match that was identical except for a singular/plural mismatch. In a few other cases, there was no match at all. In those failure cases, there's still one row in the results table, but it carries no information about any mapping, and the cosine distance is 1.0. That means that whatever the query was, it is totally different from the empty string that was returned.
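To make the cosine distance values concrete, here is a minimal sketch of a string cosine distance over character bigrams. The tokenization (lowercased character bigrams) and the function names are assumptions for illustration; the distance actually used for the spreadsheet may be computed differently, so the exact values will not necessarily match.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count the character n-grams of a lowercased, stripped string."""
    text = text.lower().strip()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_distance(a: str, b: str, n: int = 2) -> float:
    """Return 1 - cosine similarity of the two n-gram count vectors."""
    va, vb = char_ngrams(a, n), char_ngrams(b, n)
    norm_a = sqrt(sum(c * c for c in va.values()))
    norm_b = sqrt(sum(c * c for c in vb.values()))
    # An empty string has no n-grams, so nothing can overlap with it:
    # treat that case as the maximum distance.
    if norm_a == 0 or norm_b == 0:
        return 1.0
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance("Acrisols", ""))         # no match returned: 1.0
print(cosine_distance("Acrisols", "acrisol"))  # singular/plural mismatch: small
```

Under this sketch, a query compared against an empty result always gets a distance of 1.0, which is the failure case described above, while a singular/plural mismatch stays close to 0.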

Since I requested the 5 best mappings, there can be more than one candidate mapping for each soil name from MIxS. If the candidate with rank=1 has a low cosine distance (<0.05), then there's no need to look at the lower-ranked candidates (rank > 1). They're really only useful for brainstorming when the rank 1 candidate mapping isn't very good.

So one way of looking at the results would be filtering out any row with rank > 1.
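As a rough sketch of that filtering step: keep only the rank 1 rows, then separate the ones under the 0.05 threshold from the ones that still need a human look. The column names (`query`, `rank`, `label`, `cosine_distance`) and the rows themselves are invented for illustration; they are not copied from the spreadsheet.

```python
# Hypothetical rows; column names and values are made up for this example.
rows = [
    {"query": "Acrisols", "rank": 1, "label": "acrisol", "cosine_distance": 0.03},
    {"query": "Acrisols", "rank": 2, "label": "alisol", "cosine_distance": 0.40},
    {"query": "Example soil class", "rank": 1, "label": "", "cosine_distance": 1.0},
]

THRESHOLD = 0.05  # the "low cosine distance" cutoff mentioned above


def best_candidates(rows):
    """Drop every row with rank > 1, keeping one candidate per query."""
    return [r for r in rows if r["rank"] == 1]


def split_by_quality(rows, threshold=THRESHOLD):
    """Separate confident mappings from rows that need manual review."""
    accepted = [r for r in rows if r["cosine_distance"] < threshold]
    review = [r for r in rows if r["cosine_distance"] >= threshold]
    return accepted, review


top = best_candidates(rows)
accepted, needs_review = split_by_quality(top)

for r in needs_review:
    # For these queries it is worth going back to the rank > 1 candidates.
    print("review:", r["query"])
```

The failed lookups (distance 1.0, empty label) land in the review list, which is where the lower-ranked candidates become useful again.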

This process can now be used for mapping any strings from a model like MIxS to terms from any ontology that is included in the EBI Ontology Lookup Service.
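For anyone who wants to try a lookup by hand, here is a minimal sketch of querying the OLS search endpoint directly. This is not the code used for the mapping exercise. The base URL, the `q`/`ontology`/`rows` parameters, and the `response.docs` layout of the JSON reply are my assumptions about the current public OLS API, so check them against the OLS documentation before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the public OLS search endpoint.
OLS_SEARCH = "https://www.ebi.ac.uk/ols4/api/search"


def build_search_url(term: str, ontology: str = "envo", rows: int = 5) -> str:
    """Build a search URL asking one ontology for the top `rows` hits."""
    return OLS_SEARCH + "?" + urlencode({"q": term, "ontology": ontology, "rows": rows})


def extract_candidates(payload: dict) -> list:
    """Turn a search reply into ranked (rank, label, iri) tuples.

    Assumes hits are listed under payload["response"]["docs"].
    """
    docs = payload.get("response", {}).get("docs", [])
    return [(i + 1, d.get("label", ""), d.get("iri", "")) for i, d in enumerate(docs)]


def search(term: str, ontology: str = "envo", rows: int = 5) -> list:
    """Run the query over the network and return ranked candidates."""
    with urlopen(build_search_url(term, ontology, rows)) as resp:
        return extract_candidates(json.load(resp))


if __name__ == "__main__":
    for rank, label, iri in search("Acrisols"):
        print(rank, label, iri)
```

Swapping the `ontology` argument is all it should take to point the same query at a different ontology hosted by OLS.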

@cmungall and I have also experimented with using Zooma and the BioPortal Annotator as back ends. I feel that OLS will give the best mappings to high-quality ontologies in almost all cases, but we could return to one of those other providers if necessary.

Please note that future mapping results will probably take the form of slightly different spreadsheets. We are generally using the SSSOM format for these mapping tasks.

turbomam commented 2 years ago

Deprecated by #35. Annotating the FAO soil class enums works well, but some other enums have permissible values (PVs) that crash the annotator.

The documentation-like content above should be moved somewhere else and revised if necessary.

sierra-moxon commented 1 year ago

closing as deprecated.