Open skybristol opened 8 years ago
This is a good topic for discussion at a Telecon. @skybristol WDYT?
I think this "use case" looks more like a requirement. We've been working on fleshing out the use cases, and in doing so it is becoming more apparent that we have just a few (maybe three or four) actual use cases, which share some common requirements. For example, Lewis' use case about using the ontology portal to support search engines implies a requirement that terms (user input as well as terms in the ontologies hosted on the portal) can be matched as closely related. The text annotation use case implies a similar requirement: the portal must match terms in a text document to terms in ontologies hosted on the portal. In the latter case, at least, the user would have the option of accepting or rejecting a proposed match. I would propose that we not treat term matching as a use case, but rather as a requirement: terms can be matched for various purposes within some specific context, and users have some control over the matching.
I would be really cautious about the registry assuming that an exact string match between concept names indicates that the two concepts are equivalent. Take the simple example of a concept called "temperature." In the abstract sense, there is a fair amount of similarity between modeled temperature and measured temperature, or between temperature in Fahrenheit and temperature in Celsius, but depending on how the relationship between the concepts is used, it would be detrimental to assume that they are an exact match.
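To make the concern concrete, here is a minimal sketch of how a label-only comparison can go wrong. The `Concept` fields, the ontology names, and the definitions are all hypothetical and chosen only for illustration; nothing here reflects an actual registry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Hypothetical, simplified view of a concept in some ontology."""
    ontology: str
    label: str
    definition: str
    unit: str


def naive_equivalent(a: Concept, b: Concept) -> bool:
    """Label-only comparison: the assumption we should be cautious about."""
    return a.label.strip().lower() == b.label.strip().lower()


def differing_attributes(a: Concept, b: Concept) -> list[str]:
    """List the attributes (other than the label) on which two concepts disagree."""
    return [name for name in ("definition", "unit")
            if getattr(a, name) != getattr(b, name)]


# Illustrative concepts only; both are called "temperature".
measured = Concept("obs", "Temperature",
                   "Air temperature measured by a sensor", "degree Celsius")
modeled = Concept("model", "temperature",
                  "Air temperature simulated by a model", "degree Fahrenheit")

print(naive_equivalent(measured, modeled))      # labels match
print(differing_attributes(measured, modeled))  # but definition and unit differ
```

The labels match, yet the definition and unit both differ, which is exactly the kind of context an exact string match ignores.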
However, I think some form of automated proximity analysis and flagging should be a function of a community clearinghouse for ontologies. I would treat it as proximity, though, and establish a quantitative or at least qualitative predicate in the links between two concepts across disparate ontologies. Ideally, our "proximity analysis algorithm" (PAA) would go beyond simple concept name matching and also look at definitions and other attribution and context, to get a more precise approximation of how close two things in the registry might be. Adding provenance to the mix, we could record PAA actions and then augment them with human user assertions ("matches entered manually") about the relationship between two concepts (equivalence or otherwise), to reduce uncertainty about how closely two concepts from different ontologies align. Additionally, evidence of proximity could come from sources outside the ontology registry itself, contributed back to the community by the work of other systems. We could treat these community-contributed proximity assertions, submitted as triples, as additional ontologies in the registry whose particular purpose is to make connections between concepts. It would then be useful for the repository itself to monitor this accumulation of evidence, so that it can report the proximity between the concepts it knows about with increasing certainty.
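As a rough sketch of what this could look like, the snippet below scores proximity from both labels and definitions, records each assertion with its source, and combines the accumulated evidence. Everything in it is an assumption made for illustration: the token-overlap (Jaccard) measure, the `label_weight` value, the `ProximityAssertion` fields, the concept identifiers, and the per-source weights are hypothetical stand-ins, not a proposed design.

```python
from dataclasses import dataclass


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a deliberately crude stand-in for real NLP."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token overlap between two strings, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def paa_score(label_a: str, def_a: str, label_b: str, def_b: str,
              label_weight: float = 0.4) -> float:
    """Hypothetical 'PAA': blend label similarity with definition similarity."""
    return (label_weight * jaccard(label_a, label_b)
            + (1 - label_weight) * jaccard(def_a, def_b))


@dataclass(frozen=True)
class ProximityAssertion:
    """One piece of evidence about how close two concepts are, with provenance."""
    subject: str   # identifier of the first concept (hypothetical)
    obj: str       # identifier of the second concept (hypothetical)
    score: float   # asserted proximity in [0, 1]
    source: str    # who asserted it: "PAA", "human", "external", ...


def aggregate(assertions: list[ProximityAssertion],
              weights: dict[str, float]) -> tuple[float, int]:
    """Weighted mean of asserted scores, plus how much evidence backs it."""
    if not assertions:
        raise ValueError("no evidence recorded for this pair of concepts")
    total = sum(weights[a.source] for a in assertions)
    combined = sum(weights[a.source] * a.score for a in assertions) / total
    return combined, len(assertions)


auto = paa_score("temperature", "air temperature measured by a sensor",
                 "temperature", "air temperature simulated by a model")
evidence = [
    ProximityAssertion("obs:temperature", "model:temperature", auto, "PAA"),
    ProximityAssertion("obs:temperature", "model:temperature", 0.2, "human"),
]
# Assumed weighting: a manual assertion counts three times as much as the PAA.
combined, n = aggregate(evidence, {"PAA": 1.0, "human": 3.0})
print(round(auto, 3), round(combined, 3), n)
```

Keeping each assertion as its own record, rather than overwriting a single score, is what lets the registry show where a proximity value came from and revise it as more evidence arrives.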