ontodev / robot

ROBOT is an OBO Tool
http://robot.obolibrary.org
BSD 3-Clause "New" or "Revised" License

Export: Expected behavior for term with multiple rdfs:label? #1162

Closed: dlutz2 closed this issue 11 months ago

dlutz2 commented 11 months ago

ROBOT 1.9.4, export command.

For terms that have multiple rdfs:label annotations, export apparently selects one (arbitrarily?), generates a warning, and otherwise ignores the rest. The behavior is the same whether the header is "LABEL" or "rdfs:label". Many of our ontologies have multiple rdfs:labels in different languages, which makes export difficult to use.

Could the behavior be changed so that a "LABEL" header keeps the current single-value logic, while an "rdfs:label" header is treated like any other annotation property and can therefore output multiple annotation values? The change would happen somewhere around: https://github.com/ontodev/robot/blob/2f9eca4e9d190029aff70de264a0cee2c024e7ea/robot-core/src/main/java/org/obolibrary/robot/ExportOperation.java#L222

jamesaoverton commented 11 months ago

The bigger problem is that ROBOT (and most OBO tooling) assumes that a term has a single rdfs:label. Changing that, even just in this case, would break all sorts of existing behaviour and the workflows that depend on it. We should have been smarter about that from the start, but now I think it's too late to change.

A workaround is to remove other labels before running export. I would probably do this using SPARQL and robot query --update.

A somewhat better solution would be a ROBOT command that picks one label, using a configurable strategy such as "prefer @en-US, then @en-UK, then @en". I'm not sure how much variation there would be.
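To make the "configurable strategy" idea concrete, here is a minimal Python sketch of ranked label selection. It is not ROBOT code: the function name pick_label, the (text, language tag) tuple layout, and the fallback rules (untagged labels next, then alphabetical order for ties) are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical representation: (label text, language tag or None if untagged).
Label = Tuple[str, Optional[str]]


def pick_label(
    labels: Sequence[Label],
    preferences: Sequence[str] = ("en-US", "en-UK", "en"),
) -> Optional[str]:
    """Pick one label using a ranked list of language tags.

    Assumed policy (not ROBOT's behaviour):
      1. Try each preferred tag in order, comparing tags case-insensitively.
      2. If none match, fall back to untagged labels.
      3. If there are none of those either, fall back to any label.
    Ties at any step are broken alphabetically so the result is deterministic.
    """
    if not labels:
        return None

    for wanted in preferences:
        matches = sorted(
            text
            for text, lang in labels
            if lang is not None and lang.lower() == wanted.lower()
        )
        if matches:
            return matches[0]

    untagged = sorted(text for text, lang in labels if not lang)
    if untagged:
        return untagged[0]

    return sorted(text for text, _ in labels)[0]
```

For example, pick_label([("Zelle", "de"), ("cell", "en")]) selects "cell" under the default preferences. A real command would still have to decide what "same preference, several labels" should mean; sorting is only one arbitrary but reproducible choice.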

If anyone has a better suggestion, let us know.

dlutz2 commented 11 months ago

A solution that needs no code change is always preferable, so moving all but one rdfs:label over to, e.g., skos:prefLabel with a SPARQL rewrite before export works. I need to check how to write a rewrite that can recognize the language tag of an annotation. (robot update? Is there an "update" command?)

Ranked-preference decision logic can get messy: it would require thought about "preference for display" versus "preference for resolve-by-label", about what to do when no preferred tag is seen, and it would still have to handle multiple labels with the same preference.

Perhaps it is best to tag this as too hard for now. I'll close the issue if no one objects, although it may come back in a different form if I can't figure out how to do the rewrite upstream of export. Maybe it goes on the list for a general-purpose axiom transformer command.
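As a rough sketch of the "move all but one label" rewrite, the Python helper below generates a SPARQL UPDATE that uses the standard lang() function to recognize language tags. The helper name build_move_labels_update and the policy of keeping exactly one tag are assumptions for illustration; the query has not been run against a real ontology here, and it would need checking before use.

```python
from string import Template

# SPARQL UPDATE template. "$lang" is the only placeholder; SPARQL variables
# use "?" here, so string.Template does not touch them.
_UPDATE_TEMPLATE = Template("""\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

DELETE { ?term rdfs:label ?label }
INSERT { ?term skos:prefLabel ?label }
WHERE {
  ?term rdfs:label ?label .
  FILTER (lcase(lang(?label)) != "$lang")
}
""")


def build_move_labels_update(keep_lang: str = "en") -> str:
    """Return a SPARQL UPDATE that moves every rdfs:label whose language tag
    is not exactly `keep_lang` over to skos:prefLabel.

    Caveats of this sketch:
      * Untagged labels have lang() == "" and are therefore moved as well.
      * Sub-tags are not matched: keeping "en" moves "en-US" labels too.
      * A term with two labels carrying the kept tag still has two labels.
    """
    # Accept only simple tags such as "en" or "en-US" so that nothing can
    # break out of the quoted string in the FILTER.
    if not (keep_lang.isascii() and keep_lang.replace("-", "").isalnum()):
        raise ValueError(f"unexpected language tag: {keep_lang!r}")
    return _UPDATE_TEMPLATE.substitute(lang=keep_lang.lower())


if __name__ == "__main__":
    # Print the update so it can be saved to a file for robot query --update.
    print(build_move_labels_update("en"))
```

The printed text could be saved to a file and passed to robot query --update ahead of the export step. Whether skos:prefLabel is the right target property, and which tag to keep, are decisions for each ontology.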

Out of curiosity, does OBO have an opinion on the use/non-use of language tags? We use them a lot and we plan to use ROBOT a lot, so we'll have to adjust accordingly.

jamesaoverton commented 11 months ago

Sorry, I meant robot query --update, which uses SPARQL UPDATE. I'll edit my comment above to fix that.

Language tags are good, but OBO has not done a good job of supporting them. We want to do better, it's just hard.