Open twhetzel opened 7 months ago
I had already reported this to Orphanet a while back: https://github.com/Orphanet/ORDO/issues/33
For now, @joeflack4, just remove all labels with @en on them I think?
Sure. This one is low priority, so I don't mind deferring discussion of this until later. Just trying to think through how I'd do this. Here are my initial thoughts.
I suppose what I'd do is add another SPARQL query to the component/ordo.owl goal, where I delete all the labels that contain @ (that way it will catch all language labels that might incidentally slip in), or just @en.
Note to self: If I do this via a SPARQL query, then I'll probably want to update the script for #510 so that it also runs this query before running the Python script, for DRYness, rather than the way #510 currently works, where it removes the @en via Python code.
The labels with the @en language tag can be removed from ORDO using this SPARQL query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
DELETE {
?class rdfs:label ?label .
}
WHERE {
?class rdfs:label ?label .
FILTER(LANG(?label) = "en")
}
and will remove the English language-tagged labels from:
Oh cool, I didn't even think / know about LANG(?label).
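For reference, the broader option raised earlier in the thread (dropping every language-tagged label, not only @en) could probably be written with the same LANG() filter. This is only a sketch and has not been run against ORDO. It assumes each affected class also carries an untagged rdfs:label, since otherwise the class would be left with no label at all:

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Sketch: delete every rdfs:label that carries any language tag (@en, @fr, ...).
DELETE {
  ?class rdfs:label ?label .
}
WHERE {
  ?class rdfs:label ?label .
  # LANG() returns "" for literals without a language tag,
  # so plain (untagged) labels are left untouched.
  FILTER(LANG(?label) != "")
}
```

The @en-only query is the more conservative choice if labels in other languages should be kept.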
@twhetzel Looks like probably the only work needed then is to create this query and then add a single line to the component goal. It's marked as low priority but maybe this will take like 10 min, lemme know if / when you want me to handle it.
Based on the work to update the RD subset based on ORDO information, there may be some duplicate labels due to the presence of @en on some ORDO labels. See comment https://github.com/monarch-initiative/mondo-ingest/pull/510/files#r1588599159