Closed jamesamcl closed 1 year ago
This relates to more than RDFS. It also relates to, for example, SKOS and Schema.org. See OLS3 issue #503. I actually like the suggestion by @cmungall to keep this separate from the OLS code base. That is, we have separate program(s?) to do translation from whatever to OWL RDF, which they can run in their upstream pipelines. I think that will help to keep the OLS code base simple.
There are some advantages to this:
- rdfs:subClassOf vs skos:broader, or rdfs:domain vs schema:domainIncludes, etc.
- There is no need in the API to distinguish between /ontologies vs /vocabularies. Indeed, as there is no reasoning on the ontologies, we in fact deal with them as if they are vocabularies. Everyone uses the existing API. I can see that can also avoid philosophical and emotional debates about whether someone's vocabulary/ontology is accessible under /ontologies rather than /vocabularies, or vice versa.
@henrietteharmse I don't understand exactly what this means for SKOS vocabs - do you mean supporting SKOS by basically having a translation process that projects skos concepts as owl:Class and skos assertions as annotation property assertions between classes (or object property)?
Yes, that is more-or-less the idea. The idea is to enable similar rendering and functionality for SKOS (and other vocabularies) as for OWL in OLS. I.e., retrieve terms, navigate hierarchy. For this my initial thoughts are we will need to cater for:
These are my initial thoughts. There may be more aspects to consider.
The problem with translating is that the whole point of OLS is to display exactly what is there. If we add a translation step to OWL, we (as OLS developers) are changing/adding information to a vocabulary that does not belong to us before publishing it.
Or another way to put it: what you see in OLS will no longer be a faithful representation of what is provided directly by the upstream authors. Once we start doing this we are playing a very different role in the community. We would be forcing OWL, via a downstream third-party translation, onto vocabularies for which the upstream developers have specifically chosen NOT to use OWL.
I think the position of OLS in the community should be to make ontologies more easily accessible regardless of how they are represented - not to die on a hill of trying to convert the entire world to the OWL2 spec which we know as well as anyone has plenty of its own issues.
I don't think it would simplify the OLS4 codebase enough to justify the huge amount of additional complication of the DATA that would arise from adding a translation layer. And IMO the quality of the data is the number 1 priority of OLS and more important than some presumed complexity of code that has not yet been designed or implemented. All of these standards we are considering are much, much simpler than OWL2.
> There is no need in the API to distinguish between /ontologies vs /vocabularies. Indeed, as there is no reasoning on the ontologies, we in fact deal with them as if they are vocabularies. Everyone uses the existing API. I can see that can also avoid philosophical and emotional debates about whether someone's vocabulary/ontology is accessible under /ontologies rather than /vocabularies, or vice versa.
I don't have strong feelings about this particular aspect at all; happy for it to be all under the same API if we're happy to call everything an ontology.
This is not about pushing OWL. It is about having a data upload standard, which is a common practice. Hence the reason why most resources publish CSV/spreadsheet/XML formats in which data must be uploaded.
Why do they do that? To limit the amount of code that needs to be written to cater for different formats and thus reduce maintenance effort.
I am not as familiar with the code as you are. Hence, you may be able to see ways to mitigate my concerns which I don't:
The typical way I am aware of to avoid these concerns is to do some normalization to a standard and from there on to use the standard downstream. If you can have the standard implicit in the code and 1 place to cater for RDFS, 1 place for SKOS, 1 place for Schema.org, etc. - that could be doable.
> This is not about pushing OWL. It is about having a data upload standard, which is a common practice. Hence the reason why most resources publish CSV/spreadsheet/XML formats in which data must be uploaded. Why do they do that? To limit the amount of code that needs to be written to cater for different formats and thus reduce maintenance effort.
Yes but we cannot do that because we are indexing ontologies that already exist. We can't ask the upstream developers to convert their ontologies to OWL when they have already used a different standard. As you say we would need to automatically convert - but that would be us modifying data that isn't ours and publishing the results, which is a departure from what OLS currently does (simply displaying it faithfully exactly as it was published).
Regarding your other points, we already extensively map OWL properties to OLS properties (e.g. rdfs:subClassOf and rdfs:subPropertyOf to parent, the same + defined hierarchical properties to hierarchicalParent, rdfs:label and other label properties to label). These are mostly what the frontend ends up using. (When I started out with OLS4 I wanted to avoid creating any such OLS properties and use OWL2 as our database representation, but it was impossible to reimplement the OLS3 API that way. However with the issue at hand this may have been fortunate.)
For example, this is what EFO:0000400 looks like internally. Every property with a simple field name instead of an IRI is something OLS added.
{
"appearsIn" : [ "efo", "clo", "bcgo" ],
"curie" : "EFO:0000400",
"definedBy" : [ "efo" ],
"definition" : [ "A heterogeneous group of disorders characterized by HYPERGLYCEMIA and GLUCOSE INTOLERANCE.", {
"type" : [ "reification" ],
"value" : "A metabolic disorder characterized by abnormally high blood sugar levels due to diminished production of insulin or insulin resistance/desensitization.",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "NCIT:P378"
} ]
} ],
"definitionProperty" : [ "http://purl.obolibrary.org/obo/IAO_0000115", "http://www.ebi.ac.uk/efo/definition", "http://www.w3.org/2000/01/rdf-schema#description" ],
"directAncestor" : [ "http://purl.obolibrary.org/obo/MONDO_0001933", "http://www.ebi.ac.uk/efo/EFO_0009605", "http://www.ebi.ac.uk/efo/EFO_0000405", "http://www.ebi.ac.uk/efo/EFO_0000408", "http://purl.obolibrary.org/obo/BFO_0000016", "http://purl.obolibrary.org/obo/BFO_0000020", "http://www.ebi.ac.uk/efo/EFO_0000001", "http://www.w3.org/2002/07/owl#Thing", "http://www.ebi.ac.uk/efo/EFO_0001379", "http://www.ebi.ac.uk/efo/EFO_0009406", "http://www.ebi.ac.uk/efo/EFO_0000589" ],
"directParent" : [ {
"type" : [ "reification" ],
"value" : "http://purl.obolibrary.org/obo/MONDO_0001933",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : "NCIT:C2985"
} ]
}, {
"type" : [ "reification" ],
"value" : "http://www.ebi.ac.uk/efo/EFO_0009406",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "MESH:D003920", "NCIT:C2985" ]
} ]
} ],
"hasDirectChildren" : "true",
"hasDirectParent" : "true",
"hasHierarchicalChildren" : "true",
"hasHierarchicalParent" : "true",
"hierarchicalAncestor" : [ "http://purl.obolibrary.org/obo/MONDO_0001933", "http://www.ebi.ac.uk/efo/EFO_0009605", "http://www.ebi.ac.uk/efo/EFO_0000405", "http://www.ebi.ac.uk/efo/EFO_0000408", "http://purl.obolibrary.org/obo/BFO_0000016", "http://purl.obolibrary.org/obo/BFO_0000020", "http://www.ebi.ac.uk/efo/EFO_0000001", "http://www.w3.org/2002/07/owl#Thing", "http://www.ebi.ac.uk/efo/EFO_0001379", "http://www.ebi.ac.uk/efo/EFO_0009406", "http://www.ebi.ac.uk/efo/EFO_0000589" ],
"hierarchicalParent" : [ {
"type" : [ "reification" ],
"value" : "http://purl.obolibrary.org/obo/MONDO_0001933",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : "NCIT:C2985"
} ]
}, {
"type" : [ "reification" ],
"value" : "http://www.ebi.ac.uk/efo/EFO_0009406",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "MESH:D003920", "NCIT:C2985" ]
} ]
} ],
"hierarchicalProperty" : [ "http://purl.obolibrary.org/obo/BFO_0000050", "http://purl.obolibrary.org/obo/RO_0002202", "http://www.w3.org/2000/01/rdf-schema#subClassOf" ],
"imported" : "false",
"iri" : "http://www.ebi.ac.uk/efo/EFO_0000400",
"isDefiningOntology" : true,
"isObsolete" : "false",
"isPreferredRoot" : "false",
"label" : "diabetes mellitus",
"linkedEntities": { ... },
"numDirectDescendants" : "43",
"numHierarchicalDescendants" : "43",
"ontologyId" : "efo",
"ontologyIri" : "http://www.ebi.ac.uk/efo/efo.owl",
"ontologyPreferredPrefix" : "EFO",
"relatedFrom" : [ {
"property" : "http://purl.obolibrary.org/obo/RO_0004022",
"value" : "http://purl.obolibrary.org/obo/MONDO_0000489",
"type" : [ "related" ],
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Restriction",
"http://www.w3.org/2002/07/owl#onProperty" : "http://purl.obolibrary.org/obo/RO_0004022",
"http://www.w3.org/2002/07/owl#someValuesFrom" : "http://www.ebi.ac.uk/efo/EFO_0000400",
"isObsolete" : "false"
}, [...] ],
"relatedTo" : {
"property" : "http://purl.obolibrary.org/obo/BFO_0000054",
"value" : "http://purl.obolibrary.org/obo/OGMS_0000063",
"type" : [ "related" ],
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Class",
"http://www.w3.org/2002/07/owl#intersectionOf" : [ "http://purl.obolibrary.org/obo/OGMS_0000063", {
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Restriction",
"http://www.w3.org/2002/07/owl#onProperty" : "http://purl.obolibrary.org/obo/BFO_0000051",
"http://www.w3.org/2002/07/owl#someValuesFrom" : "http://purl.obolibrary.org/obo/GO_0008152",
"isObsolete" : "false"
} ]
},
"shortForm" : "EFO_0000400",
"synonym" : [ {
"type" : [ "reification" ],
"value" : "DM",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "NCIT:C2985",
"http://www.geneontology.org/formats/oboInOwl#hasSynonymType" : "http://purl.obolibrary.org/obo/mondo#ABBREVIATION"
} ]
}, "DM - Diabetes mellitus", "Diabetes", "Diabetes NOS", "Diabetes mellitus (disorder)", "Diabetes mellitus, NOS", {
"type" : [ "reification" ],
"value" : "diabetes",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "NCIT:C2985"
} ]
}, {
"type" : [ "reification" ],
"value" : "diabetes mellitus",
"axioms" : [ {
"http://www.w3.org/2000/01/rdf-schema#comment" : "preferred label from MONDO"
}, {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : [ "MONDO:ambiguous", "NCIT:C2985" ]
} ]
}, {
"type" : [ "reification" ],
"value" : "diabetes mellitus (disease)",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "https://orcid.org/0000-0002-6601-2165"
} ]
} ],
"synonymProperty" : [ "http://www.ebi.ac.uk/efo/alternative_term", "http://www.geneontology.org/formats/oboInOwl#hasExactSynonym" ],
"type" : [ "class", "entity" ],
"http://purl.obolibrary.org/obo/IAO_0000115" : [ "A heterogeneous group of disorders characterized by HYPERGLYCEMIA and GLUCOSE INTOLERANCE.", {
"type" : [ "reification" ],
"value" : "A metabolic disorder characterized by abnormally high blood sugar levels due to diminished production of insulin or insulin resistance/desensitization.",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "NCIT:P378"
} ]
} ],
"http://purl.obolibrary.org/obo/IAO_0000117" : "James Malone",
"http://purl.obolibrary.org/obo/IAO_0000233" : "https://github.com/monarch-initiative/mondo/issues/5723",
"http://purl.obolibrary.org/obo/IAO_0000589" : "diabetes mellitus (disease)",
"http://purl.obolibrary.org/obo/mondo#exactMatch" : [ "http://identifiers.org/mesh/D003920", "http://identifiers.org/snomedct/73211009", "http://linkedlifedata.com/resource/umls/id/C0011847", "http://linkedlifedata.com/resource/umls/id/C0011849", "http://purl.bioontology.org/ontology/ICD10CM/E08-E13", "http://purl.obolibrary.org/obo/DOID_9351", "http://purl.obolibrary.org/obo/NCIT_C2985" ],
"http://www.ebi.ac.uk/efo/gwas_trait" : "true",
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : [ {
"type" : [ "reification" ],
"value" : "DOID:9351",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "EFO:0000400", "MONDO:equivalentTo" ]
} ]
}, {
"type" : [ "reification" ],
"value" : "HP:0000819",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : "MONDO:otherHierarchy"
} ]
}, "ICD10:E13", "ICD10:E14", {
"type" : [ "reification" ],
"value" : "ICD10CM:E08-E13",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "MONDO:equivalentTo", "https://github.com/monarch-initiative/mondo/issues/4536", "https://orcid.org/0000-0001-5208-3432", "https://orcid.org/0000-0002-4142-7153" ]
} ]
}, {
"type" : [ "reification" ],
"value" : "ICD9:250",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "EFO:0000400" ]
} ]
}, {
"type" : [ "reification" ],
"value" : "MESH:D003920",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "EFO:0000400", "MONDO:equivalentTo" ]
} ]
}, "MONDO:0005015", "MeSH:D003920", "MedDRA:10012601", "MedDRA:10012624", "MedDRA:10012625", {
"type" : [ "reification" ],
"value" : "NCIT:C2985",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "EFO:0000400", "MONDO:equivalentTo" ]
} ]
}, "NCIt:C2985", "OMIM:612227", {
"type" : [ "reification" ],
"value" : "SCTID:73211009",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "EFO:0000400", "MONDO:equivalentTo" ]
} ]
}, "SNOMEDCT:73211009", {
"type" : [ "reification" ],
"value" : "UMLS:C0011847",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : "MONDO:equivalentTo"
} ]
}, {
"type" : [ "reification" ],
"value" : "UMLS:C0011849",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "MONDO:equivalentTo", "NCIT:C2985" ]
} ]
} ],
"http://www.geneontology.org/formats/oboInOwl#hasExactSynonym" : [ {
"type" : [ "reification" ],
"value" : "DM",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "NCIT:C2985",
"http://www.geneontology.org/formats/oboInOwl#hasSynonymType" : "http://purl.obolibrary.org/obo/mondo#ABBREVIATION"
} ]
}, "DM - Diabetes mellitus", "Diabetes", "Diabetes NOS", "Diabetes mellitus (disorder)", "Diabetes mellitus, NOS", {
"type" : [ "reification" ],
"value" : "diabetes",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "NCIT:C2985"
} ]
}, {
"type" : [ "reification" ],
"value" : "diabetes mellitus",
"axioms" : [ {
"http://www.w3.org/2000/01/rdf-schema#comment" : "preferred label from MONDO"
}, {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : [ "MONDO:ambiguous", "NCIT:C2985" ]
} ]
}, {
"type" : [ "reification" ],
"value" : "diabetes mellitus (disease)",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#hasDbXref" : "https://orcid.org/0000-0002-6601-2165"
} ]
} ],
"http://www.geneontology.org/formats/oboInOwl#id" : "EFO:0000400",
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Class",
"http://www.w3.org/2000/01/rdf-schema#label" : "diabetes mellitus",
"http://www.w3.org/2000/01/rdf-schema#subClassOf" : [ {
"type" : [ "reification" ],
"value" : "http://purl.obolibrary.org/obo/MONDO_0001933",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : "NCIT:C2985"
} ]
}, {
"type" : [ "reification" ],
"value" : "http://www.ebi.ac.uk/efo/EFO_0009406",
"axioms" : [ {
"http://www.geneontology.org/formats/oboInOwl#source" : [ "DOID:9351", "MESH:D003920", "NCIT:C2985" ]
} ]
}, {
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Restriction",
"http://www.w3.org/2002/07/owl#onProperty" : "http://purl.obolibrary.org/obo/BFO_0000054",
"http://www.w3.org/2002/07/owl#someValuesFrom" : {
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Class",
"http://www.w3.org/2002/07/owl#intersectionOf" : [ "http://purl.obolibrary.org/obo/OGMS_0000063", {
"http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : "http://www.w3.org/2002/07/owl#Restriction",
"http://www.w3.org/2002/07/owl#onProperty" : "http://purl.obolibrary.org/obo/BFO_0000051",
"http://www.w3.org/2002/07/owl#someValuesFrom" : "http://purl.obolibrary.org/obo/GO_0008152",
"isObsolete" : "false"
} ]
},
"isObsolete" : "false"
} ]
}
So what we have in our internal data model is an OLS abstraction layer, combined with raw OWL RDF triples, and the backend/frontend use some combination of these (mostly the abstraction layer) to return the API result/display the page.
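To make the two layers concrete, here is a minimal Python sketch that separates the OLS-added fields from the raw RDF predicates in an entity shaped like the dump above, and unwraps reified values. The helper names (`is_iri`, `split_layers`, `plain_values`) are hypothetical and not part of OLS. The sketch assumes only what the dump shows: raw predicates are keyed by full `http://` or `https://` IRIs, and a reified value is an object whose `type` list contains `reification` and which carries the actual value under `value`.

```python
def is_iri(key: str) -> bool:
    """Assumption: raw RDF predicates are keyed by full http(s) IRIs."""
    return key.startswith(("http://", "https://"))


def split_layers(entity: dict) -> tuple[dict, dict]:
    """Return (OLS abstraction-layer fields, raw RDF predicate fields)."""
    ols = {k: v for k, v in entity.items() if not is_iri(k)}
    raw = {k: v for k, v in entity.items() if is_iri(k)}
    return ols, raw


def plain_values(field) -> list:
    """Flatten a field to a list, replacing reified objects by their value."""
    items = field if isinstance(field, list) else [field]
    out = []
    for item in items:
        if isinstance(item, dict) and "reification" in item.get("type", []):
            out.append(item["value"])
        else:
            out.append(item)
    return out


# A cut-down entity using values from the EFO:0000400 dump.
entity = {
    "curie": "EFO:0000400",
    "label": "diabetes mellitus",
    "synonym": [
        {
            "type": ["reification"],
            "value": "DM",
            "axioms": [
                {"http://www.geneontology.org/formats/oboInOwl#hasDbXref": "NCIT:C2985"}
            ],
        },
        "Diabetes",
    ],
    "http://www.w3.org/2000/01/rdf-schema#label": "diabetes mellitus",
}

ols, raw = split_layers(entity)
print(sorted(ols))                      # ['curie', 'label', 'synonym']
print(plain_values(entity["synonym"]))  # ['DM', 'Diabetes']
```

Anything that consumes only the `ols` half never needs to know which RDF predicates the values originally came from.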
Non-OWL ontologies would map to this same abstraction layer where possible. Where we use OWL directly in the API/frontend, we would add to the OLS abstraction layer, so the majority of the API/frontend would be agnostic to whether the ontology was originally OWL or anything else.
So for example with SKOS:
- Instead of mapping skos:Concept to owl:Class, we would just map skos:Concept to an OLS entity with class in its type field: basically the exact same handling we have for owl:Class already!
- Instead of mapping skos:broader or skos:narrower to rdfs:subClassOf, we would just use them to populate the parent and hierarchicalParent properties. Again this is exactly what we do with OWL hierarchical properties.
or with RDFS:
- rdfs:Class maps to an OLS entity with class in its type field
- rdfs:subClassOf is already handled by OLS
I think you are imagining that OLS4 after the dataload uses a lot more OWL2 directly than it actually does. In truth most of the OWL2 properties in the entities are just duplicates of the abstracted OLS ones and could probably be omitted from our data representation without any negative effects (other than making it impossible to query them directly or to reconstruct the original OWL from the OLS entity).
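The SKOS and RDFS mappings listed above could be sketched roughly as follows. This is an illustration in Python, not the OLS dataload code: `build_entities` and the three constant sets are hypothetical names, triples are plain `(subject, predicate, object)` tuples, and only the `type`, `parent` and `hierarchicalParent` fields are populated (real OLS entities carry more, e.g. `entity` in `type`). One assumption goes beyond the text: since skos:narrower is the inverse of skos:broader, the sketch records the parent on the object of a skos:narrower triple.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# All of these source types end up as an OLS entity with "class" in its type field.
CLASS_TYPES = {OWL + "Class", RDFS + "Class", SKOS + "Concept"}
# subject -> parent
PARENT_PREDICATES = {RDFS + "subClassOf", SKOS + "broader"}
# subject -> child (inverse direction; assumption of this sketch)
CHILD_PREDICATES = {SKOS + "narrower"}


def build_entities(triples):
    """Project triples onto a tiny subset of the OLS abstraction layer."""
    entities = defaultdict(
        lambda: {"type": [], "parent": [], "hierarchicalParent": []}
    )

    def add_parent(child, parent):
        entity = entities[child]
        for field in ("parent", "hierarchicalParent"):
            if parent not in entity[field]:
                entity[field].append(parent)

    for s, p, o in triples:
        if p == RDF_TYPE and o in CLASS_TYPES:
            if "class" not in entities[s]["type"]:
                entities[s]["type"].append("class")
        elif p in PARENT_PREDICATES:
            add_parent(s, o)
        elif p in CHILD_PREDICATES:
            add_parent(o, s)
    return dict(entities)


# Made-up example IRIs, for illustration only.
triples = [
    ("ex:animal", RDF_TYPE, SKOS + "Concept"),
    ("ex:dog", RDF_TYPE, SKOS + "Concept"),
    ("ex:dog", SKOS + "broader", "ex:animal"),
    ("ex:animal", SKOS + "narrower", "ex:cat"),
]
entities = build_entities(triples)
print(entities["ex:dog"])
# {'type': ['class'], 'parent': ['ex:animal'], 'hierarchicalParent': ['ex:animal']}
```

The point of the design is that supporting another standard means adding IRIs to the lookup sets, while everything downstream keeps reading the same OLS fields.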
Excellent @udp this clarifies the situation a lot for me. In this case I would 100% recommend a specific parser for SKOS into the internal model, since it does not add another point of failure to the rest of the system! Users will rejoice at first-hand SKOS support!
RDFS is done - it was only a few hours' work:
Loading rdfs, dcterms, etc. will fix the issues with full IRIs being displayed instead of labels for some properties.
This leaves SKOS and schema.org for which I will open separate tickets.
This is crrrrraaayzzzzy!!! Wow! :) coool
It means I misunderstood something I think, I thought we were talking about skos represented vocabularies not the vocabularies themselves (very meta)
269 was about replacing IRIs with their labels. OLS4 does this in the linker; if a property is defined in an ontology, it uses the label of the property instead of the IRI.
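As a rough illustration of that lookup, here is a small Python sketch: build an IRI-to-label index from loaded entities, and fall back to the IRI itself when nothing defines the term. The function names are hypothetical and this is not the actual linker code; it also assumes each entity has a single string `label`, which is a simplification.

```python
def build_label_index(entities):
    """Map each entity IRI to its label; the first definition seen wins."""
    index = {}
    for entity in entities:
        iri = entity.get("iri")
        label = entity.get("label")
        if iri and label:
            index.setdefault(iri, label)
    return index


def display_name(iri, label_index):
    """Use the label when the IRI is defined somewhere, else show the IRI."""
    return label_index.get(iri, iri)


loaded = [
    {"iri": "http://purl.obolibrary.org/obo/IAO_0000115", "label": "definition"},
]
index = build_label_index(loaded)

print(display_name("http://purl.obolibrary.org/obo/IAO_0000115", index))
# definition
print(display_name("http://www.w3.org/2000/01/rdf-schema#domain", index))
# http://www.w3.org/2000/01/rdf-schema#domain
```

The second call shows the failure mode being discussed: a property that no loaded source defines is displayed as a full IRI.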
However, some properties aren't defined by ontologies, e.g. rdfs:domain and rdfs:range. We could have hardcoded mappings to string labels "domain" and "range", but instead I added support to the linker to load arbitrary RDF files (e.g. https://www.w3.org/2000/01/rdf-schema in this case) and import the labels of the classes. This will allow us to add other RDF files if there are other non ontologically defined vocabularies used by the ontologies.
But this is still a hack. I think the real solution would be to load RDF vocabularies just like we load ontologies, as part of the dataload. RDF vocabularies define terms with IRIs, and have a hierarchy using rdfs:subClassOf which could be rendered as a tree of properties - so why not treat them as first class citizens? (e.g. you would actually be able to look at RDFS in OLS and browse its entities.)
Our API currently looks something like:
/ontologies
/ontologies/<id>/classes
/ontologies/<id>/properties
/ontologies/<id>/individuals
For RDF hierarchies that aren't ontologies, we could add a parallel API e.g.:
/vocabularies
/vocabularies/<id>/classes
/vocabularies/<id>/properties
(Not at all sure about the "vocabularies" nomenclature but it's irrelevant at this point)
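To show how small the difference between the two route families is, here is a hypothetical Python sketch that generates both sets of paths from one table. The `ENTITY_KINDS` table and `entity_paths` helper are illustrative only, and the `rdfs` identifier is a made-up example rather than an existing OLS id.

```python
# Which entity collections each top-level route family exposes,
# following the two path lists above.
ENTITY_KINDS = {
    "ontologies": ("classes", "properties", "individuals"),
    "vocabularies": ("classes", "properties"),
}


def entity_paths(collection: str, resource_id: str) -> list[str]:
    """Return the entity listing paths for one ontology or vocabulary."""
    if collection not in ENTITY_KINDS:
        raise ValueError(f"unknown collection: {collection!r}")
    return [
        f"/{collection}/{resource_id}/{kind}"
        for kind in ENTITY_KINDS[collection]
    ]


print(entity_paths("ontologies", "efo"))
# ['/ontologies/efo/classes', '/ontologies/efo/properties', '/ontologies/efo/individuals']
print(entity_paths("vocabularies", "rdfs"))
# ['/vocabularies/rdfs/classes', '/vocabularies/rdfs/properties']
```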
We would even be able to load OWL2 itself, which is defined using RDFS. And also any semantic data models e.g. from LinkML. They're not always strictly ontologies so it's slightly off-brand for OLS, but the lines are already so blurred with many ontologies like DUO just being hierarchical dictionaries, and it would allow OLS to provide a much more complete view of the semantic landscape: currently we can resolve IRIs defined by OWL ontologies, but there are loads of IRIs we can't resolve because they are semantic web IRIs but not ontology IRIs.
The technical implementation would be pretty simple because OLS4 is ultimately an RDF tool anyway. All of the OWL stuff has been implemented over RDF data models and we preserve the RDF in the databases. It would also be a good step towards generifying the dataload to handle other ontology models like SKOS.