Open cmungall opened 6 years ago
Would be nice to reuse the OBO prefixes context: http://obofoundry.org/registry/obo_context.jsonld
As far as I know, while a JSON document can reference multiple contexts, a context can't import another context. Should "single JSON-LD context" mean a single defined set of JSON-LD contexts, or do you want to have the pipeline concatenate a few source contexts into the single JSON-LD context?
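For the concatenation option, a minimal sketch of what the pipeline step could look like is below. The function name `merge_contexts` and the two sample contexts are made up for illustration; the only assumption about the inputs is that each is a JSON-LD document with a flat prefix-to-IRI map under `@context`. It fails loudly when two sources disagree on a prefix rather than silently picking one.

```python
import json


def merge_contexts(*contexts):
    """Merge several JSON-LD context documents into a single context.

    Raises ValueError if two sources map the same prefix to different IRIs.
    """
    merged = {}
    for doc in contexts:
        for prefix, iri in doc.get("@context", {}).items():
            if prefix in merged and merged[prefix] != iri:
                raise ValueError(
                    f"conflicting expansions for {prefix!r}: "
                    f"{merged[prefix]!r} vs {iri!r}"
                )
            merged[prefix] = iri
    return {"@context": merged}


# Illustrative inputs only; real contexts would be loaded from files.
obo = {"@context": {"GO": "http://purl.obolibrary.org/obo/GO_",
                    "RO": "http://purl.obolibrary.org/obo/RO_"}}
go_extra = {"@context": {"UniProtKB": "http://identifiers.org/uniprot/",
                         "GO": "http://purl.obolibrary.org/obo/GO_"}}

single = merge_contexts(obo, go_extra)
print(json.dumps(single, indent=2, sort_keys=True))
```

Raising on conflict is a design choice for this sketch; a real pipeline might instead apply a source priority order.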
I'm adding rdf_uri_prefix to db-xrefs.yaml. Note this will often differ from the web page expansion. Currently these are all obolibrary or identifiers.org.
db-xrefs.yaml is the canonical source metadata for GO. We will generate a json-ld context from this as part of the release. Minerva will use this for expansion/contraction when communicating with Noctua/golr. ontobio will use this when converting GAFs to GO-CAMs. The neo build will use this to expand GPIs to make an OWL file of all the gene products.
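To make the expansion/contraction step concrete, here is a rough sketch, not the actual release code. It assumes each db-xrefs entry is a mapping with a `database` key and an optional `rdf_uri_prefix` key; the entries are written inline as Python dicts to avoid a YAML dependency, and the helper names `build_context`, `expand`, and `contract` are hypothetical.

```python
def build_context(entries):
    """Map each database prefix to its rdf_uri_prefix, skipping entries without one."""
    return {
        e["database"]: e["rdf_uri_prefix"]
        for e in entries
        if e.get("rdf_uri_prefix")
    }


def expand(curie, context):
    """Expand a CURIE such as 'GO:0008150' to a full URI."""
    prefix, _, local_id = curie.partition(":")
    if prefix not in context:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return context[prefix] + local_id


def contract(uri, context):
    """Contract a URI to a CURIE, preferring the longest matching base URI."""
    best = None
    for prefix, base in context.items():
        if uri.startswith(base) and (best is None or len(base) > len(context[best])):
            best = prefix
    if best is None:
        raise ValueError(f"no prefix matches {uri!r}")
    return f"{best}:{uri[len(context[best]):]}"


# Illustrative entries; the field names are assumptions about the YAML layout.
entries = [
    {"database": "GO", "rdf_uri_prefix": "http://purl.obolibrary.org/obo/GO_"},
    {"database": "UniProtKB", "rdf_uri_prefix": "http://identifiers.org/uniprot/"},
    {"database": "PMID"},  # no rdf_uri_prefix, so it is left out of the context
]
context = build_context(entries)

uri = expand("UniProtKB:P34913", context)
print(uri)
print(contract(uri, context))
```

If every consumer (Minerva, ontobio, the neo build) goes through the same generated map, a round trip of `contract(expand(x))` should return `x` everywhere.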
Jim: currently there are only a handful of ontologies in here, and these are just manually synced with obo_context. We have tools in the prefixcommons repo to detect inconsistencies between these.
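For readers unfamiliar with what such a check involves, the sketch below shows the general idea. It is not the prefixcommons tooling; `compare_contexts` is a made-up name and the two prefix maps are illustrative values, not copies of any real context.

```python
def compare_contexts(a, b):
    """Report prefixes that clash between two prefix-to-IRI maps.

    conflicting_iri: same prefix, different IRI
    case_mismatch:   prefixes that differ only in letter case
    """
    report = {"conflicting_iri": [], "case_mismatch": []}

    for prefix in sorted(set(a) & set(b)):
        if a[prefix] != b[prefix]:
            report["conflicting_iri"].append(prefix)

    lower_b = {p.lower(): p for p in b}
    for prefix in sorted(a):
        if prefix in b:
            continue  # exact match exists, so there is no case problem
        other = lower_b.get(prefix.lower())
        if other is not None:
            report["case_mismatch"].append((prefix, other))

    return report


# Illustrative maps only.
go_ctx = {
    "GO": "http://purl.obolibrary.org/obo/GO_",
    "UniProtKB": "http://identifiers.org/uniprot/",
    "MGI": "http://www.informatics.jax.org/accession/MGI:",
}
other_ctx = {
    "GO": "http://purl.obolibrary.org/obo/GO_",
    "UniProtKB": "http://purl.uniprot.org/uniprot/",
    "mgi": "http://www.informatics.jax.org/accession/MGI:",
}

report = compare_contexts(go_ctx, other_ctx)
print(report)
```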
Remaining issues:
Dipper's curie_prefix to base_iri mapping file is:
https://github.com/monarch-initiative/dipper/blob/master/dipper/curie_map.yaml
Monarch app should also use it although I am not sure it does everywhere it could.
curie_map.yaml
could also stand a shakedown for
Thanks Tom!
Summary of where we are in GO
https://github.com/geneontology/go-site/blob/master/metadata/db-xrefs.yaml is the source authority. See https://github.com/geneontology/go-site/pull/620/files
This is used to generate https://github.com/prefixcommons/biocontext/blob/master/registry/go_context.jsonld, but we'll actually publish the jsonld context as part of the GO pipeline.
The prefixcommons repo is a good place to go for getting diffs between any two contexts.
Oh what a mess this is: prefix case differences, conflicting cases for URIs, straight-up prefix hijacking ... I'm sorry, but I cannot take this on right now.
This is a blocking issue for me on the GO-CAM site. For reference:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
#BIND(<http://identifiers.org/uniprot/Q9WTW1> as ?GP) .
BIND(<http://identifiers.org/uniprot/P34913> as ?GP) .
?GP ?pred ?obj .
}
LIMIT 10
Q9WTW1 (rat) returns no information beyond a statement that it is an owl:Class. P34913 (human) returns some information (obo:id, rdfs:label).
Other cases:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
# BIND(<http://identifiers.org/uniprot/A8IV67> as ?gpuri) # has nothing
# BIND(<http://identifiers.org/uniprot/P10499> as ?gpuri) # just has ?obj = owl:Class
# BIND(<http://www.informatics.jax.org/accession/MGI:MGI:1316740> as ?gpuri) # has owl:Class, oboInOwl:id, rdf:type, rdfs:label
BIND(<http://identifiers.org/uniprot/P34913> as ?gpuri) # has possibly all information (dbxref, synonym, label, subclassOf, etc)
?gpuri ?pred ?obj .
}
LIMIT 10
This affects more complex queries (e.g. to get the recommended name of a gene, or its taxon):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX metago: <http://model.geneontology.org/>
PREFIX enabled_by: <http://purl.obolibrary.org/obo/RO_0002333>
PREFIX in_taxon: <http://purl.obolibrary.org/obo/RO_0002162>
SELECT distinct ?identifier ?name ?species
WHERE
{
# GRAPH metago:586fc17a00000705 {
GRAPH metago:581e072c00000295 {
?s enabled_by: ?gpnode .
?gpnode rdf:type ?identifier .
FILTER(?identifier != owl:NamedIndividual) .
}
?identifier rdfs:subClassOf ?v0 .
?identifier rdfs:label ?name .
?v0 owl:onProperty in_taxon: .
?v0 owl:someValuesFrom ?taxon .
?taxon rdfs:label ?species .
}
This query works for the second model but not for the first (xxx705). In the first model, ?identifier refers to a flat class with no rdfs:subClassOf axiom, so ?v0 never binds.
@lpalbou I don't think your problem relates to identifier prefixes. Q9WTW1 is simply not in NEO at all.
@cmungall @balhoff I believe that this is clear now?
@cmungall thanks!
Parts of our stack (amigo, noctua-js, GAFs, etc) use CURIEs/IDs as currency. Other parts (minerva, go-rdf, ontology) use URIs.
The expansion/contraction rules are not well defined.
We should have a single json-ld context file we use across the GO.
Furthermore, the contents of this should be as predictable as possible. For example: obolibrary for all ontologies, purl.uniprot for all uniprot entries, and something like id.org for everything else. This will require a one-time change to Noctua models.
Previous tickets: