OpenBEL / resource-generator

Python modules to generate BEL resource documents.
Apache License 2.0

incorporate tests for rdf output #46

Open ncatlett opened 10 years ago

ncatlett commented 10 years ago

Equivalence and orthology relationships may point to 'orphan' URIs, i.e., URIs that appear only as the object of an equivalence/orthology relationship and do not otherwise exist in the graph. We need to incorporate tests and/or fixes for these.

Return # of 'orphan' uris created by equivalence relationships:

select (count(distinct ?uri2) as ?count)
where {
  ?uri1 skos:exactMatch ?uri2 .
  minus { ?uri2 skos:inScheme ?scheme . }
}

Return # of 'orphan' uris created by orthology relationships:

select (count(distinct ?uri2) as ?count)
where {
  ?uri1 belv:orthologousMatch ?uri2 .
  minus { ?uri2 skos:inScheme ?scheme . }
}

abargnesi commented 10 years ago

rdflib provides SPARQL query and update against its own graph model. To issue SPARQL calls to a remote triplestore we can use RDFLib/sparqlwrapper.

ncatlett commented 10 years ago

Most of the orphan equivalences (~18,000) are from EntrezGene; these appear to be primarily mappings to the MGI feature types "DNA segment" and "complex/cluster/region" (currently only Gene and Pseudogene are part of the namespace).

~700 are from Affymetrix mappings; these are references to withdrawn EGIDs.

Other orphan equivalences come from:

  1. equivalence mappings that use an alt_id (these could be fixed if identified)
  2. equivalence mappings that use a withdrawn/obsolete term

ncatlett commented 10 years ago

Also, test for identifiers and prefLabels that are not unique within a concept scheme/namespace:

prefix belv: <http://www.openbel.org/vocabulary/>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix namespace: <http://www.openbel.org/bel/namespace/>
prefix dc: <http://purl.org/dc/terms/>

select (count(distinct ?uri2) as ?count)
where {
  ?uri1 dc:identifier ?id1 .
  ?uri2 dc:identifier ?id1 .
  ?uri1 skos:inScheme ?scheme .
  ?uri2 skos:inScheme ?scheme .
  FILTER (?uri1 != ?uri2) .
}

select (count(distinct ?uri2) as ?count)
where {
  ?uri1 skos:prefLabel ?label .
  ?uri2 skos:prefLabel ?label .
  ?uri1 skos:inScheme ?scheme .
  ?uri2 skos:inScheme ?scheme .
  FILTER (?uri1 != ?uri2) .
}