andredekker / BigMachine


Cancer MRN RDF graph to Token RDF graph #5

Open andredekker opened 8 years ago

andredekker commented 8 years ago

As a cancer data administrator, I need to convert my MRN RDF graph to a token RDF graph and make it available to UHN users, so that the data is de-identified and I am allowed to give a researcher access to it.

PREFIX roo:<http://www.cancerdata.org/roo/>
PREFIX ncit:<http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>

INSERT {?token a ncit:C16960. } WHERE 
{
    SERVICE <http://localhost:9999/bigdata/namespace/MRN/sparql> 
            {
              _:cardiacPatient roo:100042 ?mrn.
            }
    SERVICE <http://localhost:9999/bigdata/namespace/MRN2Token/sparql> 
            {
              _:mrnToken roo:100042 ?mrn.
              ?token roo:100318 _:mrnToken.            
            }
.}
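
As a rough mental model of what the query above does: the two SERVICE blocks are joined on ?mrn, so a token is produced only when its MRN also occurs in the MRN graph. The sketch below mimics that join in plain Python. The dictionaries, MRN values and token IRIs are all made-up stand-ins, not data from either endpoint.

```python
# Hypothetical in-memory stand-ins for the two SPARQL services.
# MRN graph: patient node -> MRN literal (the roo:100042 value).
mrn_graph = {"patientA": "1001", "patientB": "1002", "patientC": "1003"}

# MRN2Token graph: token IRI -> MRN literal reached via roo:100318 / roo:100042.
mrn2token_graph = {
    "http://example.org/token/t1": "1001",
    "http://example.org/token/t2": "1003",
    "http://example.org/token/t9": "9999",  # MRN not present in the MRN graph
}


def tokens_for_known_patients(mrn_graph, mrn2token_graph):
    """Return token IRIs whose MRN also occurs in the MRN graph (inner join on the MRN)."""
    known_mrns = set(mrn_graph.values())
    return sorted(
        token for token, mrn in mrn2token_graph.items() if mrn in known_mrns
    )


tokens = tokens_for_known_patients(mrn_graph, mrn2token_graph)
print(tokens)
```

Only the token IRIs leave the join; the MRNs themselves never appear in the result, which is the point of the de-identification step.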
ghost commented 8 years ago

PREFIX roo: <http://www.cancerdata.org/roo/>
PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>
PREFIX cip: <http://www.uhn.ca/cip-ontology#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

CONSTRUCT {?token a ncit:C16960. } WHERE
{
    SERVICE <http://localhost:9999/blazegraph/namespace/CipOutcomes/sparql>
            {
              ?patient a cip:Patient.
              ?patient cip:mrn ?mrn.
              # Re-type the MRN as an xsd:string literal
              BIND(STRDT(?mrn, xsd:string) AS ?mrnFormalString)
            }
    SERVICE <http://localhost:9999/blazegraph/namespace/mrn2token/sparql>
            {
              _:mrnToken roo:100042 ?mrnFormalString.
              ?token roo:100318 _:mrnToken.
            }
.}

andredekker commented 8 years ago

Successful, although Blazegraph showed some weird behaviour. Blazegraph also does not allow authorization, so we ended up putting the CIPOutcomes triples (from a QA graph) in Blazegraph rather than using Jena. In the end we managed to insert 67 patients into the local Blazegraph. We still need to post the triples to an outside (e.g. RIS) SPARQL endpoint.
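
For the remaining step of posting the triples to an outside endpoint, one possible approach is a SPARQL 1.1 `INSERT DATA` update sent as a direct POST with the `application/sparql-update` content type. The sketch below only builds the update string and the request. The endpoint URL and token IRI are placeholders, and whether the target (e.g. RIS) endpoint accepts updates this way, or needs authentication, is an assumption that would have to be checked.

```python
import urllib.request

NCIT = "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#"


def build_insert_data(token_iris):
    """Build an INSERT DATA update that types each token IRI as ncit:C16960."""
    for iri in token_iris:
        # Minimal sanity check so a malformed IRI cannot break out of <...>.
        if any(ch in iri for ch in '<>" \n\t'):
            raise ValueError(f"unsafe IRI: {iri!r}")
    triples = "\n".join(f"  <{iri}> a ncit:C16960 ." for iri in token_iris)
    return f"PREFIX ncit: <{NCIT}>\nINSERT DATA {{\n{triples}\n}}"


def build_update_request(endpoint_url, update):
    """Prepare (but do not send) a direct-POST SPARQL update request."""
    return urllib.request.Request(
        endpoint_url,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


# Placeholder values; replace with the real token IRIs and endpoint.
update = build_insert_data(["http://example.org/token/t1"])
request = build_update_request("http://example.org/sparql", update)
print(update)
# urllib.request.urlopen(request)  # uncomment to actually send the update
```

Building the request separately from sending it makes it easy to inspect the update text before anything leaves the local machine.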