bio2rdf / bio2rdf-scripts

Scripts that Bio2RDF users have created to generate RDF versions of scientific datasets
http://bio2rdf.org/

ClinicalTrials.gov - missing adverse event results #373

Closed rkboyce closed 10 years ago

rkboyce commented 10 years ago

As part of OHDSI (http://ohdsi.org/), an open collaborative, we are looking at various data sources that would be helpful for identifying drugs without adverse events. ClinicalTrials.gov is potentially a very useful source. We explain how to get adverse events from the XML of a CT.gov entry here: http://bit.ly/1iPhq9W

I tried to find the same information in bio2rdf ClinicalTrials.gov but was unable to. There does not seem to be an "event" resource anywhere in the graph, though there are resources that represent outcomes (primary and secondary) and note whether the outcomes are safety related. When you can spare a minute, would you please check whether the scripts you use to create the resource are failing to load the "event" resources from the XML?

Thank you, -Rich Boyce

micheldumontier commented 10 years ago

Hi Rich, Ok. Looking at the script [1], we don't currently process "reported events", "serious events", and "other events". We'll need to add support for doing this.

[1] https://github.com/bio2rdf/bio2rdf-scripts/blob/release3/clinicaltrials/clinicaltrials.php
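
For readers who want to see what these three sections contain, here is a minimal Python sketch that flattens them into one row per event and group. It is not taken from clinicaltrials.php. The element and attribute names (`reported_events`, `serious_events`, `other_events`, `group_list`, `counts`, `group_id`, `subjects_affected`, `subjects_at_risk`) are assumptions modelled on the terms used in this thread, and the sample XML is hand-written for illustration rather than copied from a real trial.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names and numbers are illustrative only.
SAMPLE_XML = """
<clinical_study>
  <clinical_results>
    <reported_events>
      <group_list>
        <group group_id="E1"><title>Drug arm</title></group>
        <group group_id="E2"><title>Placebo arm</title></group>
      </group_list>
      <serious_events>
        <category_list>
          <category>
            <title>Cardiac disorders</title>
            <event_list>
              <event>
                <sub_title>Myocardial infarction</sub_title>
                <counts group_id="E1" subjects_affected="2" subjects_at_risk="100"/>
                <counts group_id="E2" subjects_affected="1" subjects_at_risk="98"/>
              </event>
            </event_list>
          </category>
        </category_list>
      </serious_events>
      <other_events>
        <category_list>
          <category>
            <title>Nervous system disorders</title>
            <event_list>
              <event>
                <sub_title>Headache</sub_title>
                <counts group_id="E1" subjects_affected="10" subjects_at_risk="100"/>
                <counts group_id="E2" subjects_affected="8" subjects_at_risk="98"/>
              </event>
            </event_list>
          </category>
        </category_list>
      </other_events>
    </reported_events>
  </clinical_results>
</clinical_study>
"""


def extract_events(xml_text):
    """Return one dict per (event, group) pair found under reported_events."""
    root = ET.fromstring(xml_text.strip())
    reported = root.find(".//reported_events")
    if reported is None:
        return []  # trial has no posted adverse event results

    # Map group ids such as "E1" to their human-readable arm titles.
    groups = {
        g.get("group_id"): g.findtext("title", default="")
        for g in reported.findall("./group_list/group")
    }

    rows = []
    for section in ("serious_events", "other_events"):
        for category in reported.findall(f"./{section}/category_list/category"):
            category_title = category.findtext("title", default="")
            for event in category.findall("./event_list/event"):
                event_name = event.findtext("sub_title", default="")
                for counts in event.findall("counts"):
                    group_id = counts.get("group_id")
                    rows.append({
                        "section": section,
                        "category": category_title,
                        "event": event_name,
                        "group_id": group_id,
                        "group_title": groups.get(group_id, ""),
                        "subjects_affected": int(counts.get("subjects_affected", "0")),
                        "subjects_at_risk": int(counts.get("subjects_at_risk", "0")),
                    })
    return rows


if __name__ == "__main__":
    for row in extract_events(SAMPLE_XML):
        print(row)
```

Each row keeps the group id next to the counts, which is the same link between a trial arm and its events that an RDF version would need to expose.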

rkboyce commented 10 years ago

Thank you Michel. I can see that the events are now present. It's not clear to me, though, how to query them. For example, how would we get from the primary set of resources for a trial (say http://bit.ly/1hYfICi) to the list of adverse events for each arm? I can see the list of adverse events by different group names (e.g., http://bit.ly/1mONyyQ for 'E4') but I don't see an explicit predicate connecting the trial data to those lists that could be used in a SPARQL query.

Would it be possible to make queries like the following possible (non-supported predicates starred)?

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials:>
PREFIX ctv: <http://bio2rdf.org/clinicaltrials_vocabulary:>

SELECT ?trialURI ?completionDate ?interventionURI ?interventionLabel ?conditionURI ?conditionLabel
WHERE {
  ?interventionURI dct:title "ranibizumab"@en.

  ?trialURI a ctv:Clinical-Study;
     ctv:intervention ?interventionURI;
     rdfs:label ?interventionLabel;
     ctv:condition ?conditionURI; 
     ctv:completion-date ?completionDate; 
     *ctv:serious_event_group* ?seGrp .

  ?se a *ctv:Serious-Event*;
     ctv:group ?seGrp;
     rdfs:label ?seLabel;
     ctv:subjects-affected ?seAff;
     ctv:subjects-at-risk ?seAtRisk .
}

Something similar would be likely useful for the "other events" too.

vojtechhuser commented 10 years ago

I looked at Rich's queries, and perhaps we will need to extend the RDF-producing scripts even more to make that connection (so yet another version of the CT.gov RDF).

micheldumontier commented 10 years ago

Quite right! I fixed the script, and now you should be able to get the link between the trial, the event group, and the events.

PREFIX rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
PREFIX rdfs: http://www.w3.org/2000/01/rdf-schema#
PREFIX dc: http://purl.org/dc/elements/1.1/
PREFIX dct: http://purl.org/dc/terms/
PREFIX ct: http://bio2rdf.org/clinicaltrials:
PREFIX ctv: http://bio2rdf.org/clinicaltrials_vocabulary:

SELECT * #?trialURI ?completionDate ?interventionURI ?interventionLabel ?conditionURI ?conditionLabel
WHERE {
  ?interventionURI dct:title "ranibizumab"@en.

  ?trialURI a ctv:Clinical-Study;
     ctv:intervention ?interventionURI;
     rdfs:label ?interventionLabel;
     ctv:condition ?conditionURI;
     ctv:completion-date ?completionDate;
     ctv:event-group ?seGrp .

  ?se a ?type ;
     rdfs:label ?sel;
     ctv:group ?seGrp;
     ctv:subjects-affected ?seAff;
     ctv:subjects-at-risk ?seAtRisk .

  ?type rdfs:label ?type_label .
}

rkboyce commented 10 years ago

That looks like the change we needed Michel. Thanks for being so responsive. We will do some more testing and begin to work out a mapping of the drugs and interventions to RxNorm and SNOMED.

Vojtech, if you get an error running the query posted in Michel's reply, you might try replacing the PREFIX statements with the following:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX ct: <http://bio2rdf.org/clinicaltrials:>
PREFIX ctv: <http://bio2rdf.org/clinicaltrials_vocabulary:>
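
If a query has been pasted with the angle brackets stripped from its PREFIX lines, the replacement can also be done mechanically. The following is a small Python sketch, not part of the bio2rdf scripts: the function name `bracket_prefixes` is hypothetical, and it assumes each bare IRI contains no whitespace.

```python
import re

# Matches "PREFIX name: iri" where the IRI is not already wrapped in <...>.
_BARE_PREFIX = re.compile(r"\bPREFIX\s+(\w*:)\s+(?!<)(\S+)")


def bracket_prefixes(query):
    """Wrap bare IRIs in SPARQL PREFIX declarations in angle brackets."""
    return _BARE_PREFIX.sub(r"PREFIX \1 <\2>", query)


if __name__ == "__main__":
    pasted = (
        "PREFIX dct: http://purl.org/dc/terms/ "
        "PREFIX ctv: http://bio2rdf.org/clinicaltrials_vocabulary:"
    )
    print(bracket_prefixes(pasted))
```

Declarations that already have angle brackets are left unchanged, so the function is safe to run on a query that is only partly damaged.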