openphacts / GLOBAL

Global project issues [private for now. owner lee harland]

Data Source Query does not show updates in 2.2 #396

Open nicklynch opened 6 years ago

nicklynch commented 6 years ago

The recent data updates in 2.2 are not shown in the data returned from the Data Source query:

http://alpha.openphacts.org:3002/2.2/sources

It shows ChEMBL 20 and older WikiPathways data.

Have the VoID files been updated to show this new information?

ianwdunlop commented 6 years ago

The VoID files might be the old ones for ChEMBL 20 (chembl_20.0_void.ttl) and WikiPathways (voidWp.ttl). If new VoID files exist, the graph can be cleared and reloaded with the new data.

<http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#> a void:DatasetDescription ;
  dcterms:title "ChEMBL-RDF VoID Description" ;
  dcterms:description "This is the VoID description for a ChEMBL-RDF dataset" ;
  dcterms:issued "2015-01-14T00:00:00.000Z"^^xsd:dateTime ;
  pav:createdBy <http://orcid.org/0000-0002-8011-0300> ;
  pav:createdOn "2009-10-28T00:00:00.000Z"^^xsd:dateTime ;
  pav:lastUpdateOn "2015-01-14T00:00:00.000Z"^^xsd:dateTime ;
  foaf:primaryTopic :chembl_rdf_dataset ;
  pav:previousVersion <http://rdf.ebi.ac.uk/dataset/chembl/19.0/void.ttl#> .

<http://rdf.wikipathways.org/> a void:DatasetDescription ;
  dcterms:title "A VoID Description of the WikiPathways sets"@en ;
  dcterms:description "VoID description for the curated and Reactome WikiPathways VoID set based upon the wp vocabulary."@en ;
  pav:createdBy <https://jenkins.bigcat.unimaas.nl/job/GPML%20to%20GPML%20+%20WP%20RDF/> ;
  pav:createdOn "2015-11-18T15:54:28.683Z"^^xsd:dateTime ;
  dcterms:issued "2015-11-18T15:54:00Z"^^xsd:dateTime ;
  pav:lastUpdateOn "2016-05-13T08:13:00Z"^^xsd:dateTime ;
  pav:previousVersion <http://rdf.wikipathways.org/release20151118/> ;
  pav:createdWith <https://jenkins.bigcat.unimaas.nl/job/GPML%20to%20GPML%20+%20WP%20RDF/> ;
  foaf:primaryTopic <http://rdf.wikipathways.org/release20151118_2/> .

Here is the full list:

http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphacts-chebi
http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphacts-chembl
http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphacts-drugbank
http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphacts-human_metabolome_database
http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphacts-mesh
http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphacts-pdb
http://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/drugbank_4.1_void.ttl#drugbank-rdf
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_patent_class_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_bio_patfieldasso_dataset
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/uniprot_2015_11_void.ttl#uniprotkb_rdf
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_chem_patfieldasso_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_patent_title_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_indication_dataset
http://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/chebi_125_void.ttl#chebi
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_patent_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_chem_patasso_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_target_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_bio_patasso_dataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_molecule_dataset
http://rdf.disgenet.org/v2.1.0/void.ttl#disgenetrdf
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_dataset
http://ops.rsc.org/download/20151104/void_2015-11-04.ttl#openphactsDataset
http://rdf.ebi.ac.uk/dataset/surechembl/1.1/void.ttl#surechembl_rdf_dataset
http://rdf.wikipathways.org/release20151118_2/
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/cw_void_2013-12-12.ttl#CW_RDF
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/goa_void_2015-02-17.ttl#GOA2RDF
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/caloha_void_2014_01.ttl#CALOHA2RDF
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/nx_void_2014_02.ttl#nextprot_RDF
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/go_void_2015-03-04.ttl#GO_RDF
http://rdf.disgenet.org/v2.1.0/void.ttl#disease
http://aers.data2semantics.org/void.ttl#aers-ld
http://rdf.disgenet.org/v2.1.0/void.ttl#gene
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_assay_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_molhierarchy_dataset
http://rdf.disgenet.org/v2.1.0/void.ttl#umlsSTY
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_targetcmpt_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_document_dataset
http://rdf.disgenet.org/v2.1.0/void.ttl#pantherClass
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_moa_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_protclass_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_target_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_biocmpt_dataset
http://rdf.disgenet.org/v2.1.0/void.ttl#diseaseClass
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_molecule_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_activity_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_target_relation_dataset
http://rdf.disgenet.org/v2.1.0/void.ttl#pathway
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_cell_line_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_unichem_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_bindingsite_dataset
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_source_dataset
http://rdf.disgenet.org/v2.1.0/void.ttl#geneDiseaseAssociation
http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_journal_dataset

DS1

DS2

https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/uniprot_2015_11_void.ttl#uniprotDataset
https://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/uniprot_2015_11_void.ttl#uniprot_enzyme
http://rdf.wikipathways.org/release20151118_2/gpml
http://rdf.wikipathways.org/release20151118_2/wp
http://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/drugbank_4.1_void.ttl#db-targets
http://raw.githubusercontent.com/openphacts/ops-platform-setup/2.0.0/void/drugbank_4.1_void.ttl#db-drugs

The list was obtained with this SPARQL query:

PREFIX void: <http://rdfs.org/ns/void#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX dcat: <http://www.w3.org/ns/dcat#>
SELECT DISTINCT ?dataset WHERE {
  GRAPH <http://www.openphacts.org/api/datasetDescriptors> {
    ?dataset a void:Dataset .
    ?description foaf:primaryTopic ?top_dataset .
    OPTIONAL { ?dataset dcterms:title ?title . }
    OPTIONAL { ?dataset void:subset ?subset . }
    OPTIONAL { ?dataset dcterms:description ?dctDescription . }
    OPTIONAL { ?dataset dcterms:license ?license . }
    OPTIONAL {
      { ?dataset prov:wasDerivedFrom ?derivedFrom . }
      UNION { ?dataset prov:hadPrimarySource ?primarySource . }
      UNION { ?dataset prov:wasQuotedFrom ?quotedFrom . }
      UNION { ?dataset prov:wasRevisionOf ?revisionOf . }
    }
    OPTIONAL { ?dataset void:dataDump ?dataDump . }
    OPTIONAL { ?dataset void:triples ?tripleNo . }
    OPTIONAL {
      { ?dataset foaf:homepage ?homepage . }
      UNION { ?dataset dcat:landingPage ?landingPage . }
    }
  }
}
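
To see at a glance which VoID document versions are currently loaded, the dataset URIs returned by a query like this can be grouped by the document they come from (the URI with its #fragment removed). This is only a rough sketch: the function name `count_by_void_document` is made up for illustration, and the sample holds just a few URIs copied from the list in this comment.

```python
from collections import Counter
from urllib.parse import urldefrag


def count_by_void_document(dataset_uris):
    """Count dataset URIs per VoID document (the URI with its #fragment removed)."""
    return Counter(urldefrag(uri).url for uri in dataset_uris)


# A few URIs copied from the list in this comment.
sample = [
    "http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_dataset",
    "http://rdf.ebi.ac.uk/dataset/chembl/20.0/void.ttl#chembl_rdf_assay_dataset",
    "http://rdf.disgenet.org/v2.1.0/void.ttl#gene",
    "http://rdf.wikipathways.org/release20151118_2/wp",
]

for void_doc, n in sorted(count_by_void_document(sample).items()):
    print(n, void_doc)
```

Any document path that still contains an old version number (for example chembl/20.0) points at a VoID file that has not been replaced.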
randykerber commented 6 years ago

We should first see if we can tell where this command is really getting its info from.

Now that this has come up: I vaguely recall seeing the data sources output a while back and, after scanning it, realizing that it wasn't even remotely correct, and that it was coming from someplace unexpected. From some old cached file somewhere, or perhaps even something to do with ConceptWiki.

AlasdairGray commented 6 years ago

It should really query the triplestore for the VoID information
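
A minimal sketch of what querying the triplestore directly could look like, using the standard SPARQL 1.1 Protocol over HTTP. The named graph is the one used in the query earlier in this thread; the endpoint URL and the helper names `build_request` and `dataset_rows` are assumptions for illustration, not part of the Open PHACTS code.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

GRAPH = "http://www.openphacts.org/api/datasetDescriptors"  # named graph from this thread
ENDPOINT = "http://localhost:8890/sparql"  # ASSUMPTION: hypothetical endpoint URL

QUERY = f"""
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?dataset ?title WHERE {{
  GRAPH <{GRAPH}> {{
    ?dataset a void:Dataset .
    OPTIONAL {{ ?dataset dcterms:title ?title . }}
  }}
}}
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def dataset_rows(results_json):
    """Extract (dataset URI, title or None) pairs from SPARQL JSON results."""
    rows = []
    for b in json.loads(results_json)["results"]["bindings"]:
        title = b["title"]["value"] if "title" in b else None
        rows.append((b["dataset"]["value"], title))
    return rows


# To run against a live endpoint (not executed here):
#   from urllib.request import urlopen
#   with urlopen(build_request(ENDPOINT, QUERY)) as resp:
#       print(dataset_rows(resp.read().decode("utf-8")))
```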

stain commented 6 years ago

Yes, it’s from the http://www.openphacts.org/api/datasetDescriptors graph loaded separately from the void/ folders.

If you update ChEMBL, it comes with its own VoID file that you have to add to that graph (but first clear the old graph and reload from void/ to avoid a double ChEMBL).
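
As a rough sketch, the clear-then-reload sequence could be expressed as SPARQL 1.1 Update statements (`CLEAR GRAPH`, then one `LOAD ... INTO GRAPH` per VoID file). The file locations below are hypothetical placeholders, and whether the store accepts `LOAD` from those locations is an assumption; a bulk loader may be needed instead.

```python
GRAPH = "http://www.openphacts.org/api/datasetDescriptors"


def reload_statements(graph, void_urls):
    """Return SPARQL 1.1 Update statements: clear the graph, then load each VoID file into it."""
    statements = [f"CLEAR SILENT GRAPH <{graph}>"]
    statements += [f"LOAD <{url}> INTO GRAPH <{graph}>" for url in void_urls]
    return statements


# ASSUMPTION: hypothetical file locations; substitute the real contents of void/
# plus the VoID file shipped with the new ChEMBL release.
void_files = [
    "file:///data/void/voidWp.ttl",
    "file:///data/void/chembl_new_void.ttl",
]

# Statements in one update request are separated by ";".
print(" ;\n".join(reload_statements(GRAPH, void_files)))
```

Clearing first is what avoids ending up with two ChEMBL descriptions in the same graph.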

-- Stian Soiland-Reyes, eScience Lab School of Computer Science, The University of Manchester http://orcid.org/0000-0001-9842-9718


ianwdunlop commented 6 years ago

The data, including the VoID file, is available via FTP, e.g. ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBL-RDF/23.0/

randykerber commented 6 years ago

The http://www.openphacts.org/api/datasetDescriptors graph is loaded from the void.tar file here: https://data.openphacts.org/free/2.1/rdf/

But I've not found any script for creating or loading that.

@stain -- Do you recall how the contents of that graph were created, is there a script? I didn't find anything in the ops-platform-setup project/repo.