openphacts / GLOBAL

Global project issues [private for now. owner lee harland]

pathway_organism parameter in pathway calls gives internal error (500) #183

Closed danidi closed 9 years ago

danidi commented 9 years ago

The pathways for target/compound/publication API calls return error code 500 when the pathway_organism parameter is defined (e.g. with Homo sapiens). It works fine in the general "Pathways: List" call.

stain commented 9 years ago

(my notes):

Example of 500 error: https://beta.openphacts.org/1.4/pathways/byReference?uri=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F9789062&app_id=XXXX&app_key=XXXX&pathway_organism=Homosapiens

gives:

Puelia: an implementation of the Linked Data API Internal Server Error Sorry, there was an internal error in serving this request, possibly due to an upstream server, or a configuration error.

It happens no matter which pathway_organism is specified (i.e. it is probably not due to escaping of the space in Homo%20sapiens).
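For reference, a minimal Python sketch of the two usual ways to encode the space in the organism label. Both decode to the same value, which fits with escaping not being the cause. The base URL and parameter names come from the example above; `build_url` and `organism_of` are hypothetical helpers, and the `app_id`/`app_key` parameters are left out.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode, urlsplit

BASE = "https://beta.openphacts.org/1.4/pathways/byReference"

def build_url(uri, organism, plus_for_space=False):
    """Build a byReference URL (hypothetical helper, for illustration only)."""
    # quote() encodes a space as %20, quote_plus() encodes it as +
    quote_via = quote_plus if plus_for_space else quote
    query = urlencode(
        {"uri": uri, "pathway_organism": organism},
        quote_via=quote_via,
        safe="",
    )
    return f"{BASE}?{query}"

def organism_of(url):
    """Decode the pathway_organism value back out of a URL."""
    return parse_qs(urlsplit(url).query)["pathway_organism"][0]

pubmed = "http://identifiers.org/pubmed/9789062"
percent = build_url(pubmed, "Homo sapiens")
plus = build_url(pubmed, "Homo sapiens", plus_for_space=True)
```

Here `percent` ends in `Homo%20sapiens` and `plus` ends in `Homo+sapiens`; `organism_of` returns "Homo sapiens" for both.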

Testing on ops2, I get a 404 for any pathway_organism specified:

http://ops2.few.vu.nl/pathways/byReference?pathway_organism=Homo%20sapiens&uri=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F9789062

Yet of course this should work, as "Homo sapiens" is listed as the pathway_organism in http://ops2.few.vu.nl/pathways/byReference?uri=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F9789062

<?xml version="1.0" ?>
<result format="linked-data-api" href="http://ops2.few.vu.nl/pathways/byReference?uri=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F9789062&_page=1" version="1.4">
  <label>Pathways for Publication: List</label>
  <first href="http://ops2.few.vu.nl/pathways/byReference?uri=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F9789062&_page=1"/>
  <type href="http://purl.org/linked-data/api/vocab#Page"/>
  <items>
    <item href="http://rdf.wikipathways.org/Pathway/WP1533_r49537">
      <page href="http://www.wikipathways.org/instance/WP1533_r49537"/>
      <pathwayOntology>
        <item href="http://purl.obolibrary.org/obo/DOID_13381"/>
        <item href="http://purl.obolibrary.org/obo/PW_0000397"/>
      </pathwayOntology>
      <pathway_organism href="http://purl.obolibrary.org/obo/NCBITaxon_9606">
        <label datatype="string">Homo sapiens</label>
      </pathway_organism>

stain commented 9 years ago

(Further notes):

Seems to be caused by filtering on the wrong variable:

wp:organism api:name "pathway_organism" ;
        api:label "pathway_organism" ;
        api:value "The rdfs:label for the pathway organism (URL encode). e.g.: Homo sapiens." ;
        api:filterVariable "?item";
        a rdf:Property .

Yet filtering on ?organism_uri (or ?organism) doesn't work. Is it because the organism label is typed as xsd:string? (In the SPARQL query I have to add ^^xsd:string to get results.)

Edit: Still learning this.. my apologies. The filterVariable was correct as it is the ?item that has wp:organism. Still investigating..
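On the datatype question above: in RDF 1.0 a plain literal and the same text typed as xsd:string are different terms, so a store that keeps them distinct only matches the form that was actually loaded. A minimal Python sketch of rendering either form for a query follows. `sparql_literal` is a hypothetical helper, not part of the API code, and it only escapes backslashes and double quotes. Whether a given store treats the two forms as equal is an assumption to verify.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

def sparql_literal(value, datatype=None):
    """Render a Python string as a SPARQL literal, optionally with a datatype."""
    # Escape backslashes first, then double quotes
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    literal = f'"{escaped}"'
    if datatype is not None:
        literal += f"^^<{datatype}>"
    return literal

plain = sparql_literal("Homo sapiens")
typed = sparql_literal("Homo sapiens", XSD_STRING)
```

`plain` is the untyped form and `typed` carries the full xsd:string datatype IRI, which is equivalent to writing `^^xsd:string` with the usual prefix declared.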

stain commented 9 years ago

Error log is not happy if pathway_organism is included:

[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): Entity: line 206: parser error : Opening and ending tag mismatch: HR line 206 and body in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): logs.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.22</h3></body> in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string():                                                                                ^ in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): Entity: line 206: parser error : Opening and ending tag mismatch: HR line 1 and html in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.22</h3></body></html> in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string():                                                                                ^ in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): Entity: line 206: parser error : Premature end of data in tag body line 1 in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.22</h3></body></html> in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string():                                                                                ^ in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): Entity: line 206: parser error : Premature end of data in tag html line 1 in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string(): u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.22</h3></body></html> in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Warning:  simplexml_load_string():                                                                                ^ in /var/www/html/ops_ims.class.php on line 176
[Mon Dec 01 14:14:57 2014] [error] [client 84.92.48.26] PHP Notice:  Trying to get property of non-object in /var/www/html/ops_ims.class.php on line 176
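
The warnings suggest that an Apache Tomcat HTML error page was handed to the XML parser. A minimal sketch of the defensive pattern, in Python rather than the PHP of ops_ims.class.php: parse the upstream body and fail cleanly if it is not well-formed XML. The function name is hypothetical, and the `expandedQuery` lookup is an assumption based on the expander response shown elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

def extract_expanded_query(body):
    """Return the text of <expandedQuery>, or None if the body is unusable."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # For example an HTML error page with unclosed tags such as <HR>
        return None
    node = root.find("expandedQuery")
    return node.text if node is not None else None

ok = "<URL><expandedQuery>SELECT * WHERE { ?s ?p ?o }</expandedQuery></URL>"
bad = (
    '<html><body><HR size="1" noshade="noshade">'
    "<h3>Apache Tomcat/7.0.22</h3></body></html>"
)
```

With this guard, `extract_expanded_query(bad)` returns None instead of triggering a later "property of non-object" style failure.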

stain commented 9 years ago

Seems we get a 500 error from the IMS.

http://openphacts.cs.man.ac.uk:9091/QueryExpander/expandXML?query=PREFIX+wp%3A+%3Chttp%3A%2F%2Fvocabularies.wikipathways.org%2Fwp%23%3E%0APREFIX+dc%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0ASELECT+DISTINCT+%3Fitem++WHERE+{GRAPH+%3Chttp%3A%2F%2Fwww.wikipathways.org%3E+{+{+%3Fitem%3Chttp%3A%2F%2Fvocabularies.wikipathways.org%2Fwp%23organism%3E+%22Homo+Sapiens%22.+*%23*+%0A%09%3Fitem+a+wp%3APathway+%3B%0A%09%09dc%3Atitle+%3Ftitle+%3B+%0A%09%09wp%3Aorganism+%3Forganism_uri+%3B+%0A%09%09foaf%3Apage+%3Fpage+%3B%0A%09%09dc%3Aidentifier+%3Fidentifier+.+%0A++++++++%3Fpw_uri+dcterms%3AisPartOf+%3Fitem+%3B%0A++++++++++++++++a+wp%3APublicationReference+.%0A%09%3Forganism_uri+rdfs%3Alabel+%3Forganism+.+%0A%09OPTIONAL+{+%3Fitem+dcterms%3Adescription+%3Fdescription+}%0A%09OPTIONAL+{+%3Fitem+wp%3ApathwayOntology+%3Fontology+}%0A}+}+ORDER+BY+%3Fitem++LIMIT+10+OFFSET+0&parameter=%3Fpw_uri&lensUri=Default&inputURI=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F11967526
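
To read the SPARQL inside a URL like this one, the `query` parameter can be URL-decoded. A minimal Python sketch using a short fragment copied from the URL above (the rest of the query is omitted for brevity):

```python
from urllib.parse import unquote_plus

# Fragment of the encoded query parameter, copied from the failing request
fragment = (
    "%3Fitem%3Chttp%3A%2F%2Fvocabularies.wikipathways.org%2Fwp%23organism%3E"
    "+%22Homo+Sapiens%22.+*%23*"
)

# unquote_plus turns + into a space and %XX escapes into characters
decoded = unquote_plus(fragment)
print(decoded)
```

Decoding makes unusual tokens in the generated query easier to spot than in the percent-encoded form.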

If I leave out the organism filter (the pathway_organism value), the query expands fine instead:

http://openphacts.cs.man.ac.uk:9091/QueryExpander/expandXML?query=PREFIX+dc%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0APREFIX+wp%3A+%3Chttp%3A%2F%2Fvocabularies.wikipathways.org%2Fwp%23%3E%0APREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0APREFIX+dcterms%3A+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2F%3E%0APREFIX+void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0APREFIX+rdfs%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0ACONSTRUCT+{+%3Fitem+dc%3Atitle+%3Ftitle+%3B+%0A%09wp%3Aorganism+%3Forganism_uri+%3B+%0A%09foaf%3Apage+%3Fpage+%3B%0A%09dc%3Aidentifier+%3Fidentifier+%3B%0A%09dcterms%3Adescription+%3Fdescription+%3B%0A%09wp%3ApathwayOntology+%3Fontology+%3B%0A%09dcterms%3AhasPart+%3Fpw_uri+%3B%0A%09void%3AinDataset+%3Chttp%3A%2F%2Fwww.wikipathways.org%3E+.%0A%3Forganism_uri+rdfs%3Alabel+%3Forganism+.+%0A%3Fpw_uri+a+wp%3APublicationReference+.+%0A++}++WHERE+{+GRAPH+%3Chttp%3A%2F%2Fwww.wikipathways.org%3E+{%0A%09%3Fitem+a+wp%3APathway+%3B%0A%09%09dc%3Atitle+%3Ftitle+%3B+%0A%09%09wp%3Aorganism+%3Forganism_uri+%3B+%0A%09%09foaf%3Apage+%3Fpage+%3B%0A%09%09dc%3Aidentifier+%3Fidentifier+.+%0A++++++++%3Fpw_uri+dcterms%3AisPartOf+%3Fitem+%3B%0A++++++++++++++++a+wp%3APublicationReference+.%0A%09%3Forganism_uri+rdfs%3Alabel+%3Forganism+.+%0A%09OPTIONAL+{+%3Fitem+dcterms%3Adescription+%3Fdescription+}%0A%09OPTIONAL+{+%3Fitem+wp%3ApathwayOntology+%3Fontology+}%0A}+}&parameter=%3Fpw_uri&lensUri=Default&inputURI=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F11967526

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><URL><expandedQuery>CONSTRUCT {
 ?item&lt;http://purl.org/dc/elements/1.1/title&gt;  ?title . 
 ?item&lt;http://vocabularies.wikipathways.org/wp#organism&gt;  ?organism_uri . 
 ?item&lt;http://xmlns.com/foaf/0.1/page&gt;  ?page . 
 ?item&lt;http://purl.org/dc/elements/1.1/identifier&gt;  ?identifier . 
 ?item&lt;http://purl.org/dc/terms/description&gt;  ?description . 
 ?item&lt;http://vocabularies.wikipathways.org/wp#pathwayOntology&gt;  ?ontology . 
 ?item&lt;http://purl.org/dc/terms/hasPart&gt;  ?pw_uri . 
 ?item&lt;http://rdfs.org/ns/void#inDataset&gt; &lt;http://www.wikipathways.org&gt;  . 
 ?organism_uri&lt;http://www.w3.org/2000/01/rdf-schema#label&gt;  ?organism . 
 ?pw_uri&lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://vocabularies.wikipathways.org/wp#PublicationReference&gt;  . } 

WHERE {
 GRAPH &lt;http://www.wikipathways.org&gt;  {
 ?item &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://vocabularies.wikipathways.org/wp#Pathway&gt; . 
 ?item &lt;http://purl.org/dc/elements/1.1/title&gt;  ?title . 
 ?item &lt;http://vocabularies.wikipathways.org/wp#organism&gt;  ?organism_uri . 
 ?item &lt;http://xmlns.com/foaf/0.1/page&gt;  ?page . 
 ?item &lt;http://purl.org/dc/elements/1.1/identifier&gt;  ?identifier . 
 ?pw_uri &lt;http://purl.org/dc/terms/isPartOf&gt;  ?item . 
 ?pw_uri &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#type&gt; &lt;http://vocabularies.wikipathways.org/wp#PublicationReference&gt; . 
 ?organism_uri &lt;http://www.w3.org/2000/01/rdf-schema#label&gt;  ?organism . 
OPTIONAL { 
 ?item &lt;http://purl.org/dc/terms/description&gt;  ?description . 
 } 

OPTIONAL { 
 ?item &lt;http://vocabularies.wikipathways.org/wp#pathwayOntology&gt;  ?ontology . 
 } 

FILTeR (?pw_uri = &lt;http://identifiers.org/pubmed/11967526&gt;)
 } 

 } 
</expandedQuery><orginalQuery>PREFIX dc: &lt;http://purl.org/dc/elements/1.1/&gt;
PREFIX wp: &lt;http://vocabularies.wikipathways.org/wp#&gt;
PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt;
PREFIX dcterms: &lt;http://purl.org/dc/terms/&gt;
PREFIX void: &lt;http://rdfs.org/ns/void#&gt;
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;
CONSTRUCT { ?item dc:title ?title ; 
    wp:organism ?organism_uri ; 
    foaf:page ?page ;
    dc:identifier ?identifier ;
    dcterms:description ?description ;
    wp:pathwayOntology ?ontology ;
    dcterms:hasPart ?pw_uri ;
    void:inDataset &lt;http://www.wikipathways.org&gt; .
?organism_uri rdfs:label ?organism . 
?pw_uri a wp:PublicationReference . 
  }  WHERE { GRAPH &lt;http://www.wikipathways.org&gt; {
    ?item a wp:Pathway ;
        dc:title ?title ; 
        wp:organism ?organism_uri ; 
        foaf:page ?page ;
        dc:identifier ?identifier . 
        ?pw_uri dcterms:isPartOf ?item ;
                a wp:PublicationReference .
    ?organism_uri rdfs:label ?organism . 
    OPTIONAL { ?item dcterms:description ?description }
    OPTIONAL { ?item wp:pathwayOntology ?ontology }
} }</orginalQuery></URL>

stain commented 9 years ago

SELECT DISTINCT ?item  WHERE {GRAPH <http://www.wikipathways.org> { { ?item<http://vocabularies.wikipathways.org/wp#organism> "Homo Sapiens". *#* 

...

root cause

org.openrdf.query.MalformedQueryException: Encountered " "*" "* "" at line 6, column 143.
Was expecting one of:

stain commented 9 years ago

@Christian-B any idea what that "*" thing is meant to be..? Sorry for being clueless!

edit: Never mind - *#* is meant to be replaced with } by expandQueryThroughExpander() on the PHP side.. Comments in code as to why would be nice. :)

stain commented 9 years ago

Fixed. The bug was in ops_ims.class.php: the *#* placeholder was not replaced before the query was passed to the IMS.
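
A minimal sketch of the kind of substitution involved, written in Python rather than PHP: the `*#*` marker is replaced with `}` before the query goes to the IMS. The function name is hypothetical and this is not the actual code from ops_ims.class.php.

```python
PLACEHOLDER = "*#*"

def finalize_query(query):
    """Replace the *#* marker with a closing brace before sending the query on."""
    return query.replace(PLACEHOLDER, "}")

raw = '{ ?item <http://vocabularies.wikipathways.org/wp#organism> "Homo Sapiens". *#*'
ready = finalize_query(raw)

# A cheap sanity check before calling the upstream service
assert PLACEHOLDER not in ready
```

Checking that no marker survives is a cheap way to catch this class of bug before the query reaches the SPARQL parser.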

stain commented 9 years ago

Deployed on ops2 / devel. Could you check, @danidi ? Please also paste the URIs that work/don't work.

Example: http://ops2.few.vu.nl/pathways/byReference?pathway_organism=Homo%20sapiens&uri=http%3A%2F%2Fidentifiers.org%2Fpubmed%2F9789062

The organism names that are supported are:

<http://purl.obolibrary.org/obo/NCBITaxon_7165> api:name "Anopheles gambiae" .
<http://purl.obolibrary.org/obo/NCBITaxon_3702> api:name "Arabidopsis thaliana" .
<http://purl.obolibrary.org/obo/NCBITaxon_1423> api:name "Bacillus subtilis" .
<http://purl.obolibrary.org/obo/NCBITaxon_9913> api:name "Bos taurus" .
<http://purl.obolibrary.org/obo/NCBITaxon_6239> api:name "Caenorhabditis elegans" .
<http://purl.obolibrary.org/obo/NCBITaxon_9615> api:name "Canis familiaris" .
<http://purl.obolibrary.org/obo/NCBITaxon_7955> api:name "Danio rerio" .
<http://purl.obolibrary.org/obo/NCBITaxon_7227> api:name "Drosophila melanogaster" .
<http://purl.obolibrary.org/obo/NCBITaxon_9796> api:name "Equus caballus" .
<http://purl.obolibrary.org/obo/NCBITaxon_9031> api:name "Gallus gallus" .
<http://purl.obolibrary.org/obo/NCBITaxon_5518> api:name "Gibberella zeae" .
<http://purl.obolibrary.org/obo/NCBITaxon_9606> api:name "Homo sapiens" .
<http://purl.obolibrary.org/obo/NCBITaxon_10090> api:name "Mus musculus" .
<http://purl.obolibrary.org/obo/NCBITaxon_1773> api:name "Mycobacterium tuberculosis" .
<http://purl.obolibrary.org/obo/NCBITaxon_4530> api:name "Oryza sativa" .
<http://purl.obolibrary.org/obo/NCBITaxon_9598> api:name "Pan troglodytes" .
<http://purl.obolibrary.org/obo/NCBITaxon_10116> api:name "Rattus norvegicus" .
<http://purl.obolibrary.org/obo/NCBITaxon_4932> api:name "Saccharomyces cerevisiae" .
<http://purl.obolibrary.org/obo/NCBITaxon_4577> api:name "Zea mays" .
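
A client could check an organism name against this list before calling the API. A minimal Python sketch with three of the entries above: the `SUPPORTED` dictionary and the `organism_uri` helper are hypothetical, and exact, case-sensitive matching is an assumption.

```python
# Three entries copied from the list of supported organism names
SUPPORTED = {
    "Bos taurus": "http://purl.obolibrary.org/obo/NCBITaxon_9913",
    "Homo sapiens": "http://purl.obolibrary.org/obo/NCBITaxon_9606",
    "Mus musculus": "http://purl.obolibrary.org/obo/NCBITaxon_10090",
}

def organism_uri(name):
    """Return the NCBITaxon URI for a supported name, or raise ValueError."""
    try:
        return SUPPORTED[name]
    except KeyError:
        raise ValueError(f"unsupported pathway_organism: {name!r}") from None

human = organism_uri("Homo sapiens")
```

Under the case-sensitivity assumption, "Homo Sapiens" with a capital S would be rejected.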

danidi commented 9 years ago

Thank you! I tested:

https://ops2.few.vu.nl/pathways/byCompound?uri=http%3A%2F%2Fwww.conceptwiki.org%2Fconcept%2F83931753-9e3f-4e90-b104-e3bcd0b4d833&app_id=XXX&app_key=YYY&pathway_organism=Homo+sapiens -> works fine.

https://ops2.few.vu.nl/pathways/byCompound?uri=http%3A%2F%2Fwww.conceptwiki.org%2Fconcept%2F83931753-9e3f-4e90-b104-e3bcd0b4d833&app_id=XXX&app_key=YYY&pathway_organism=Bos+taurus -> works fine.

https://ops2.few.vu.nl/pathways/byCompound?uri=http%3A%2F%2Fwww.conceptwiki.org%2Fconcept%2F83931753-9e3f-4e90-b104-e3bcd0b4d833&app_id=XXX&app_key=YYY&pathway_organism=Homo+sapiens%7CBos+taurus -> gives a 404.

Count calls show the same behaviour: single organisms work fine, multiple organisms separated by | return 0 results (although there should be more). Should I raise this as a different issue? The main issue seems to be solved.
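
For illustration, a minimal Python sketch of how the combined value is encoded, and of splitting it into one parameter per organism, since single organisms work. `per_organism_params` is a hypothetical client-side helper, not something the API provides.

```python
from urllib.parse import quote_plus

def per_organism_params(value):
    """Split 'A|B' into one encoded pathway_organism parameter per organism."""
    names = [name.strip() for name in value.split("|") if name.strip()]
    return [f"pathway_organism={quote_plus(name)}" for name in names]

# The combined form, as used in the request that gives a 404
combined = quote_plus("Homo sapiens|Bos taurus")

# One parameter per organism, for separate requests
single = per_organism_params("Homo sapiens|Bos taurus")
```

`combined` is `Homo+sapiens%7CBos+taurus`, matching the failing request; each entry in `single` corresponds to a request form that works.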

stain commented 9 years ago

I didn't know multiple organisms were meant to be supported - that sounds like a new feature request. Could you raise that separately, @danidi ?

danidi commented 9 years ago

Done. See https://github.com/openphacts/GLOBAL/issues/223