Closed migalkin closed 7 years ago
This seems to be a problem in the parser. The parser somehow sees this as one OR function with 10 parameters, whereas there are actually 9 OR functions with 2 parameters each.
Fixed by https://github.com/RubenVerborgh/SPARQL.js/commit/2789d3f27eae364ce73ec03b7a7659dd30c2f126, which will be installed by 84027fcd21fd8ae90fb7c0aef9b21f8c21b0fd43.
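To illustrate the difference between one OR with 10 parameters and 9 nested ORs with 2 parameters each, here is a minimal sketch. It is not the code from the commit above. The helper name toBinaryOperation is hypothetical, and the node shape ({ type: 'operation', operator, args }) is assumed to match the expression objects the parser emits.

```javascript
// Hypothetical helper, not part of SPARQL.js or Client.js.
// Rewrites an n-ary '||' or '&&' node into nested binary nodes.
function toBinaryOperation(expr) {
  // Variables, IRIs and literals (plain strings in this sketch) are left untouched.
  if (typeof expr !== 'object' || expr === null || expr.type !== 'operation')
    return expr;
  // Normalize the arguments first, so nested expressions are handled too.
  const args = expr.args.map(toBinaryOperation);
  const isLogical = expr.operator === '||' || expr.operator === '&&';
  if (!isLogical || args.length <= 2)
    return Object.assign({}, expr, { args });
  // Fold left: a || b || c becomes ((a || b) || c).
  return args.slice(1).reduce(
    (left, right) => ({ type: 'operation', operator: expr.operator, args: [left, right] }),
    args[0]
  );
}
```

Applied to a '||' node with 10 arguments, this yields 9 nested '||' nodes, each with exactly 2 arguments.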
I'm using the npm version of the client, so how can I use the develop branch of the client in my app if it's not yet on npm?
What you can do in package.json is:
"ldf-client": "LinkedDataFragments/Client.js#develop"
to use the develop branch of this repo.
The problem occurred again, on the FedBench CD, LD and LS queries. We encounter two types of errors:
exception on query: SELECT ?film ?genre ?x WHERE {
?x <http://data.linkedmdb.org/resource/movie/genre> ?genre .
?x <http://www.w3.org/2002/07/owl#sameAs> ?film . FILTER ((?film=<http://dbpedia.org/resource/Remember_Me%2C_My_Love>) || (?film=<http://dbpedia.org/resource/But_Forever_in_My_Mind>) || (?film=<http://dbpedia.org/resource/Ecco_fatto>) || (?film=<http://dbpedia.org/resource/L%27ultimo_bacio>) || (?film=<http://dbpedia.org/resource/Seven_Pounds>) || (?film=<http://dbpedia.org/resource/The_Pursuit_of_Happyness>))
} LIMIT 10000 OFFSET 0
Code: ''
The second includes : in URIs:
exception on query: SELECT ?mass ?cas ?keggDrug WHERE {
?keggDrug <http://bio2rdf.org/ns/bio2rdf#xRef> ?cas .
?keggDrug <http://bio2rdf.org/ns/bio2rdf#mass> ?mass
FILTER ((?mass > '5')) . FILTER ((?cas=<http://bio2rdf.org/cas:198153-51-4>) || (?cas=<http://bio2rdf.org/cas:105857-23-6>) || (?cas=<http://bio2rdf.org/cas:74899-72-2>) || (?cas=<http://bio2rdf.org/cas:99210-65-8>) || (?cas=<http://bio2rdf.org/cas:77907-69-8>) || (?cas=<http://bio2rdf.org/cas:145155-23-3>) || (?cas=<http://bio2rdf.org/cas:140608-64-6>) || (?cas=<http://bio2rdf.org/cas:59865-13-3>) || (?cas=<http://bio2rdf.org/cas:214745-43-4>) || (?cas=<http://bio2rdf.org/cas:83150-76-9>))
} LIMIT 10000 OFFSET 0
Code: ''
In both cases the error code is empty, so we can't pinpoint where the error occurs. Our only guess about the second error type is the colons in the URIs, but serdi did not flag them as errors during HDT parsing.
Does the underlying Client.js implementation output anything? The strings "exception on query" and "Code:" are not ones I recognize from the Client.js software, so they must come from your software. Is there any extra output we can rely on? (There should be.)
The only error is produced by the KEGG endpoint:
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] DEBUG TriplePatternIterator 159657 59569 {"?keggDrug":"http://bio2rdf.org/dr:D03842","?id":"http://bio2rdf.org/pubchem:17397928"} dr title ?title. 1
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] DEBUG TriplePatternIterator 159658 59570 {"?keggDrug":"http://bio2rdf.org/dr:D03843"} dr xRef ?id. 5
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] WARNING TriplePatternIterator Unexpected "<http://bio2rdf.org/cas:" on line 38.
events.js:160
throw er; // Unhandled 'error' event
^
Error: Unexpected "<http://bio2rdf.org/cas:" on line 38.
at N3Lexer._syntaxError (/ldf_rest/node_modules/n3/lib/N3Lexer.js:360:12)
at reportSyntaxError (/ldf_rest/node_modules/n3/lib/N3Lexer.js:327:54)
at N3Lexer._tokenizeToEnd (/ldf_rest/node_modules/n3/lib/N3Lexer.js:313:18)
at TrigFragmentIterator._parseData (/ldf_rest/node_modules/n3/lib/N3Lexer.js:395:16)
at TrigFragmentIterator.TurtleFragmentIterator._transform (/ldf_rest/node_modules/ldf-client/lib/triple-pattern-fragments/TurtleFragmentIterator.js:47:8)
at readAndTransform (/ldf_rest/node_modules/asynciterator/asynciterator.js:959:12)
at TrigFragmentIterator.TransformIterator._read (/ldf_rest/node_modules/asynciterator/asynciterator.js:945:3)
at TrigFragmentIterator.BufferedIterator._fillBuffer (/ldf_rest/node_modules/asynciterator/asynciterator.js:768:10)
at Immediate.fillBufferAsyncCallback (/ldf_rest/node_modules/asynciterator/asynciterator.js:800:8)
at runCallback (timers.js:639:20)
It seems that the parser considers http://bio2rdf.org/cas:… an invalid URL. Let me investigate.
Hmmm, I cannot reproduce this in the parser. Would you be able to send me the contents of the fragment page that produces this error? (Or at least the full contents of line 38?)
The error message seems to indicate that a whitespace character appears after http://bio2rdf.org/cas:, which would make the URI invalid.
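As a rough way to check a URI for such characters, here is a minimal sketch based on the Turtle/TriG grammar rule for IRIs in angle brackets, which excludes control characters, space, and a few punctuation characters. The function name findIllegalIriChar is hypothetical, and the second URI in the usage lines is a made-up example, not one taken from the data.

```javascript
// Characters the Turtle/TriG grammar does not allow between < and >.
// A backslash is only legal as the start of a \u escape; this sketch
// does not decode escapes and simply reports any backslash.
const ILLEGAL_IRI_CHAR = /[\u0000-\u0020<>"{}|^`\\]/;

// Hypothetical helper: returns the first offending character and its
// position, or null when the IRI contains none.
function findIllegalIriChar(iri) {
  const match = ILLEGAL_IRI_CHAR.exec(iri);
  return match ? { index: match.index, char: match[0] } : null;
}

// A colon is fine; a space right after "cas:" is not.
console.log(findIllegalIriChar('http://bio2rdf.org/cas:50-56-6'));
console.log(findIllegalIriChar('http://bio2rdf.org/cas: 50-56-6'));
```

If a check like this reports nothing for the URIs in the HDT file, the invalid character is more likely introduced somewhere between the server and the client.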
What do you mean by the fragment page? The client logs tons of subqueries like
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] INFO HttpClient Requesting http://kegg-ldh:3000/kegg?subject=http%3A%2F%2Fbio2rdf.org%2Fcpd%3AC00852&predicate=http%3A%2F%2Fbio2rdf.org%2Fns%2Fbio2rdf%23xRef
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] DEBUG TriplePatternIterator 159654 58803 {"?keggDrug":"http://bio2rdf.org/cpd:C00850","?mass":"\"173.9987\""} ?keggDrug xRef ?cas. 1
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] INFO HttpClient Requesting http://kegg-ldh:3000/kegg?subject=http%3A%2F%2Fbio2rdf.org%2Fdr%3AD03845&predicate=http%3A%2F%2Fbio2rdf.org%2Fns%2Fbio2rdf%23xRef
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] INFO HttpClient Requesting http://kegg-ldh:3000/kegg?subject=http%3A%2F%2Fbio2rdf.org%2Fcpd%3AC00853&predicate=http%3A%2F%2Fbio2rdf.org%2Fns%2Fbio2rdf%23xRef
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] DEBUG TriplePatternIterator 159655 59569 {"?keggDrug":"http://bio2rdf.org/dr:D03842","?id":"http://bio2rdf.org/cas:143224-34-4"} dr title ?title. 1
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] DEBUG TriplePatternIterator 159656 59569 {"?keggDrug":"http://bio2rdf.org/dr:D03842","?id":"http://bio2rdf.org/ligandbox:D03842"} dr title ?title. 1
[Sat Dec 10 2016 18:26:20 GMT+0000 (UTC)] DEBUG TriplePatternIterator 159657 59569 {"?keggDrug":"http://bio2rdf.org/dr:D03842","?id":"http://bio2rdf.org/pubchem:17397928"} dr title ?title. 1
We spotted the problem with KEGG on the FedBench LSD7 query:
SELECT ?drug ?transform ?mass WHERE {
?drug drugbank:affectedOrganism 'Humans and other mammals'.
?drug drugbank:casRegistryNumber ?cas .
?keggDrug bio2rdf:xRef ?cas .
?keggDrug bio2rdf:mass ?mass
FILTER ( ?mass > '5' )
OPTIONAL { ?drug drugbank:biotransformation ?transform . } }
The endpoint can't answer the following decomposed query, nor any of the subsequent ones:
SELECT ?mass ?cas ?keggDrug WHERE {
?keggDrug <http://bio2rdf.org/ns/bio2rdf#xRef> ?cas .
?keggDrug <http://bio2rdf.org/ns/bio2rdf#mass> ?mass
FILTER ((?mass > '5')) . FILTER ((?cas=<http://bio2rdf.org/cas:50-56-6>) || (?cas=<http://bio2rdf.org/cas:50-81-7>) || (?cas=<http://bio2rdf.org/cas:73-22-3>) || (?cas=<http://bio2rdf.org/cas:59-30-3>) || (?cas=<http://bio2rdf.org/cas:59-02-9>) || (?cas=<http://bio2rdf.org/cas:81093-37-0>) || (?cas=<http://bio2rdf.org/cas:54739-18-3>) || (?cas=<http://bio2rdf.org/cas:137862-53-4>) || (?cas=<http://bio2rdf.org/cas:87333-19-5>) || (?cas=<http://bio2rdf.org/cas:300-62-9>))
There are no URIs with a space to my knowledge. Also, no prefixes are used in this case.
What do you mean by the fragment page?
The error you get (Unexpected "<http://bio2rdf.org/cas:" on line 38.) is because, at some point, the client receives an HTTP response from the server that has an invalid URI on line 38. So one of the resources, such as http://kegg-ldh:3000/kegg?subject=http%3A%2F%2Fbio2rdf.org%2Fcpd%3AC00853&predicate=http%3A%2F%2Fbio2rdf.org%2Fns%2Fbio2rdf%23xRef, has an error on line 38.
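One way to look at that line is to request a logged fragment URL directly and print a single line of the response. The following is only a sketch: lineOf and fetchFragmentLine are hypothetical helpers, the Accept header value is an assumption (the stack trace suggests the client parsed a TriG response), and passing a URL plus options to http.get assumes a reasonably recent Node.js version.

```javascript
const http = require('http');

// Returns line `lineNumber` (1-based) of `body`, or undefined if the
// body has fewer lines.
function lineOf(body, lineNumber) {
  return body.split(/\r?\n/)[lineNumber - 1];
}

// Hypothetical helper: downloads one fragment page and hands the
// requested line to the callback as (error, line).
function fetchFragmentLine(url, lineNumber, callback) {
  // Assumption: ask for TriG so the response matches what the client parsed.
  const options = { headers: { accept: 'application/trig' } };
  http.get(url, options, response => {
    let body = '';
    response.setEncoding('utf8');
    response.on('data', chunk => { body += chunk; });
    response.on('end', () => callback(null, lineOf(body, lineNumber)));
  }).on('error', callback);
}
```

Calling fetchFragmentLine with one of the URLs from the log and 38 as the line number would print the suspect line, provided that page is the one that triggered the error.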
But the decomposed query you gave and the kegg.hdt file should trigger the same error on my side; I will check that ASAP and get back to you.
I did the following:
SELECT * WHERE { ?s ?p ?o }
to retrieve all triples. I did not run into the error.
I wonder if you are perhaps a) using a different HDT file or b) using a different parser version. To check the latter, you can do npm ls n3, which shows version 0.8.3 on my machine.
BTW, a note regarding your experiment: the LDF client currently does not optimize FILTER, so the filter will be applied after the whole BGP has been evaluated, which might not be the best choice. If you would like us to implement such an optimization, please get in touch.
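To make the cost concrete, here is a minimal sketch of what filtering after BGP evaluation amounts to: every solution is produced first, and only then tested against the disjunction of equality checks. This is not Client.js code. The function filterBindings is hypothetical and the bindings in the usage lines are illustrative values.

```javascript
// Hypothetical sketch: keeps only the bindings whose value for
// `variable` is one of the allowed IRIs, i.e. the effect of
// FILTER ((?v = <iri1>) || (?v = <iri2>) || ...) applied at the end.
function filterBindings(bindings, variable, allowedIris) {
  const allowed = new Set(allowedIris);
  return bindings.filter(binding => allowed.has(binding[variable]));
}

// All BGP solutions have to exist before any of them is discarded.
const allSolutions = [
  { '?keggDrug': 'http://bio2rdf.org/dr:D03842', '?cas': 'http://bio2rdf.org/cas:143224-34-4' },
  { '?keggDrug': 'http://bio2rdf.org/dr:D00001', '?cas': 'http://bio2rdf.org/cas:50-56-6' },
];
console.log(filterBindings(allSolutions, '?cas', ['http://bio2rdf.org/cas:50-56-6']));
```

With a BGP that matches many triples, most of the work goes into solutions that the filter later drops.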
I used the HDT file from the link. The parser version is:
npm info it worked if it ends with ok
npm info using npm@3.10.9
npm info using node@v7.1.0
| +-- n3@0.8.3
npm info ok
The heap memory issue when evaluating FILTERs against large sets of triples was apparently solved by increasing Node's max_old_space_size parameter.
I cannot reproduce the parsing issue then, I'm afraid. If you could somehow show me that line 38, that would help.
Hello, still trying out some queries against the data in the LDF server using the Client. I'm trying to evaluate the following query:
The Client throws the error:
So we can't submit more than 2 arguments in the FILTER clause?