Closed callahantiff closed 2 years ago
Follow-up
@bill-baumgartner - I successfully brought down the endpoint tonight 😄. It's running again; I restarted the container and it came back. The query I ran is shown below. It went down because I did not include a `LIMIT` clause. I wonder if we should add something to protect against others doing this, or if there is something we can add to help it restart itself in these situations. Just something for us to discuss tomorrow!
```sparql
PREFIX obo: <http://purl.obolibrary.org/obo/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?p ?o
WHERE {
  VALUES ?p {
    obo:RO_0000087
    obo:RO_0002434
    rdfs:subClassOf
  }
  ?s ?p ?o
}
```
Note that when adding `LIMIT n`, the query executes totally fine. This query format is the template RH provided me, so that's why I was testing it out.
Good to know. From grepping our input n-triples file, we would expect the following numbers of responses:

- `subClassOf`: 1,340,072
- `RO_0000087` (has_role): 40,500
- `RO_0002434` (interacts_with): 4,042,408

So, in total, this query would have eclipsed the 5M triple limit we had originally set. It may be the case that for queries with many results, users will need to request results in batches using `ORDER BY` + `LIMIT` + `OFFSET`.
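A minimal sketch of that batching pattern in Python (the total of 5,422,980 is the sum of the three counts above; the page size and sort keys are illustrative choices, not settings we have agreed on):

```python
def paged_queries(base_query, total, page_size):
    """Yield SPARQL queries that fetch base_query's results in batches.

    A deterministic ORDER BY is needed so that successive LIMIT/OFFSET
    windows neither overlap nor skip rows between requests.
    """
    offset = 0
    while offset < total:
        yield (f"{base_query} ORDER BY ?s ?p ?o "
               f"LIMIT {page_size} OFFSET {offset}")
        offset += page_size

base = ("PREFIX obo: <http://purl.obolibrary.org/obo/> "
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
        "SELECT ?s ?p ?o WHERE { VALUES ?p { obo:RO_0000087 "
        "obo:RO_0002434 rdfs:subClassOf } ?s ?p ?o }")

# 1,340,072 + 40,500 + 4,042,408 = 5,422,980 expected rows
queries = list(paged_queries(base, total=5_422_980, page_size=1_000_000))
print(len(queries))  # 6 batches cover all expected rows
```

Each generated query stays under the endpoint's result cap, at the cost of re-sorting on every request.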
I agree. I did some experimenting with the SPARQL Proxy settings (i.e. `ENABLE_QUERY_SPLITTING` and `MAX_CHUNK_LIMIT`) in `docker-compose.yml` and have some interesting insight to share in our meeting this afternoon. In a nutshell, I can get it to return all of the results, but it then generates a different error when trying to return them (with the `ENABLE_QUERY_SPLITTING` setting, results are returned as JSON). See below:
```
buffer.js:799
api_1 | return this.utf8Slice(start, end);
api_1 | ^
api_1 |
api_1 | Error: Cannot create a string longer than 0x1fffffe8 characters
api_1 | at Buffer.toString (buffer.js:799:17)
api_1 | at Request.<anonymous> (/app/node_modules/request/request.js:1128:39)
api_1 | at Request.emit (events.js:315:20)
api_1 | at IncomingMessage.<anonymous> (/app/node_modules/request/request.js:1076:12)
api_1 | at Object.onceWrapper (events.js:421:28)
api_1 | at IncomingMessage.emit (events.js:327:22)
api_1 | at endReadableNT (internal/streams/readable.js:1327:12)
api_1 | at processTicksAndRejections (internal/process/task_queues.js:80:21) {
api_1 | code: 'ERR_STRING_TOO_LONG'
api_1 | }
api_1 | npm ERR! code ELIFECYCLE
api_1 | npm ERR! errno 1
api_1 | npm ERR! sparql-proxy@0.0.0 start: `node --experimental-modules src/server.mjs`
api_1 | npm ERR! Exit status 1
api_1 | npm ERR!
api_1 | npm ERR! Failed at the sparql-proxy@0.0.0 start script.
api_1 | npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
api_1 |
api_1 | npm ERR! A complete log of this run can be found in:
api_1 | npm ERR! /app/.npm/_logs/2021-01-08T18_52_49_474Z-debug.log
```
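For reference, the proxy settings mentioned above live in `docker-compose.yml`; a sketch of the relevant fragment is below (the service name `api` is taken from the log prefix above, and the values are illustrative, not the ones we deployed):

```yaml
services:
  api:
    environment:
      # Split large queries into chunks before forwarding to Blazegraph.
      - ENABLE_QUERY_SPLITTING=true
      # Maximum rows returned per chunk; illustrative value.
      - MAX_CHUNK_LIMIT=100000
```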
@bill-baumgartner so we have a record, here are the two queries we ran against the endpoint via the command line:

Simple:

```shell
wget -qO- "http://35.233.212.30/blazegraph/sparql?query=select * where { ?s ?p ?o } " > filename.xml
```

Relation template:

```shell
wget -qO- "http://35.233.212.30/blazegraph/sparql?query=PREFIX%20obo%3A%20%3Chttp%3A%2F%2Fpurl.obolibrary.org%2Fobo%2F%3E%20PREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%20SELECT%20%3Fs%20%3Fp%20%3Fo%20WHERE%20%7B%20VALUES%20%3Fp%20%7B%20obo%3ARO_0000087%20obo%3ARO_0002434%20rdfs%3AsubClassOf%20%7D%20%3Fs%20%3Fp%20%3Fo%20%7D" > filename.xml
```
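That percent-encoded query string can be reproduced with Python's standard library, which may be handier than encoding by hand (a sketch showing only the encoding, not the HTTP request itself):

```python
from urllib.parse import quote

query = (
    "PREFIX obo: <http://purl.obolibrary.org/obo/> "
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
    "SELECT ?s ?p ?o WHERE { VALUES ?p { obo:RO_0000087 "
    "obo:RO_0002434 rdfs:subClassOf } ?s ?p ?o }"
)

# safe="" percent-encodes reserved characters such as '/', ':' and '?',
# matching the encoding used in the wget command above.
url = "http://35.233.212.30/blazegraph/sparql?query=" + quote(query, safe="")
print(url)
```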
Need to do the following things to fully address this issue:

When it goes down, run the following from the location shown below within the GCP instance:

```shell
~/PheKnowLator/builds/deploy/triple-store$ docker-compose up -d
```
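On the "help it restart itself" question raised earlier in this thread, one option to discuss (a sketch, not something we have applied) is Docker's built-in restart policy in the same `docker-compose.yml`; the service name `api` is assumed from the log prefix:

```yaml
services:
  api:
    # Restart the container automatically if the process crashes,
    # but not after an explicit `docker-compose stop`.
    restart: unless-stopped
```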
@bill-baumgartner - I am going to close this for now. I think the 99% automated approach we are using now is totally fine, as the endpoint is not something we plan to keep forever. Let me know if you disagree.
TASK
Task Type: PKT DATA DELIVERY
Select and set up a SPARQL endpoint for exploring KG build data