Closed: alphaville closed this issue 13 years ago
Pantelis, this timeout is thrown by your service. You should check your web server's error and access logs.
I think this is of some interest: http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= (see especially the last paragraph about RDF vs ARFF; it explains the timeout). Could you provide ARFF along with RDF for datasets?
The timeout in this task (http://toxcreate2.in-silico.ch/task/4332) happens during a simple GET request to your service:
rest_params:
:headers:
:accept: application/rdf+xml
:subjectid: AQIC5wM2LY4SfcyXpalQEtoyjxZzHhZIMARV18Unjdb27k8=@AAJTSQACMDE=#
:payload:
:rest_uri: http://opentox.ntua.gr:8080/model/9b84be8c-87d6-4405-aad8-bd7cfc81251e
So this should have nothing to do with RDF parsing.
Not sure if we will provide ARFF. It should not be too much effort, but I think Christoph and Nina planned to replace RDF/OWL-DL with a fixed data model, represented for example in JSON.
Did you read the blog? RDF parsing consumes 2.79 GB compared to 1 MB for ARFF (see http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= ). It consumes all the RAM of the server, starts using swap, and responds too slowly.
Sorry Pantelis, I did not read the blog completely.
But the timeout occurs during a simple GET request to your model. Do you parse a big RDF file when a GET request to an existing model is performed?
One more thing, Pantelis: I do think that the RDF scalability issue is a severe problem, we have to solve it, and it is good that you are doing these investigations. But IMHO this should still never cause timeouts during the model building process. This is what tasks are for: first the model building service returns the task to the client; then it starts processing the RDF data.
Yes, that's right. First a task is created with status QUEUED. Up to that point nothing happens. After that, and provided there are no more than 2 other tasks running on the system, the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;) Any other running tasks hang too! Even the Apache server is dead at that point. If you stand in front of the screen of this computer you're hardly able to move the mouse pointer. The reason is that the whole RAM is occupied, and in some cases even half of the swap space!!! The same holds for GET on /task/id. Therefore, it is a matter of RDF scalability.
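The queue-then-execute behaviour described here could be sketched roughly as follows. This is an illustrative sketch only: TaskService, TaskStatus, and the URI layout are made-up names, not the actual OpenTox implementation; the only assumption taken from the thread is a fixed pool of 2 worker threads.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only: TaskService and TaskStatus are hypothetical names.
public class TaskService {
    public enum TaskStatus { QUEUED, RUNNING, COMPLETED }

    // At most 2 tasks run concurrently; further submissions wait in the pool's queue.
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Map<String, TaskStatus> statuses = new ConcurrentHashMap<>();

    /** Creates a task, submits it for execution, and returns its URI immediately. */
    public String submit(Runnable job) {
        String taskId = UUID.randomUUID().toString();
        statuses.put(taskId, TaskStatus.QUEUED);     // nothing has happened yet
        pool.submit(() -> {
            statuses.put(taskId, TaskStatus.RUNNING);
            job.run();                               // downloading/parsing happens here
            statuses.put(taskId, TaskStatus.COMPLETED);
        });
        return "/task/" + taskId;                    // the client gets this right away
    }

    public TaskStatus getStatus(String taskId) {
        return statuses.get(taskId);
    }
}
```

Note that even with such a pool, the two running tasks still share the machine's RAM, so if each one loads a few hundred MB of triples, a GET on /task/id can itself become slow, exactly as described above.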
;-))) very nice description. I see. We should enforce the scalability issue on the mailing list...
> the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;)
It seems like the downloading and parsing are done in one go. If so, it will be less blocking if the task is accepted and the task URI is returned immediately. Then the download starts and writes the data into a file, and only upon completion is the file parsed into RDF.
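A minimal sketch of that two-phase idea (the class name is a placeholder, and the phase-2 comment assumes Jena as in the benchmark later in this thread): phase one streams the raw bytes to a temporary file without touching any RDF machinery, and only after the download completes would the file be handed to the parser.

```java
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative sketch of "download first, parse later".
public class TwoPhaseDownload {

    /** Phase 1: stream the response body to disk; no RDF parsing, so memory use stays flat. */
    public static Path download(String datasetUri) throws Exception {
        Path tmp = Files.createTempFile("dataset-", ".rdf");
        try (InputStream in = new URL(datasetUri).openStream()) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
        // Phase 2 would then parse the completed file, e.g. with Jena:
        //   Model m = ModelFactory.createDefaultModel();
        //   m.read(new FileInputStream(tmp.toFile()), null);
    }
}
```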
The task is returned immediately to the client. This is the first action: before downloading or parsing anything, a task is created which (if the server is not running lots of other jobs) is submitted for execution. The HTTP connection is closed immediately and the client does not need to wait for anything. No timeouts are expected, except if... the machine can't take it because some task running in the background consumes all resources.
Did some tests with this dataset:
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public void readRDF() {
    // Build an in-memory OWL-DL ontology model (the expensive variant)
    Model jenaModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
    // Heap usage before reading the dataset
    long mem0 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    System.out.println("Memory used: " + mem0 / 1024 + " K bytes");
    long now = System.currentTimeMillis();
    // Download and parse the remote RDF dataset into the model
    jenaModel.read("http://apps.ideaconsult.net:8080/ambit2/dataset/585036", null);
    long mem1 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    System.out.println("Memory used for Jena object " + (mem1 - mem0) / 1024 + " K bytes");
    System.out.println("Dataset read in " + (System.currentTimeMillis() - now) + " ms");
}
Printout from the code above, when using OWL model
Model jenaModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
Memory used: 3622 K bytes
Memory used for Jena object 245429 K bytes
Dataset read in 144273 ms
Printout from the code above, when using the non-OWL (default) model
Model jenaModel = ModelFactory.createDefaultModel();
Memory used: 1358 K bytes
Memory used for Jena object 243377 K bytes
Dataset read in 108253 ms
In the worst case it is 245 MB in memory, not anywhere close to 2.5 GB.
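One caveat about the numbers above: totalMemory() - freeMemory() also counts garbage that has not been collected yet, so the 245 MB figure is an upper bound on the live heap. A hedged sketch of a slightly more stable probe follows; System.gc() is only a hint to the JVM, so the result is still approximate.

```java
public class MemProbe {
    /** Approximate live heap usage, after asking the JVM to collect garbage first. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc();                 // only a hint; the JVM may ignore it
        try {
            Thread.sleep(100);       // give the collector a moment to run
        } catch (InterruptedException ignored) {
            Thread.currentThread().interrupt();
        }
        return rt.totalMemory() - rt.freeMemory();
    }
}
```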
Occasionally, a Request Timeout exception is thrown (see for example http://toxcreate2.in-silico.ch/task/4332 ). I have the feeling that this happens under heavy load on the server. I've checked the response times at opentox.ntua.gr:8080 and they remain low (see http://ambit.uni-plovdiv.bg/cgi-bin/smokeping.cgi?target=NTUA and http://opentox.ntua.gr:8080/monitoring).