alphaville / ToxOtis

An HTTP client for consuming OpenTox predictive toxicology web services.
http://opentox.ntua.gr/wiki

Large parsing time when bad rdf is provided to Jena #38

Closed by alphaville 13 years ago

alphaville commented 13 years ago

The following test:

    public void testDownload_badRDF() throws ToxOtisException, ServiceInvocationException {
        try {
            new Algorithm(Services.ntua().augment("algorithm", "mlr").
                    addUrlParameter("media", "text/html")).loadFromRemote();
        } catch (ServiceInvocationException ex) {
            return;
        }
        fail("Should have failed");
    }

passes, but it takes a very long time to complete (153 seconds). Some debugging revealed that the problem is related to Jena. The standard error output is:

    ------------- Standard Error -----------------
    org.opentox.toxotis.exceptions.impl.RemoteServiceException: Remote service at 'http://opentox.ntua.gr:4000/algorithm/mlr?media=text%2Fhtml' did not provide a valid RDF representation. The returned representation cannot be parsed
        at org.opentox.toxotis.client.http.AbstractHttpClient.getResponseOntModel(AbstractHttpClient.java:350)
        at org.opentox.toxotis.client.http.AbstractHttpClient.getResponseOntModel(AbstractHttpClient.java:316)
        at org.opentox.toxotis.util.spiders.AlgorithmSpider.<init>(AlgorithmSpider.java:168)
        at org.opentox.toxotis.core.component.Algorithm.loadFromRemote(Algorithm.java:244)
        at org.opentox.toxotis.core.component.Algorithm.loadFromRemote(Algorithm.java:79)
        at org.opentox.toxotis.core.OTOnlineResource.loadFromRemote(OTOnlineResource.java:127)
        at org.opentox.toxotis.core.component.AlgorithmTest.testDownload_badRDF(AlgorithmTest.java:110)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.junit.internal.runners.TestMethodRunner.executeMethodBody(TestMethodRunner.java:99)
        at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:81)
        at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
        at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
        at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
        at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
        at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
        at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
        at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
        at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
        at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:32)
        at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:515)
        at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1031)
        at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:888)
    Caused by: java.lang.NullPointerException
        at com.hp.hpl.jena.rdf.arp.impl.XMLHandler.endElement(XMLHandler.java:149)
        at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
        at org.apache.xerces.impl.XMLNamespaceBinder.handleEndElement(Unknown Source)
        at org.apache.xerces.impl.XMLNamespaceBinder.endElement(Unknown Source)
        at org.apache.xerces.impl.dtd.XMLDTDValidator.endNamespaceScope(Unknown Source)
        at org.apache.xerces.impl.dtd.XMLDTDValidator.handleEndElement(Unknown Source)
        at org.apache.xerces.impl.dtd.XMLDTDValidator.endElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
        at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at com.hp.hpl.jena.rdf.arp.impl.RDFXMLParser.parse(RDFXMLParser.java:142)
        at com.hp.hpl.jena.rdf.arp.JenaReader.read(JenaReader.java:158)
        at com.hp.hpl.jena.rdf.arp.JenaReader.read(JenaReader.java:145)
        at com.hp.hpl.jena.rdf.arp.JenaReader.read(JenaReader.java:215)
        at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:226)
        at com.hp.hpl.jena.ontology.impl.OntModelImpl.read(OntModelImpl.java:2148)
        at org.opentox.toxotis.client.http.AbstractHttpClient.getResponseOntModel(AbstractHttpClient.java:342)
        ... 24 more

alphaville commented 13 years ago

Most probably it's Jena-related. It can be handled partially by refusing to parse documents from URIs that don't comply with the template.
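
A rough sketch of what such a pre-parse check might look like. Everything here is an assumption for illustration: `UriTemplateGuard` and `isParsable` are hypothetical names that do not exist in ToxOtis, the template is guessed to be `/algorithm/{id}`, and the only `media` value treated as safe is `application/rdf+xml`. The idea is simply to reject a URI before its response is ever handed to Jena.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Pattern;

/** Hypothetical guard, not part of ToxOtis: decides whether a URI may be parsed as RDF. */
final class UriTemplateGuard {

    // Assumed template for algorithm resources: /algorithm/{id}
    private static final Pattern ALGORITHM_PATH = Pattern.compile("^/algorithm/[^/]+/?$");

    // Assumed: the only explicitly requested media type we are willing to parse.
    private static final String RDF_XML = "application/rdf+xml";

    private UriTemplateGuard() {
    }

    /** Returns true only if the URI matches the assumed template and does not request a non-RDF media type. */
    static boolean isParsable(String uri) {
        final URI parsed;
        try {
            parsed = new URI(uri);
        } catch (URISyntaxException ex) {
            return false; // malformed URI: never hand it to the RDF parser
        }
        String path = parsed.getPath();
        if (path == null || !ALGORITHM_PATH.matcher(path).matches()) {
            return false;
        }
        String query = parsed.getQuery(); // percent-decoded, e.g. "media=text/html"
        if (query == null) {
            return true;
        }
        for (String pair : query.split("&")) {
            if (pair.startsWith("media=") && !pair.equals("media=" + RDF_XML)) {
                return false; // caller explicitly asked for something other than RDF/XML
            }
        }
        return true;
    }
}
```

With a check like this in front of `getResponseOntModel`, the URI from the failing test (`...?media=text%2Fhtml`) would be rejected immediately instead of after a long parse. It only covers the cases the client can predict from the URI; a server that returns bad RDF for a well-formed URI would still reach Jena.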

alphaville commented 13 years ago

Related to Jena