ftomassetti / semreview

Text classification taking advantage of the Semantic Web
GNU General Public License v3.0

javax.xml-parser doesn't parse the responseBuffer #13

Open YJ14 opened 7 years ago

YJ14 commented 7 years ago

Hello, I get document = "null" from the method getOpenLinkResponse(). It seems that javax.xml.parsers is not parsing the response correctly. When I set factory.setValidating(true) before creating the DocumentBuilder, I get this error:

Exception in thread "main" it.polito.semreview.dbpedia.InvalidResponseException: Invalid response obtained: Response is not a valid XML file
    at it.polito.semreview.dbpedia.DbPediaURIRetriever.getOpenLinkResponse(DbPediaURIRetriever.java:83)
    at it.polito.semreview.dbpedia.Main.main(Main.java:27)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 68; Document root element "fct:facets", must match DOCTYPE root "null".
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:134)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:396)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:284)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.rootElementSpecified(XMLDTDValidator.java:1599)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.handleStartElement(XMLDTDValidator.java:1877)
    at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.startElement(XMLDTDValidator.java:742)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1359)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(XMLDocumentScannerImpl.java:1289)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3132)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:852)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at it.polito.semreview.dbpedia.DocumentParser.parse(DocumentParser.java:37)
    at it.polito.semreview.dbpedia.DbPediaURIRetriever.getOpenLinkResponse(DbPediaURIRetriever.java:81)
    ... 1 more

Any idea why?

Here is the responseBuffer:

`

select ?s1 as ?c1, (bif:search_excerpt (bif:vector ('DISEASE', 'COURSE'), ?o1)) as ?c2, ?sc, ?rank, ?g where {{{ select ?s1, (?sc * 3e-1) as ?sc, ?o1, (sql:rnk_scale (<LONG::IRI_RANK> (?s1))) as ?rank, ?g where { quad map virtrdf:DefaultQuadMap { graph ?g { ?s1 ?s1textp ?o1 . ?o1 bif:contains '(DISEASE AND COURSE)' option (score ?sc) . } } } order by desc (?sc * 3e-1 + sql:rnk_scale (<LONG::IRI_RANK> (?s1))) limit 20 offset 0 }}} 22 yes 8000 1.617K rnd 105 seq 1.184K same seg 41 same pg 7 same par 0 disk 0 spec disk 0B / 0 messages 60 fork `
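For anyone trying to reproduce this, here is a minimal sketch of the two symptoms in isolation. It does not use the semreview classes: `FacetParseSketch`, `SAMPLE`, and the namespace URI `http://example.org/fct` are made up for illustration, and the sample document is only a stand-in for the real response. It rests on two points that may or may not be the whole story here: `setValidating(true)` requests DTD validation, so a document without a DOCTYPE is reported as invalid; and in the JDK's bundled parser a `Document` prints as `[#document: null]` because a document node's value is `null` by definition, which can look like a failed parse in a debugger or log.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class FacetParseSketch {

    // Hypothetical stand-in for the response: namespaced root element, no DOCTYPE.
    static final String SAMPLE =
        "<fct:facets xmlns:fct=\"http://example.org/fct\">"
        + "<fct:result><fct:row>DISEASE COURSE</fct:row></fct:result>"
        + "</fct:facets>";

    // Parse with or without DTD validation; validation errors, if any, go to the handler.
    static Document parse(String xml, boolean validating, ErrorHandler handler) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);   // needed to resolve the fct: prefix properly
        factory.setValidating(validating); // true means "validate against a DTD"
        DocumentBuilder builder = factory.newDocumentBuilder();
        if (handler != null) {
            builder.setErrorHandler(handler);
        }
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Collect the messages a validating parse reports for the given document.
    static List<String> validationErrors(String xml) throws Exception {
        final List<String> messages = new ArrayList<>();
        ErrorHandler collector = new ErrorHandler() {
            @Override public void warning(SAXParseException e) { messages.add(e.getMessage()); }
            @Override public void error(SAXParseException e) { messages.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        };
        parse(xml, true, collector);
        return messages;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE, false, null);
        // The document node itself has no value; inspect the root element instead.
        System.out.println("node value: " + doc.getNodeValue());
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
        for (String message : validationErrors(SAMPLE)) {
            System.out.println("validation: " + message);
        }
    }
}
```

If the root element comes back as expected in the non-validating case, the parser is probably fine, and it may be worth checking `getDocumentElement()` rather than the printed `Document` before concluding the parse failed.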