buddycloud / channel-directory

The buddycloud search engine. Search and recommendations based on Solr and Mahout.
https://buddycloud.org/wiki/Channel_Directory_Project
Apache License 2.0

SAX parser error #37

Open abmargb opened 9 years ago

abmargb commented 9 years ago
org.dom4j.DocumentException: Error on line 1 of document  : The reference to entity "B" must end with the ';' delimiter. Nested exception: The reference to entity "B" must end with the ';' delimiter.
    at org.dom4j.io.SAXReader.read(SAXReader.java:482)
    at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
    at com.buddycloud.channeldirectory.crawler.node.CrawlerHelper.getAtomEntry(CrawlerHelper.java:215)
    at com.buddycloud.channeldirectory.crawler.node.PostCrawler.crawl(PostCrawler.java:100)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.crawl(PubSubServerCrawler.java:195)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.crawl(PubSubServerCrawler.java:185)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.fetchAndCrawl(PubSubServerCrawler.java:224)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.crawlChannelServer(PubSubServerCrawler.java:152)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.fetch(PubSubServerCrawler.java:119)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.start(PubSubServerCrawler.java:93)
    at com.buddycloud.channeldirectory.crawler.Main.main(Main.java:63)
Nested exception: 
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 291; The reference to entity "B" must end with the ';' delimiter.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:198)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:441)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:368)
    at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1436)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(XMLDocumentFragmentScannerImpl.java:1850)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3067)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:648)
    at org.dom4j.io.SAXReader.read(SAXReader.java:465)
    at org.dom4j.DocumentHelper.parseText(DocumentHelper.java:278)
    at com.buddycloud.channeldirectory.crawler.node.CrawlerHelper.getAtomEntry(CrawlerHelper.java:215)
    at com.buddycloud.channeldirectory.crawler.node.PostCrawler.crawl(PostCrawler.java:100)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.crawl(PubSubServerCrawler.java:195)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.crawl(PubSubServerCrawler.java:185)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.fetchAndCrawl(PubSubServerCrawler.java:224)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.crawlChannelServer(PubSubServerCrawler.java:152)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.fetch(PubSubServerCrawler.java:119)
    at com.buddycloud.channeldirectory.crawler.PubSubServerCrawler.start(PubSubServerCrawler.java:93)
    at com.buddycloud.channeldirectory.crawler.Main.main(Main.java:63)
abmargb commented 9 years ago

Posts like this are simply ignored. I'm wondering whether I can escape the special characters (or wrap the content in a CDATA block) so that we don't ignore the post.

On 26/10/2014 09:06, "Simon Tennant" notifications@github.com wrote:

Is this fixed or the nodes removed?

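As a rough illustration of the escaping idea mentioned above, the sketch below rewrites any bare `&` (one that does not start a predefined XML entity or a numeric character reference) to `&amp;` before parsing. It is not the project's code: the class name `AmpersandEscaper` and both method names are hypothetical, and it uses the JDK's built-in DOM parser rather than dom4j so that it runs without extra dependencies. The same pre-processing step could presumably be applied to the string before it reaches `DocumentHelper.parseText`.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of channel-directory.
final class AmpersandEscaper {

    // Matches '&' unless it starts one of the five predefined XML entities
    // or a decimal/hexadecimal character reference.
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
            "&(?!(?:amp|lt|gt|quot|apos);|#[0-9]+;|#x[0-9a-fA-F]+;)");

    // Replaces every bare '&' with "&amp;" and leaves valid references alone.
    static String escapeBareAmpersands(String xml) {
        return BARE_AMPERSAND.matcher(xml).replaceAll("&amp;");
    }

    // Parses the given XML string with the JDK's DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // "A&B" alone would trigger the "must end with the ';' delimiter" error.
        String raw = "<entry><content>A&B and C &amp; D</content></entry>";
        Document doc = parse(escapeBareAmpersands(raw));
        System.out.println(doc.getDocumentElement().getTextContent());
        // prints "A&B and C & D"
    }
}
```

One limitation of this approach: entities that a payload declares in its own DTD (for example `&nbsp;`) would also be escaped and end up as literal text, so it only suits payloads that rely on the predefined entities.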