Closed sylvainroussy closed 8 years ago
This exception can sometimes be caused by too much recursion. It is likely related to your specific document and to what exactly your DOM selector matches. Can you attach a copy of the file that is causing this issue? There may be a way to change your selector to avoid this (or we can otherwise provide a fix).
Hi, OK. My source page has changed since my last message; it was only a test of this component. I am closing this ticket and have taken note of your explanation. Thanks.
with the following configuration (crawling depth 0):
I get a StackOverflowError:
_java.lang.StackOverflowError
at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
at java.io.File.exists(File.java:819)
at sun.misc.FileURLMapper.exists(FileURLMapper.java:78)
at sun.misc.URLClassPath$JarLoader.getJarFile(URLClassPath.java:890)
at sun.misc.URLClassPath$JarLoader.access$700(URLClassPath.java:756)
at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:838)
at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:831)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.URLClassPath$JarLoader.ensureOpen(URLClassPath.java:830)
at sun.misc.URLClassPath$JarLoader.<init>(URLClassPath.java:803)
at sun.misc.URLClassPath$JarLoader$3.run(URLClassPath.java:1057)
at sun.misc.URLClassPath$JarLoader$3.run(URLClassPath.java:1054)
at java.security.AccessController.doPrivileged(Native Method)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1053)
at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1013)
at sun.misc.URLClassPath$JarLoader.findResource(URLClassPath.java:983)
at sun.misc.URLClassPath$1.next(URLClassPath.java:240)
at sun.misc.URLClassPath$1.hasMoreElements(URLClassPath.java:250)
at java.net.URLClassLoader$3$1.run(URLClassLoader.java:601)
at java.net.URLClassLoader$3$1.run(URLClassLoader.java:599)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader$3.next(URLClassLoader.java:598)
at java.net.URLClassLoader$3.hasMoreElements(URLClassLoader.java:623)
at sun.misc.CompoundEnumeration.next(CompoundEnumeration.java:45)
at sun.misc.CompoundEnumeration.hasMoreElements(CompoundEnumeration.java:54)
at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:354)
at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:289)
at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
at javax.xml.parsers.SAXParserFactory.newInstance(SAXParserFactory.java:127)
at org.apache.tika.detect.XmlRootExtractor.extractRootElement(XmlRootExtractor.java:51)
at org.apache.tika.detect.XmlRootExtractor.extractRootElement(XmlRootExtractor.java:42)
at org.apache.tika.mime.MimeTypes.getMimeType(MimeTypes.java:206)
at org.apache.tika.mime.MimeTypes.detect(MimeTypes.java:472)
at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77)
at com.norconex.importer.doc.ContentTypeDetector.doDetect(ContentTypeDetector.java:111)
at com.norconex.importer.doc.ContentTypeDetector.detect(ContentTypeDetector.java:75)
at com.norconex.importer.Importer.doImportDocument(Importer.java:233)
at com.norconex.importer.Importer.importDocument(Importer.java:195)
at com.norconex.importer.Importer.doImportDocument(Importer.java:280)
at com.norconex.importer.Importer.importDocument(Importer.java:195)
at com.norconex.importer.Importer.doImportDocument(Importer.java:280)
at com.norconex.importer.Importer.importDocument(Importer.java:195)
at com.norconex.importer.Importer.doImportDocument(Importer.java:280)
[...]_
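The alternating `importDocument` / `doImportDocument` frames in the trace above are the usual signature of recursion that goes one level deeper for each nested item until the thread stack runs out. As a rough illustration only (this is not Norconex code; `Doc`, `chain`, `countUnbounded`, and `countBounded` are hypothetical names invented for this sketch), the following shows how deep nesting can trigger a `StackOverflowError` and how a depth limit avoids it:

```java
public class RecursionDepthDemo {

    /** Hypothetical nested document: each document may embed one child. */
    static final class Doc {
        Doc child; // null when nothing is embedded
    }

    /** Builds a chain of `depth` documents, each embedding the next (iteratively). */
    static Doc chain(int depth) {
        Doc root = new Doc();
        Doc current = root;
        for (int i = 1; i < depth; i++) {
            current.child = new Doc();
            current = current.child;
        }
        return root;
    }

    /** Unbounded recursion: uses one stack frame per nesting level. */
    static int countUnbounded(Doc doc) {
        if (doc == null) {
            return 0;
        }
        return 1 + countUnbounded(doc.child);
    }

    /** Bounded recursion: stops descending after maxDepth levels. */
    static int countBounded(Doc doc, int maxDepth) {
        if (doc == null || maxDepth <= 0) {
            return 0;
        }
        return 1 + countBounded(doc.child, maxDepth - 1);
    }

    /** Returns true if the unbounded walk exhausts the thread stack. */
    static boolean overflows(Doc doc) {
        try {
            countUnbounded(doc);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Assumption: one million levels is far beyond a default-sized thread stack.
        Doc deep = chain(1_000_000);
        System.out.println("overflowed: " + overflows(deep));
        System.out.println("bounded count: " + countBounded(deep, 100));
    }
}
```

Whether a real crawl hits this limit presumably depends on the document and on what the selector matches, as noted in the reply; the sketch only shows the general mechanism.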