OkkeKlein closed this issue 9 years ago
Thanks for reporting this. I'm looking into it.
The risks of dealing with snapshots :-) I fixed the compile error in the Importer project, which was caused by a code change in the latest PDFBox snapshot, and deployed new snapshot releases of both the Importer and the HTTP Collector.
Fixed.
Exception in thread "pool-1-thread-1" java.lang.NoSuchMethodError: org.apache.pdfbox.pdmodel.PDDocumentInformation.getDictionary()Lorg/apache/pdfbox/cos/COSDictionary;
    at org.apache.tika.parser.pdf.EnhancedPDFParser.extractMetadata(EnhancedPDFParser.java:300)
    at org.apache.tika.parser.pdf.EnhancedPDFParser.parse(EnhancedPDFParser.java:162)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:117)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:117)
    at com.norconex.importer.parser.impl.AbstractTikaParser$MergeEmbeddedParser.parse(AbstractTikaParser.java:374)
    at com.norconex.importer.parser.impl.AbstractTikaParser.parseDocument(AbstractTikaParser.java:159)
    at com.norconex.importer.Importer.parseDocument(Importer.java:414)
    at com.norconex.importer.Importer.importDocument(Importer.java:314)
    at com.norconex.importer.Importer.doImportDocument(Importer.java:267)
    at com.norconex.importer.Importer.importDocument(Importer.java:195)
    at com.norconex.collector.core.pipeline.importer.ImportModuleStage.execute(ImportModuleStage.java:35)
    at com.norconex.collector.core.pipeline.importer.ImportModuleStage.execute(ImportModuleStage.java:26)
    at com.norconex.commons.lang.pipeline.Pipeline.execute(Pipeline.java:90)
    at com.norconex.collector.http.crawler.HttpCrawler.executeImporterPipeline(HttpCrawler.java:213)
    at com.norconex.collector.core.crawler.AbstractCrawler.processNextQueuedCrawlData(AbstractCrawler.java:473)
    at com.norconex.collector.core.crawler.AbstractCrawler.processNextReference(AbstractCrawler.java:373)
    at com.norconex.collector.core.crawler.AbstractCrawler$ProcessURLsRunnable.run(AbstractCrawler.java:631)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
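For anyone who runs into a similar trace: a NoSuchMethodError at runtime generally means the code was compiled against one version of a library while a different version ended up on the classpath. A rough way to check which situation you are in is to probe for the method by reflection. The sketch below is only an illustration, not part of the Importer or the HTTP Collector; the `ClasspathCheck` class and its `hasNoArgMethod` helper are hypothetical names, and the check only covers public no-argument methods such as the `getDictionary()` named in the trace.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of Norconex, Tika, or PDFBox.
public class ClasspathCheck {

    /**
     * Returns true if the named class can be loaded from the current
     * classpath and exposes a public no-argument method with this name.
     */
    public static boolean hasNoArgMethod(String className, String methodName) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> cls = Class.forName(
                    className, false, ClasspathCheck.class.getClassLoader());
            Method m = cls.getMethod(methodName);
            return m != null;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or class could not be linked
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names taken from the stack trace above
        String cls = "org.apache.pdfbox.pdmodel.PDDocumentInformation";
        System.out.println("getDictionary() present: "
                + hasNoArgMethod(cls, "getDictionary"));
    }
}
```

Run it with the same classpath as the collector. If it reports the method as absent, the PDFBox jar being loaded does not match the one the parser was compiled against, and updating to matching snapshot builds (as was done here) is the likely fix.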