kalhomoud closed 9 years ago
There is a snapshot release of Importer (2.1.0-SNAPSHOT) available for download that uses an updated version of Tika. It would be helpful if you could try to reproduce the problem with that version, to confirm whether this is an issue Apache has already fixed.
Sorry, with all the GitHub emails I'm receiving, I must have missed this one. I will try to reproduce it now.
I can confirm that this issue is no longer there with MP4 files.
INFO [CrawlerEventManager] DOCUMENT_METADATA_FETCHED: file:///Users/alhomoud/Music/small.mp4 (Subject: com.norconex.collector.fs.pipeline.importer.FileImporterPipeline$FileMetadataFetcherStage@1aba0e25)
INFO [CrawlerEventManager] DOCUMENT_FETCHED: file:///Users/alhomoud/Music/small.mp4 (Subject: com.norconex.collector.fs.pipeline.importer.FileImporterPipeline$DocumentFetchStage@29e9dad8)
Thanks for the feedback.
Part of 2.1.0 release.
It seems like a library is missing for MP4 parsing:

Exception in thread "pool-1-thread-1" INFO [FilesystemCrawler] Projects: Re-processing orphan Files (if any)...
java.lang.NoClassDefFoundError: org/aspectj/lang/Signature
    at org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:117)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
    at com.norconex.importer.parser.impl.AbstractTikaParser$RecursiveMetadataParser.parse(AbstractTikaParser.java:133)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
    at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:169)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:135)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
    at com.norconex.importer.parser.impl.AbstractTikaParser$RecursiveMetadataParser.parse(AbstractTikaParser.java:133)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
    at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:169)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:135)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
    at com.norconex.importer.parser.impl.AbstractTikaParser$RecursiveMetadataParser.parse(AbstractTikaParser.java:133)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
    at org.apache.tika.parser.pkg.CompressorParser.parse(CompressorParser.java:143)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:91)
    at com.norconex.importer.parser.impl.AbstractTikaParser$RecursiveMetadataParser.parse(AbstractTikaParser.java:133)
    at com.norconex.importer.parser.impl.AbstractTikaParser.parseDocument(AbstractTikaParser.java:99)
    at com.norconex.importer.Importer.parseDocument(Importer.java:379)
    at com.norconex.importer.Importer.importDocument(Importer.java:266)
    at com.norconex.collector.fs.crawler.DocumentProcessor$ImportModuleStep.processDocument(DocumentProcessor.java:146)
    at com.norconex.collector.fs.crawler.DocumentProcessor.processURL(DocumentProcessor.java:91)
    at com.norconex.collector.fs.crawler.FilesystemCrawler.processNextQueuedFile(FilesystemCrawler.java:384)
    at com.norconex.collector.fs.crawler.FilesystemCrawler.processNextFile(FilesystemCrawler.java:310)
    at com.norconex.collector.fs.crawler.FilesystemCrawler.access$100(FilesystemCrawler.java:62)
    at com.norconex.collector.fs.crawler.FilesystemCrawler$ProcessFilesRunnable.run(FilesystemCrawler.java:545)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.aspectj.lang.Signature
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 44 more
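
For anyone who wants to confirm the "missing library" diagnosis on their own install, a quick check is to ask the class loader for the class named in the trace. This is only a minimal sketch: the class name `ClasspathCheck` and the helper `isClassAvailable` are made up for illustration and are not part of the Collector or Importer. It would need to be run with the same classpath the collector uses (its lib folder) for the result to mean anything.

```java
// Hypothetical diagnostic helper, not part of Norconex or Tika.
class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized, so no static initializers are run.
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: no jar on the classpath provides it.
            // LinkageError (includes NoClassDefFoundError): found, but a
            // class it depends on could not be loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace above.
        String name = "org.aspectj.lang.Signature";
        System.out.println(name + " available: " + isClassAvailable(name));
    }
}
```

If this reports the class as unavailable, the AspectJ runtime classes are not on the classpath, which would match the ClassNotFoundException above.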