2018-12-08 10:32:15,172 [main] ERROR [o.d.s.a.i.RDFAnalyzer ] - <Exception while analyzing. Aborting. >
org.apache.jena.riot.RiotException: Failed to determine the content type: (URI=/tmp/fetched_7307247588223997472 : stream=null)
    at org.apache.jena.riot.RDFDataMgr.process(RDFDataMgr.java:856)
    at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:667)
    at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:637)
    at org.apache.jena.riot.RDFDataMgr.parse(RDFDataMgr.java:626)
    at org.dice_research.squirrel.analyzer.impl.RDFAnalyzer.analyze(RDFAnalyzer.java:65)
    at org.dice_research.squirrel.analyzer.manager.SimpleAnalyzerManager.analyze(SimpleAnalyzerManager.java:99)
    at org.dice_research.squirrel.worker.impl.WorkerImpl.performCrawling(WorkerImpl.java:267)
    at org.dice_research.squirrel.worker.impl.WorkerImpl.crawl(WorkerImpl.java:206)
    at org.dice_research.squirrel.worker.impl.WorkerImpl.run(WorkerImpl.java:150)
    at org.dice_research.squirrel.components.WorkerComponent.run(WorkerComponent.java:131)
    at org.dice_research.squirrel.components.WorkerComponentStarter.main(WorkerComponentStarter.java:42)
2018-12-08 10:32:15,299 [main] ERROR [o.d.s.w.i.WorkerImpl ] - <Unhandled exception while crawling "http://creativecommons.org/wp-content/plugins/jetpack/modules/theme-tools/compat/twentysixteen.css". It will be ignored.>
java.lang.NullPointerException
    at org.dice_research.squirrel.worker.impl.WorkerImpl.sendNewUris(WorkerImpl.java:334)
    at org.dice_research.squirrel.worker.impl.WorkerImpl.performCrawling(WorkerImpl.java:268)
    at org.dice_research.squirrel.worker.impl.WorkerImpl.crawl(WorkerImpl.java:206)
    at org.dice_research.squirrel.worker.impl.WorkerImpl.run(WorkerImpl.java:150)
    at org.dice_research.squirrel.components.WorkerComponent.run(WorkerComponent.java:131)
    at org.dice_research.squirrel.components.WorkerComponentStarter.main(WorkerComponentStarter.java:42)
Problem
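The worker logs the two errors shown above while crawling a CSS file: Jena's RDFDataMgr fails to determine the content type of the fetched file, and WorkerImpl.sendNewUris then throws a NullPointerException.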