Open havardthom opened 5 years ago
So it's not stuck, just very, very slow: 2 hours to inject one URL. It is currently at this stage:
./build/apache-nutch-2.3.1/runtime/local/bin/nutch inject ./seed/urls.txt
InjectorJob: starting at 2018-10-23 13:19:22
InjectorJob: Injecting urlDir: seed/urls.txt
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2018-10-23 15:21:13, elapsed: 02:01:50
Generate urls:
./build/apache-nutch-2.3.1/runtime/local/bin/nutch generate -topN 5
GeneratorJob: starting at 2018-10-23 15:21:14
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: true
GeneratorJob: normalizing: true
GeneratorJob: topN: 5
Hi, I just installed this crawler and I'm having an issue. I'm testing the crawler with just one URL, and it seems to get stuck on the Nutch InjectorJob; nothing happens after the following:
Installation and setup went fine, except for a warning when I ran
./gradlew buildPlugin
:[ant:taskdef] Could not load definitions from resource org/sonar/ant/antlib.xml. It could not be found.
Any idea what might be wrong here?