I downloaded the whole source code and built it successfully. When I tried to run a crawl test:
bin/crawl urls/ TestCrawl/ http://localhost:8983/solr/nutch 2
I ran into this URI path name issue.
hadoop.log.zip
I have attached this log file. It seems the HDFS file path name special characters issue is still there?
2016-01-03 13:27:08,405 INFO fetcher.Fetcher - Fetcher: starting at 2016-01-03 13:27:08
2016-01-03 13:27:08,405 INFO fetcher.Fetcher - Fetcher: segment: TestCrawl/segments/drwxr-xr-xnn4nstevennstaffnn136nJannn3n13:24n20160103090925
2016-01-03 13:27:08,406 INFO fetcher.Fetcher - Fetcher Timelimit set for : 1451809628406
2016-01-03 13:27:08,631 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-01-03 13:27:08,677 ERROR fetcher.Fetcher - Fetcher: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: drwxr-xr-xnn4nstevennstaffnn136nJannn3n13:24n20160103090925
at org.apache.hadoop.fs.Path.initialize(Path.java:148)
at org.apache.hadoop.fs.Path.<init>(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:50)
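One possible reading of the log: the segment name looks like a line of long-format `ls` output (permissions, link count, owner, group, size, date) with the whitespace replaced by `n`, followed by the real segment timestamp `20160103090925`. I have not confirmed how the crawl script actually picks the segment, so the following is only a minimal sketch of choosing the newest segment from directory names alone. `latest_segment` is a hypothetical helper, not something that exists in bin/crawl, and it assumes a local (non-HDFS) crawl directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of bin/crawl): print the name of the newest
# segment directory under the given segments directory.
# Assumption: segment directories are named by timestamp (digits only),
# e.g. 20160103090925, so a numeric sort puts the newest one last.
latest_segment() {
  segments_dir="$1"
  for d in "$segments_dir"/*/; do
    # Skip the unexpanded glob when the directory is empty or missing.
    [ -d "$d" ] || continue
    # basename drops the trailing slash and the parent path.
    basename "$d"
  done | grep -E '^[0-9]+$' | sort -n | tail -n 1
}

# Example use with the crawl directory from the command above.
CRAWL_PATH="${CRAWL_PATH:-TestCrawl}"
SEGMENT=$(latest_segment "$CRAWL_PATH/segments")
echo "segment: $CRAWL_PATH/segments/$SEGMENT"
```

Because only directory names are used, nothing from the permission or date columns of a listing can end up in the path passed to the fetcher.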
Or simply use the following command to start the crawl:
runtime/local/bin/nutch crawl urls/ -solr http://localhost:8983/solr/ -dir TestCrawl -depth 3 -topN 50