-
intentionally? Does this need to be restored? Does the REST API support either of these?
-
Needs a discussion with Brittain. I don't think this is hard, but it is pretty useful. It would be good to chat with the Nutch folks and ask them what other kinds of things are available and make sens…
-
```
What steps will reproduce the problem?
Running the crawler crashes the JVM sometimes. I crawl around 10 web sites
regularly with pages between 1K and 50K. This happens randomly but happens very …
-
```
We have loads of fine-grained methods available to us via FetchedResult.
I think it would be really cool, however, if we were able to print a report of
the FetchedResult including some timing statis…
-
I used the following command to initialize the Nutch object.
```
nt = Nutch('crawlTest', urlDir='urls/', serverEndpoint='http://localhost:8081')
```
But it gave me the following error:
```
nutch.py:…
-
Bug report from Shadi Saleh propatrio@gmail.com
When installing, the following errors occur:
```
[INFO] Scanning for projects...
[INFO]
[INFO] ----------------------------------------------------…
-
Hi,
I am trying to run HiBench for Nutch indexing. When I try to generate the data for 2 million pages, I get the following error after Map 100% and Reduce 100%. If anyone has faced a similar issue, ple…
-
When I tried to use link_reader instead of sequence_reader with the following command,
```
import os
import nutchpy
path = os.path.dirname(nutchpy.__file__)
path = os.path.join(path,"ex_data", "crawld…