-
Before I do anything with this, I wanted to check whether there is a reason why HTTP 1.0 is hardcoded. Right now, line 167 of HttpResponse sets the protocol version to 1.0, regardless of whether `…
-
```
What steps will reproduce the problem?
Running the crawler sometimes crashes the JVM. I crawl around 10 web sites
regularly with pages between 1K and 50K. This happens randomly but happens very …
-
```
See "Order of precedence for group-member records" section at the end of
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
```
Original issue reported on code.google…
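
The precedence rule referenced above can be sketched roughly as follows. This is only an illustration, not the project's parser: the `RobotsPrecedenceSketch` class, the `Rule` type, and `isAllowed` are hypothetical names, wildcards (`*`, `$`) are not handled, and the tie-break in favour of `allow` when two matching paths have equal length is an assumption. The core idea is that the matching rule with the longest path wins.

```java
import java.util.List;

// Hypothetical sketch of "most specific (longest) path wins" for one user-agent group.
class RobotsPrecedenceSketch {

    // One group-member record: an allow or disallow line with its path prefix.
    static final class Rule {
        final boolean allow;
        final String path;

        Rule(boolean allow, String path) {
            this.allow = allow;
            this.path = path;
        }
    }

    // Returns true if urlPath may be fetched under the given rules.
    static boolean isAllowed(List<Rule> rules, String urlPath) {
        Rule best = null;
        for (Rule rule : rules) {
            // An empty path matches nothing here; plain prefix match only (no wildcards).
            if (rule.path.isEmpty() || !urlPath.startsWith(rule.path)) {
                continue;
            }
            boolean longer = best == null || rule.path.length() > best.path.length();
            // Assumption: on equal length, prefer the less restrictive (allow) rule.
            boolean tieAllow = best != null
                    && rule.path.length() == best.path.length()
                    && rule.allow;
            if (longer || tieAllow) {
                best = rule;
            }
        }
        // No matching rule means the path is allowed.
        return best == null || best.allow;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule(false, "/folder"),
                new Rule(true, "/folder/page"));
        System.out.println(isAllowed(rules, "/folder/page.html")); // true: longer allow wins
        System.out.println(isAllowed(rules, "/folder/other"));     // false: only disallow matches
        System.out.println(isAllowed(rules, "/elsewhere"));        // true: no rule matches
    }
}
```

Note that the result does not depend on the order in which the rules appear in the file, which is the main difference from a first-match strategy.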
-
```
We have loads of fine-grained methods available to us via FetchedResult.
I think it would be really cool, however, if we were able to print a report of
the FetchedResult including some timing statis…
-
Hi Folks,
The NOTICE file should be similar to the following:
https://github.com/apache/nutch/blob/trunk/NOTICE.txt
License file headers are important.
They should all be Apache v2.0 license headers.