mohankreddy / crawler4j

Automatically exported from code.google.com/p/crawler4j

fetcher.PageFetcher: Failed: HTTP/1.1 400 Bad Request #121

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run a crawler for more than 1 hour.

What is the expected output? What do you see instead?

What version of the product are you using?
Newest.

Please provide any additional information below.
When I run the crawler, everything works perfectly at first. After some time
it starts returning 400 Bad Request errors. The link itself is good: if I curl
it from the same machine that is running the crawler, it gets the whole file
without any problem, which means this is not a ban.
What could be the issue?

Original issue reported on code.google.com by krzemien...@gmail.com on 7 Feb 2012 at 3:55

GoogleCodeExporter commented 9 years ago
I have run the crawler for hours and days without any problem with the
PageFetcher. Try increasing the delay between requests; it may be too short,
so you are flooding your ISP or the server.

Original comment by AldoCast...@gmail.com on 23 Apr 2013 at 11:12
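
The suggestion above amounts to enforcing a minimum gap between consecutive requests. As far as I know, crawler4j exposes this as a politeness delay on its CrawlConfig; the sketch below does not use that API. It is a standalone, hypothetical helper (the class name PolitenessThrottle and its methods are made up for illustration) that shows the idea using only the Java standard library.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper, NOT part of crawler4j: enforces a minimum
 * delay between consecutive requests made through one instance.
 */
class PolitenessThrottle {
    private final long delayMillis;
    private boolean hasRequested = false;
    private long lastRequestMillis = 0;

    PolitenessThrottle(long delayMillis) {
        if (delayMillis < 0) {
            throw new IllegalArgumentException("delayMillis must be >= 0");
        }
        this.delayMillis = delayMillis;
    }

    /** Returns how long a caller should wait at time nowMillis before sending. */
    synchronized long millisToWait(long nowMillis) {
        if (!hasRequested) {
            return 0; // the first request never waits
        }
        long elapsed = nowMillis - lastRequestMillis;
        return Math.max(0, delayMillis - elapsed);
    }

    /** Records that a request was sent at time nowMillis. */
    synchronized void markRequest(long nowMillis) {
        hasRequested = true;
        lastRequestMillis = nowMillis;
    }

    /** Blocks until the minimum delay has passed, then reserves the slot. */
    void acquire() throws InterruptedException {
        long wait;
        synchronized (this) {
            long now = System.currentTimeMillis();
            wait = millisToWait(now);
            // Reserve the send time up front so concurrent callers queue behind it.
            markRequest(now + wait);
        }
        if (wait > 0) {
            TimeUnit.MILLISECONDS.sleep(wait);
        }
    }
}
```

A fetch loop would call acquire() before each request. If the 400 responses stop once the delay is raised, request rate was a likely factor; if they persist, the cause probably lies elsewhere.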

GoogleCodeExporter commented 9 years ago
Works for me; a more detailed reproduction scenario is required.

Original comment by avrah...@gmail.com on 11 Aug 2014 at 1:09