The crawler stops running further if the start url returns a 302 redirect.

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. If the start URL returns a 302 redirect, the crawler does not crawl any further: 
the assignedURL array stays null, and the crawler does not process the root redirect URL 
to create the list of further links from the redirected content.

What is the expected output? What do you see instead?
The crawler should parse the page at the initial movedToURL and build the 
array of URLs to process from it.

Instead, the crawler does not parse anything further and exits.
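
To make the expected behaviour concrete, here is a minimal, self-contained sketch. It is not crawler4j's actual code: the `Page` class, the `crawl` method and the in-memory `site` map are all hypothetical stand-ins, and the only assumption carried over from the report is that a 3xx response exposes a movedToURL. The point it illustrates is that the redirect target of the seed gets scheduled like any other discovered link, so the URL list is never left empty.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RedirectSeedSketch {

    /** Hypothetical stand-in for a fetched page; not a crawler4j type. */
    static final class Page {
        final int status;
        final String movedToUrl;      // set only for 3xx responses
        final List<String> outlinks;  // links found in a 200 response body

        Page(int status, String movedToUrl, List<String> outlinks) {
            this.status = status;
            this.movedToUrl = movedToUrl;
            this.outlinks = outlinks;
        }

        static Page ok(String... links) {
            return new Page(200, null, Arrays.asList(links));
        }

        static Page redirect(String target) {
            return new Page(302, target, Collections.<String>emptyList());
        }
    }

    /** Crawls the fake site breadth-first; returns the URLs parsed with status 200, in order. */
    static List<String> crawl(String seed, Map<String, Page> site) {
        Deque<String> frontier = new ArrayDeque<>();
        Set<String> seen = new LinkedHashSet<>();
        List<String> parsed = new ArrayList<>();
        frontier.add(seed);
        seen.add(seed);

        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            Page page = site.get(url);
            if (page == null) {
                continue; // URL not present in the fake site: nothing to fetch
            }
            if (page.status >= 300 && page.status < 400 && page.movedToUrl != null) {
                // Schedule the redirect target like any other discovered link,
                // even when the redirecting URL is the seed itself.
                if (seen.add(page.movedToUrl)) {
                    frontier.add(page.movedToUrl);
                }
                continue;
            }
            parsed.add(url);
            for (String link : page.outlinks) {
                if (seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Map<String, Page> site = new HashMap<>();
        site.put("http://example.test/", Page.redirect("http://example.test/home"));
        site.put("http://example.test/home",
                Page.ok("http://example.test/a", "http://example.test/b"));
        site.put("http://example.test/a", Page.ok());
        site.put("http://example.test/b", Page.ok("http://example.test/home"));

        // The seed answers with a 302, yet the crawl continues from its target.
        System.out.println(crawl("http://example.test/", site));
    }
}
```

In this sketch the seed itself is never parsed (it has no body), but the pages behind it are. The behaviour reported above corresponds to the redirect branch dropping the target instead of adding it to the frontier.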

What version of the product are you using?
4.3-3 (latest)

Please provide any additional information below.

Original issue reported on code.google.com by sun...@gmail.com on 10 May 2012 at 11:05

GoogleCodeExporter commented 9 years ago

Sorry, the version is 3.3.

Original comment by sun...@gmail.com on 10 May 2012 at 11:06

GoogleCodeExporter commented 9 years ago

Please provide an example URL.

Original comment by avrah...@gmail.com on 11 Aug 2014 at 1:51

GoogleCodeExporter commented 9 years ago

Original comment by avrah...@gmail.com on 18 Aug 2014 at 3:19

Changed state: Accepted
Added labels: Priority-High
Removed labels: Priority-Medium

GoogleCodeExporter commented 9 years ago

I tried beginning the crawl with a redirect and it worked.

Unless you provide a specific example, I will have to close this issue as invalid.

Original comment by avrah...@gmail.com on 21 Aug 2014 at 9:37

GoogleCodeExporter commented 9 years ago

Works for me.

Original comment by avrah...@gmail.com on 23 Sep 2014 at 2:01

Changed state: Invalid

mohankreddy / crawler4j

The crawler stops running further if the start url returns a 302 redirect. #152