stewartmckee / cobweb

Web crawler with very flexible crawling options. It can be used either standalone or with Resque to perform clustered crawls.
MIT License
226 stars, 45 forks

Improved handling of redirects #9

Closed (rojotek closed 11 years ago)

rojotek commented 11 years ago

Stew,

I made another tweak to crawl_job to help with redirects.

It now moves the URL from the queued list to the crawled list for both the original URL and the content URL (the URL the content was actually served from), so that redirects are handled.
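
As a rough illustration of the idea, here is a minimal, self-contained sketch. It is not the actual crawl_job code: the `CrawlState` class, its method names, and the example URLs are all hypothetical, and plain in-memory sets stand in for whatever storage cobweb uses for its queued and crawled lists.

```ruby
require "set"

# Hypothetical stand-in for the crawl bookkeeping. The class and method
# names are illustrative only and are not part of cobweb's API.
class CrawlState
  attr_reader :queued, :crawled

  def initialize
    @queued = Set.new
    @crawled = Set.new
  end

  # Queue a URL unless it has already been crawled.
  def enqueue(url)
    @queued << url unless @crawled.include?(url)
  end

  # Move both the requested URL and the URL the content finally came from
  # (different when a redirect was followed) from queued to crawled.
  def mark_crawled(original_url, content_url)
    [original_url, content_url].uniq.each do |url|
      @queued.delete(url)
      @crawled << url
    end
  end
end

state = CrawlState.new
state.enqueue("http://example.com/old")
state.enqueue("http://example.com/new")

# Suppose /old redirected to /new: both URLs are marked in one step.
state.mark_crawled("http://example.com/old", "http://example.com/new")

# A later attempt to queue the redirect target is ignored.
state.enqueue("http://example.com/new")
```

Marking both URLs means that a link to either the original address or the redirect target will not be queued again once the content has been fetched.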