stewartmckee / cobweb

Web crawler with very flexible crawling options. Can be used standalone or with Resque to perform clustered crawls.
MIT License
226 stars, 45 forks

external_urls not treated as external #16

Open stewartmckee opened 11 years ago

stewartmckee commented 11 years ago

External URLs are not treated as external if they match an entry in the cache. When a page is retrieved from the cache, all crawl criteria should be re-checked, because they may have changed since the last crawl.
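
A minimal sketch of the idea, not Cobweb's actual code: the helper names (`internal_url?`, `fetch_from_cache`), the hash-based cache, and the `:internal` flag are all hypothetical and only illustrate re-evaluating the internal/external classification on every cache hit instead of trusting the stored value.

```ruby
require 'uri'

# Hypothetical helper (not part of Cobweb): true when the URL's host is one
# of the hosts currently configured as internal.
def internal_url?(url, internal_hosts)
  host = URI.parse(url).host
  !host.nil? && internal_hosts.include?(host.downcase)
rescue URI::InvalidURIError
  false
end

# Hypothetical cache lookup: returns nil on a miss; on a hit, returns a copy
# of the entry with :internal recomputed from the current crawl options.
def fetch_from_cache(cache, url, internal_hosts)
  entry = cache[url]
  return nil if entry.nil?

  # The cached flag may be stale if the options changed since the last crawl.
  entry.merge(internal: internal_url?(url, internal_hosts))
end

# Entry cached during an earlier crawl, when other.example counted as internal.
cache = {
  'http://other.example/page' => { body: '<html></html>', internal: true }
}

# The current crawl only treats site.example as internal.
result = fetch_from_cache(cache, 'http://other.example/page', ['site.example'])
puts result[:internal]
```

With this approach a cached page is still served from the cache, but whether its links are followed depends on the options of the current crawl rather than the crawl that populated the cache.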