asepaprianto / crawler4j

Automatically exported from code.google.com/p/crawler4j

New feature: Add more context in shouldVisit #160

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Feature suggestion: Add more context in shouldVisit-method.

New method in WebCrawler: shouldVisit(Page page, WebURL next)

If the page has not been downloaded yet (as is the case with redirects), the
page argument should be null.

The current shouldVisit-method is good enough for most cases, but sometimes it 
is helpful to also be able to check the context in which the WebURL has been 
found. 
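
As a rough illustration of the kind of check this extra context would allow, the sketch below follows a link only when it was found on a page under a given URL prefix, and falls back to a URL-only check when the page is null (the redirect case). `Page` and `WebURL` here are minimal stand-in classes, not the real crawler4j types. `ContextAwareCrawler`, the prefix, and the example.com URLs are hypothetical. The two-argument signature follows the proposal in this issue, not a confirmed API.

```java
// Minimal stand-ins for illustration only; the real crawler4j classes carry much more state.
class WebURL {
    private final String url;
    WebURL(String url) { this.url = url; }
    String getURL() { return url; }
}

class Page {
    private final WebURL webURL;
    Page(WebURL webURL) { this.webURL = webURL; }
    WebURL getWebURL() { return webURL; }
}

class ContextAwareCrawler {
    // Hypothetical rule: only follow links that were found on a page in the docs section.
    private static final String TRUSTED_PREFIX = "https://example.com/docs/";

    public boolean shouldVisit(Page page, WebURL next) {
        if (page == null) {
            // No referring page is available (e.g. a redirect), so decide on the URL alone.
            return next.getURL().startsWith("https://example.com/");
        }
        // Decide based on where the link was found, not on the link itself.
        return page.getWebURL().getURL().startsWith(TRUSTED_PREFIX);
    }
}
```

The null branch matters: any implementation of the two-argument method would need to handle a missing page explicitly, otherwise redirects would trigger a NullPointerException.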

For backwards compatibility, I suggest that the default implementation calls the
old method:

public boolean shouldVisit(Page page, WebURL next) {
   // Default: ignore the page context and delegate to the existing method
   return shouldVisit(next);
}
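
To show why delegating by default keeps existing crawlers working, here is a small self-contained sketch. The `WebURL`, `Page`, and `WebCrawler` classes are simplified stand-ins rather than the real crawler4j classes. `LegacyCrawler` and its PDF rule are hypothetical, and the permissive default of the one-argument method is an assumption made for this example.

```java
// Stand-in types for illustration; not the real crawler4j classes.
class WebURL {
    private final String url;
    WebURL(String url) { this.url = url; }
    String getURL() { return url; }
}

class Page { }

class WebCrawler {
    // Existing single-argument hook; returning true by default is an assumption of this sketch.
    public boolean shouldVisit(WebURL url) {
        return true;
    }

    // Proposed overload: by default, ignore the page and delegate to the old method.
    public boolean shouldVisit(Page page, WebURL next) {
        return shouldVisit(next);
    }
}

// A crawler written against the old API: it only overrides the one-argument method.
class LegacyCrawler extends WebCrawler {
    @Override
    public boolean shouldVisit(WebURL url) {
        return !url.getURL().endsWith(".pdf");
    }
}
```

If the framework switched to calling the two-argument method internally, `LegacyCrawler` would still apply its PDF filter, because the inherited default forwards to its override. New crawlers could override the two-argument method instead when they need the page.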

Original issue reported on code.google.com by mstrofpp...@gmail.com on 8 Jun 2012 at 1:22

GoogleCodeExporter commented 9 years ago

Original comment by avrah...@gmail.com on 18 Aug 2014 at 3:21

GoogleCodeExporter commented 9 years ago
Fixed in revision: c874761011d6

Original comment by avrah...@gmail.com on 22 Aug 2014 at 1:16