Feature suggestion: Add more context in shouldVisit-method.
New method in WebCrawler: shouldVisit(Page page, WebURL next)
If the page has not been downloaded yet (as is the case with redirects), the
page argument should be null.
The current shouldVisit-method is good enough for most cases, but sometimes it
is helpful to also be able to check the context in which the WebURL has been
found.
For backwards compatibility, I suggest that the default implementation calls the
old method:
public boolean shouldVisit(Page page, WebURL next) {
    return shouldVisit(next);
}
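
To illustrate why the extra context is useful, here is a rough sketch of a crawler that only follows links staying on the same host as the page they were found on. It is not taken from crawler4j: the WebURL, Page and WebCrawler classes below are simplified stand-ins so the example compiles on its own, and SameHostCrawler and ShouldVisitDemo are hypothetical names. It assumes the proposed overload exists with the signature shown above.

```java
import java.net.URI;

// Simplified stand-in for crawler4j's WebURL (assumption: only the URL string is needed here).
class WebURL {
    private final String url;
    WebURL(String url) { this.url = url; }
    String getURL() { return url; }
}

// Simplified stand-in for crawler4j's Page (assumption: it knows the URL it was fetched from).
class Page {
    private final WebURL webURL;
    Page(WebURL webURL) { this.webURL = webURL; }
    WebURL getWebURL() { return webURL; }
}

// Stand-in base class carrying the existing hook plus the proposed overload.
class WebCrawler {
    // Existing single-argument method.
    public boolean shouldVisit(WebURL url) {
        return true;
    }

    // Proposed method: by default it delegates to the old one, so existing
    // subclasses keep working unchanged. page may be null (e.g. redirects).
    public boolean shouldVisit(Page page, WebURL next) {
        return shouldVisit(next);
    }
}

// Hypothetical crawler that uses the referring page as context.
class SameHostCrawler extends WebCrawler {
    @Override
    public boolean shouldVisit(Page page, WebURL next) {
        if (page == null) {
            // No context available, fall back to the URL-only check.
            return shouldVisit(next);
        }
        String from = URI.create(page.getWebURL().getURL()).getHost();
        String to = URI.create(next.getURL()).getHost();
        return from != null && from.equalsIgnoreCase(to);
    }
}

class ShouldVisitDemo {
    public static void main(String[] args) {
        WebCrawler crawler = new SameHostCrawler();
        Page page = new Page(new WebURL("http://example.com/index.html"));
        System.out.println(crawler.shouldVisit(page, new WebURL("http://example.com/about.html")));
        System.out.println(crawler.shouldVisit(page, new WebURL("http://other.example.org/")));
        System.out.println(crawler.shouldVisit(null, new WebURL("http://other.example.org/")));
    }
}
```

A crawler that only overrides the old one-argument method would behave exactly as before, since the default two-argument implementation just forwards to it.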
Original issue reported on code.google.com by mstrofpp...@gmail.com on 8 Jun 2012 at 1:22