asepaprianto / crawler4j

Automatically exported from code.google.com/p/crawler4j

Update/Delete URLs, functionality #277

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi Team,
Right now crawler4j crawls the URLs in the seed list, parses every link it 
finds, and treats each one as new, and so on. Is there a way to update or 
delete those URLs?

Original issue reported on code.google.com by edgar.ri...@gmail.com on 14 Aug 2014 at 5:23

GoogleCodeExporter commented 9 years ago
I am not sure I understood your question.

What exactly do you need?

Original comment by avrah...@gmail.com on 14 Aug 2014 at 12:46

GoogleCodeExporter commented 9 years ago
What is the best way to detect that a URL's content has changed?
Does refresh interval functionality exist? (The refresh interval is 
sometimes referred to as the crawl cycle, refresh cycle or simply refresh.)

Original comment by edgar.ri...@gmail.com on 18 Aug 2014 at 5:05
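For readers landing on this question: one common way to detect that a URL's content has changed between crawl cycles is to store a hash of the content per URL and compare it on the next visit. The sketch below is only an illustration under that assumption. `ContentChangeTracker`, `record`, and `forget` are hypothetical names, not part of crawler4j, and the thread does not confirm that crawler4j offers anything equivalent. It keeps fingerprints in memory, so a real crawler would need persistent storage.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of crawler4j: tracks a content hash per URL.
public class ContentChangeTracker {

    public enum Status { NEW, CHANGED, UNCHANGED }

    // URL -> hex SHA-256 of the content seen on the previous crawl.
    private final Map<String, String> fingerprints = new HashMap<>();

    // Stores the fingerprint of the latest content and reports how it
    // compares with the fingerprint stored earlier for the same URL.
    public Status record(String url, String content) {
        String current = sha256Hex(content);
        String previous = fingerprints.put(url, current);
        if (previous == null) {
            return Status.NEW;
        }
        return previous.equals(current) ? Status.UNCHANGED : Status.CHANGED;
    }

    // Drops a URL from the tracker; returns true if it was known.
    public boolean forget(String url) {
        return fingerprints.remove(url) != null;
    }

    private static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

A crawler could call `record(url, pageText)` each time it fetches a page on a refresh cycle and re-index only when the result is `NEW` or `CHANGED`. Hashing the raw text flags any difference, including timestamps or ads, so normalizing the content first may be needed in practice.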

GoogleCodeExporter commented 9 years ago
I am moving this discussion to the forum as this is not a bug or a feature 
request.

https://groups.google.com/forum/#!topic/crawler4j/YpW1YKN6ntQ

Original comment by avrah...@gmail.com on 18 Aug 2014 at 7:21