Closed by laurentprudhon 5 years ago
Take advantage of the checkpoint/restart capability and the new params trace file to add the following important feature:
Read a list of URLs to exclude from the crawl.
(We should reuse the robots exclusion engine.)
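
To make the idea concrete, here is a minimal sketch in Python, not the project's actual code. It assumes the exclusion list is a plain-text file with one absolute URL per line and `#` comments, and it uses the standard library's `urllib.robotparser` as a stand-in for the crawler's robots exclusion engine. The function names `build_exclusion_rules` and `is_excluded` are hypothetical. Each excluded URL is converted into a `Disallow` rule for its host, so the same prefix-matching logic used for robots.txt decides whether a URL is skipped.

```python
from collections import defaultdict
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def build_exclusion_rules(lines):
    """Turn a list of excluded URLs into one robots-style parser per host.

    Assumption: each non-empty, non-comment line is an absolute URL.
    """
    paths_by_host = defaultdict(list)
    for line in lines:
        url = line.strip()
        if not url or url.startswith("#"):
            continue  # skip blank lines and comments
        parts = urlsplit(url)
        # An empty path means "exclude the whole host".
        paths_by_host[parts.netloc.lower()].append(parts.path or "/")

    rules = {}
    for host, paths in paths_by_host.items():
        parser = RobotFileParser()
        # Feed the exclusions to the parser as if they were robots.txt rules.
        parser.parse(["User-agent: *"] + [f"Disallow: {p}" for p in paths])
        rules[host] = parser
    return rules


def is_excluded(rules, url):
    """Return True if the URL matches an exclusion rule for its host."""
    parser = rules.get(urlsplit(url).netloc.lower())
    return parser is not None and not parser.can_fetch("*", url)
```

A crawl loop could call `is_excluded(rules, url)` before scheduling each page. Because the list is read from a file, it could be edited between a checkpoint and a restart, though how it would be recorded in the params trace file is left open here.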