opensangja / abot

Automatically exported from code.google.com/p/abot
Apache License 2.0

Create a PoliteWebCrawler #9

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Create a PoliteWebCrawler.

-Add throttling
-Add a manually configured crawl delay
-Respect the robots.txt Crawl-delay directive
-Respect the robots.txt Disallow directive
-Respect meta robots noindex and nofollow
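
As a rough illustration of the checks listed above (not Abot's actual API, which is C#), here is a minimal Python sketch using only the standard library. The names `PolitenessPolicy`, `meta_robots`, `ROBOTS_TXT`, and the `examplebot` user agent are hypothetical and chosen for the example. It assumes the effective delay is the larger of the manual delay and the robots.txt Crawl-delay. Throttling is not covered.

```python
import time
from html.parser import HTMLParser
from urllib import robotparser

# Hypothetical robots.txt content used only for this example.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""


class PolitenessPolicy:
    """Hypothetical helper: decides whether and when a URL may be fetched."""

    def __init__(self, robots_txt, user_agent, manual_delay=0.0,
                 clock=time.monotonic):
        self.user_agent = user_agent
        self.parser = robotparser.RobotFileParser()
        self.parser.parse(robots_txt.splitlines())
        self.manual_delay = manual_delay
        self.clock = clock          # injectable so the delay logic is testable
        self.last_request = None    # time of the most recent request, if any

    def is_allowed(self, url):
        # Respect the robots.txt Disallow directive.
        return self.parser.can_fetch(self.user_agent, url)

    def delay(self):
        # Assumption: use the larger of the manual and robots.txt delays.
        robots_delay = self.parser.crawl_delay(self.user_agent) or 0
        return max(self.manual_delay, float(robots_delay))

    def seconds_to_wait(self):
        # How long to sleep before the next request to stay polite.
        if self.last_request is None:
            return 0.0
        elapsed = self.clock() - self.last_request
        return max(0.0, self.delay() - elapsed)

    def record_request(self):
        self.last_request = self.clock()


class _MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower()
                for part in content.split(",") if part.strip()
            )


def meta_robots(html):
    """Return the set of meta robots directives found in an HTML page."""
    parser = _MetaRobotsParser()
    parser.feed(html)
    return parser.directives
```

A crawler loop built on this sketch would call `is_allowed(url)` before fetching, sleep for `seconds_to_wait()`, call `record_request()` after each fetch, and then skip indexing or link extraction when `meta_robots(page)` contains `noindex` or `nofollow`.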

Original issue reported on code.google.com by sjdir...@gmail.com on 27 Sep 2012 at 11:47

GoogleCodeExporter commented 9 years ago

Original comment by sjdir...@gmail.com on 27 Sep 2012 at 11:49

GoogleCodeExporter commented 9 years ago

Original comment by sjdir...@gmail.com on 27 Sep 2012 at 11:52

GoogleCodeExporter commented 9 years ago

Original comment by sjdir...@gmail.com on 5 Jan 2013 at 7:50

GoogleCodeExporter commented 9 years ago

Original comment by sjdir...@gmail.com on 1 Mar 2013 at 12:55

GoogleCodeExporter commented 9 years ago
Completed all but throttling and robots nofollow. Created issues 74 and 75 to address these specifically.

Original comment by sjdir...@gmail.com on 1 Mar 2013 at 1:04