mohankreddy / crawler4j

Automatically exported from code.google.com/p/crawler4j

support for robots.txt #2

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Does crawler4j support robots.txt?

Original issue reported on code.google.com by ruKy...@gmail.com on 15 Mar 2010 at 6:48

GoogleCodeExporter commented 9 years ago
No, the current version does not support robots.txt.

Original comment by ganjisaffar@gmail.com on 15 Mar 2010 at 6:55

GoogleCodeExporter commented 9 years ago

Original comment by ganjisaffar@gmail.com on 14 Apr 2010 at 7:39

GoogleCodeExporter commented 9 years ago
Without robots.txt support, the crawler will get banned rather quickly by a lot
of webmasters.

Original comment by co...@dijkgraaf.org on 1 Nov 2010 at 12:14
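
For readers wondering what honoring robots.txt involves, here is a minimal sketch in Java. It is not crawler4j code: `RobotsRules`, `parse`, and `isAllowed` are hypothetical names made up for illustration. It assumes a deliberately simplified reading of the format, where only `Disallow` prefixes in groups addressed to `User-agent: *` are applied, and `Allow` lines and wildcard patterns are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of crawler4j: a simplified robots.txt check.
final class RobotsRules {
    private final List<String> disallowed = new ArrayList<>();

    // Collects Disallow prefixes from groups that include "User-agent: *".
    // Assumption: Allow lines and wildcard patterns are ignored.
    static RobotsRules parse(String robotsTxt) {
        RobotsRules rules = new RobotsRules();
        boolean appliesToAll = false;   // current group targets every agent
        boolean inAgentLines = false;   // previous directive was a User-agent line
        for (String rawLine : robotsTxt.split("\\r?\\n")) {
            // Strip trailing comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    appliesToAll = false; // a new group starts here
                }
                appliesToAll |= value.equals("*");
                inAgentLines = true;
            } else {
                inAgentLines = false;
                if (field.equals("disallow") && appliesToAll && !value.isEmpty()) {
                    rules.disallowed.add(value);
                }
            }
        }
        return rules;
    }

    // Returns false when the URL path starts with any collected Disallow prefix.
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

A crawler would typically fetch `/robots.txt` once per host, parse it, and call something like `isAllowed(path)` before requesting each page from that host.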

GoogleCodeExporter commented 9 years ago
Hi, I would like to set up the crawler to crawl a website, say a blog, fetch
only the links on that site, and write them to a text file.
Can you guide me step by step through setting up the crawler? I am using Eclipse.
Thanks for your attention.

Original comment by smith.lu...@gmail.com on 16 Feb 2011 at 5:13

GoogleCodeExporter commented 9 years ago
This feature is implemented in version 2.6.

-Yasser

Original comment by ganjisaffar@gmail.com on 12 Mar 2011 at 5:13