BackupGGCode / dataparksearch

An open source search engine for Internet and Intranet sites
GNU General Public License v2.0

Skip and Robots directives are ignored for hosts that have Crawl-delay in robots.txt #45

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
"Realm skip http://host.com/*" is ignored once the indexer has already received a
Crawl-delay directive for host.com from robots.txt.

Also ignored:

Robots no
Realm skip http://host.com/*

or 

Robots no
Server site http://host.com/

The indexer keeps applying the crawl delay...
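
The expected ordering could be sketched roughly as follows. This is only an illustration in Python, not DataparkSearch's actual code: `plan_fetch`, `skip_patterns`, and `crawl_delays` are hypothetical names, and the wildcard matching is assumed to behave like a simple glob. The point is that skip rules are checked before any Crawl-delay is applied, so a skipped host never causes a wait.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def plan_fetch(url, skip_patterns, crawl_delays):
    """Return ("skip", 0.0) or ("fetch", delay_seconds) for a URL.

    skip_patterns: wildcard patterns such as "http://host.com/*" (assumed glob-style)
    crawl_delays:  hypothetical map of hostname -> Crawl-delay seconds from robots.txt
    """
    # Check skip rules first, so a skipped URL never waits on a crawl delay.
    if any(fnmatchcase(url, pattern) for pattern in skip_patterns):
        return ("skip", 0.0)

    # Only URLs that will actually be fetched are subject to the delay.
    host = urlsplit(url).hostname or ""
    return ("fetch", float(crawl_delays.get(host, 0.0)))
```

With a skip pattern of `http://host.com/*`, a URL on host.com would be dropped immediately, regardless of any Crawl-delay previously recorded for that host.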

Original issue reported on code.google.com by kogcha...@gmail.com on 25 Dec 2012 at 4:41