fizx / robots

robots.txt parser

Error in Parser? #3

Open rb2k opened 14 years ago

rb2k commented 14 years ago

This is what I do:

    ruby-head > bla = Robots.new("test")
    => #<Robots:0x000001008743c0 @user_agent="test", @parsed={}>
    ruby-head > bla.allowed?("http://lacostarecords.net/")
    => false

This is the robots.txt:

# Block a bot that was causing issues by ignoring Disallow lines below

User-Agent: OmniExplorer_Bot
Disallow: /
# Block hotlinking of music files by projectplaylist.com due to perceived user bandwidth theft
User-agent: projectplaylist-directlink
Disallow: /

# Block all bots from the core Homestead site
User-agent: *
Disallow: /~site/Scripts_ElementMailer
Disallow: /~site/Scripts_ExternalRedirect
Disallow: /~site/Scripts_ForSale
Disallow: /~site/Scripts_HitCounter
Disallow: /~site/Scripts_NewGuest
Disallow: /~site/Scripts_RealTracker
Disallow: /~site/Scripts_Track
Disallow: /~site/Scripts_WebPoll

It should return true, shouldn't it?

rb2k commented 14 years ago

Same for http://pokerstarpro.net/robots.txt:

User-agent: *
Disallow: /site/
Disallow: /common/forsale/
Disallow: /common/roar/redir
Disallow: /common/roar/results.htm
Disallow: /common/advertise/advertising.htm

User-Agent: MJ12bot
Disallow:   /

User-agent: ShopWiki
Disallow: /

rb2k commented 14 years ago

OK, it seems the inconsistent capitalization of the field name ("User-Agent" vs. "User-agent") might be killing the parser. Field names in robots.txt are conventionally matched case-insensitively, so both spellings should be treated the same.
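For illustration, here is a minimal sketch (not the fizx/robots gem's actual code) of a parser that downcases field names before matching, so `User-Agent:` and `User-agent:` lines both start an agent group. The names `parse_robots` and `allowed?` are made up for this example, and `allowed?` takes a bare URL path rather than a full URL for simplicity:

```ruby
# Parse a robots.txt body into { user_agent => [disallowed path prefixes] },
# matching field names case-insensitively.
def parse_robots(body)
  rules = Hash.new { |h, k| h[k] = [] }
  current_agents = []
  last_was_agent = false
  body.each_line do |line|
    line = line.sub(/#.*/, '').strip   # drop comments and surrounding whitespace
    next if line.empty?
    key, value = line.split(':', 2)
    next if value.nil?
    key   = key.strip.downcase         # "User-Agent" and "User-agent" both become "user-agent"
    value = value.strip
    case key
    when 'user-agent'
      # Consecutive User-agent lines share one rule group; otherwise start a new group.
      current_agents = [] unless last_was_agent
      current_agents << value.downcase
      last_was_agent = true
    when 'disallow'
      # An empty Disallow value means "allow everything", so record nothing.
      current_agents.each { |a| rules[a] << value } unless value.empty?
      last_was_agent = false
    end
  end
  rules
end

# True if no recorded prefix for this agent (or the "*" fallback) matches the path.
def allowed?(rules, agent, path)
  paths = rules[agent.downcase]
  paths = rules['*'] if paths.empty?
  paths.none? { |prefix| path.start_with?(prefix) }
end
```

With the first robots.txt above, `allowed?(rules, "test", "/")` would be true, since the `*` group only disallows `/~site/...` paths, while `allowed?(rules, "OmniExplorer_Bot", "/")` would be false.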