fizx / robots

robots.txt parser

robots non functional at the moment? #2

Closed rb2k closed 14 years ago

rb2k commented 14 years ago

System: ruby-head and 1.9.1, using the newest gem version.
I came across the error over here: http://danielwebb.us/robots.txt

User-agent: *
Disallow: /bot-trap
Disallow: /about/contact
Disallow: /about/resume/daniel_webb-resume.pdf
Disallow: /projects/pd_tech_books/the_boy_electrician.pdf

But the disallowed paths are still reported as allowed:

ruby-head > robots.allowed?("http://danielwebb.us/bot-trap/index.php")
 => true 
ruby-head > robots.allowed?("http://danielwebb.us/bot-trap/")
 => true 
ruby-head > robots.allowed?("http://danielwebb.us/bot-trap")
 => true 
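
For comparison, here is a minimal sketch of the result I would expect, assuming plain prefix matching of the Disallow rules against the URL path. This is not the gem's implementation; `disallowed_path?` and `DISALLOW_RULES` are hypothetical names used only for illustration.

```ruby
require "uri"

# Disallow rules copied from the robots.txt above (User-agent: *).
DISALLOW_RULES = [
  "/bot-trap",
  "/about/contact",
  "/about/resume/daniel_webb-resume.pdf",
  "/projects/pd_tech_books/the_boy_electrician.pdf"
].freeze

# Hypothetical helper, not part of the robots gem: returns true when the
# URL's path starts with any of the Disallow prefixes.
def disallowed_path?(url, rules = DISALLOW_RULES)
  path = URI.parse(url).path
  path = "/" if path.nil? || path.empty? # treat a bare host as the root path
  rules.any? { |rule| path.start_with?(rule) }
end

puts disallowed_path?("http://danielwebb.us/bot-trap/index.php")
puts disallowed_path?("http://danielwebb.us/bot-trap/")
puts disallowed_path?("http://danielwebb.us/bot-trap")
```

With simple prefix matching, all three URLs fall under `Disallow: /bot-trap`, so `allowed?` should return false for each of them.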

What is even more disturbing:

ruby-head > robots.allowed?("http://www.google.de/search")
 => true

Did something break in one of the recent updates?

rb2k commented 14 years ago

Tried with the version before the last commit:

ruby-head > require "./lib/robots.rb"
 => true 
ruby-head > bla = Robots.new("test")
 => #<Robots:0x00000100a60580 @user_agent="test", @parsed={}> 
ruby-head > bla.allowed?("http://www.google.de/search")
 => false 

That version seems to be working.

rb2k commented 14 years ago

Found the bug and fixed it, sent pull request:
http://github.com/rb2k/robots/commits/master

rb2k commented 14 years ago

closing