fizx / robots

robots.txt parser

add URI to timeout message #4

Open rb2k opened 14 years ago

rb2k commented 14 years ago

It would be nice if the "robots.txt request timed out" message said which URI the request actually failed for. This is the method in question:

  def self.get_robots_txt(uri, user_agent)
    begin
      Timeout::timeout(Robots.timeout) do
        io = URI.join(uri.to_s, "/robots.txt").open("User-Agent" => user_agent) rescue nil
      end 
    rescue Timeout::Error
      STDERR.puts "robots.txt request timed out"
    end
  end
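
Something along these lines might do it. This is only a rough sketch, not a patch against the gem: `robots_timeout_message` and `fetch_robots_txt` are made-up names, `limit` stands in for `Robots.timeout`, and the block stands in for the actual HTTP request so the snippet runs without network access.

```ruby
require "timeout"
require "uri"

# Hypothetical helper (not part of the robots gem): builds the warning text
# so that it names the robots.txt URL that timed out.
def robots_timeout_message(robots_uri)
  "robots.txt request timed out: #{robots_uri}"
end

# Sketch of the suggested behaviour. `limit` is an assumed stand-in for
# Robots.timeout; the block is an assumed stand-in for the real request.
def fetch_robots_txt(uri, limit)
  # Resolve the robots.txt URL once so the rescue branch can report it.
  robots_uri = URI.join(uri.to_s, "/robots.txt")
  begin
    Timeout.timeout(limit) { yield robots_uri }
  rescue Timeout::Error
    warn robots_timeout_message(robots_uri)
    nil
  end
end
```

The main point is that the robots.txt URL is computed before the `Timeout` block, so it is still in scope when the error is rescued and can be included in the message.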