Running this on my box works fine, so this is likely an issue with open-uri/Nokogiri; I recommend you open a bug report over there. I also see you're running Ruby 1.8.x; this project has only been tested with 1.9.2.
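To narrow down whether the 403 depends on the URL being requested, one option is to rescue the error and print the exact string that was fetched. This is a minimal sketch, assuming a modern Ruby where `URI.open` exists (on 1.8/1.9 the call would be plain `open`); `fetch_page` and its `opener` parameter are hypothetical names and are not taken from run.rb:

```ruby
require 'open-uri'

# Hypothetical helper, not part of run.rb: fetch a page body and, on an
# HTTP error, report the exact URL that was requested.
# `opener` is any object responding to `open(url)`; it defaults to URI
# (assumption: Ruby >= 2.5, where URI.open is available).
def fetch_page(url, opener = URI)
  opener.open(url).read
rescue OpenURI::HTTPError => e
  # e.message carries the status line, e.g. "403 Forbidden"
  warn "#{e.message} for #{url.inspect}"
  nil
end
```

Printing the URL with `inspect` makes truncation or stray characters visible, which helps when comparing a failing title against one that works.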
Having fixed my other issue, I do
./run.rb Xkcd
for the following output:
http://wikipedia.org/wiki/Xkcd
http://wikipedia.org/wiki/Webcomic
http://wikipedia.org/wiki/Comics
http://wikipedia.org/wiki/Picture_book
http://wikipedia.org/wiki/Heinrich_Hoffmann_
/usr/lib/ruby/1.8/open-uri.rb:277:in `open_http': 403 Forbidden (OpenURI::HTTPError)
        from /usr/lib/ruby/1.8/open-uri.rb:616:in `buffer_open'
        from /usr/lib/ruby/1.8/open-uri.rb:164:in `open_loop'
        from /usr/lib/ruby/1.8/open-uri.rb:162:in `catch'
        from /usr/lib/ruby/1.8/open-uri.rb:162:in `open_loop'
        from /usr/lib/ruby/1.8/open-uri.rb:132:in `open_uri'
        from /usr/lib/ruby/1.8/open-uri.rb:518:in `open'
        from /usr/lib/ruby/1.8/open-uri.rb:30:in `open'
        from ./run.rb:95:in `get_pages_first_link'
        from ./run.rb:116:in `add_crumb'
        from ./run.rb:130:in `crawl'
        from ./run.rb:24:in `initialize'
        from ./run.rb:137:in `new'
        from ./run.rb:137

The last link has parentheses in it, but other links with parentheses (try Mathematics) work fine.
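
Whether the parentheses are actually the cause is not established in this thread. One way to test the hypothesis is to percent-encode the article title before requesting it and check whether the 403 goes away. This is a sketch only; `wiki_url` and the example title are hypothetical and are not taken from run.rb:

```ruby
require 'erb'

# Hypothetical helper: build an article URL with the title percent-encoded,
# so "(" and ")" are sent as %28 and %29 instead of raw characters.
def wiki_url(title)
  "http://wikipedia.org/wiki/#{ERB::Util.url_encode(title)}"
end

puts wiki_url("Some_Title_(example)")
# prints http://wikipedia.org/wiki/Some_Title_%28example%29
```

If the encoded form fails in the same way, the parentheses are probably not the problem and the request itself deserves a closer look.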