reading-am / reading

Pinterest blocks the crawler #401

Open leppert opened 11 years ago

leppert commented 11 years ago

When the crawl gets denied, we should fall back to the data sent from the bookmarklet.

See also: http://stackoverflow.com/questions/15867047/pinterest-api-returning-403-on-ec2-instance

This link, http://pinterest.com/pin/16818198578652743/, throws this error:

Mechanize::ResponseCodeError: 403 => Net::HTTPForbidden for http://pinterest.com/pin/16818198578652743/ -- unhandled response
    from /app/vendor/bundle/ruby/2.0.0/gems/mechanize-2.7.1/lib/mechanize/http/agent.rb:306:in `fetch'
    from /app/vendor/bundle/ruby/2.0.0/gems/mechanize-2.7.1/lib/mechanize.rb:431:in `get'
    from /app/app/models/page.rb:312:in `mech'
    from /app/app/models/page.rb:318:in `remote_resolved_url'
    from /app/app/models/page.rb:331:in `remote_canonical_url'
    from /app/app/models/page.rb:322:in `remote_normalized_url'
    from /app/app/models/page.rb:88:in `find_by_url'
    from /app/app/models/page.rb:99:in `find_or_create_by_url'
    from (irb):1
    from /app/vendor/bundle/ruby/2.0.0/gems/railties-4.0.0/lib/rails/commands/console.rb:90:in `start'
    from /app/vendor/bundle/ruby/2.0.0/gems/railties-4.0.0/lib/rails/commands/console.rb:9:in `start'
    from /app/vendor/bundle/ruby/2.0.0/gems/railties-4.0.0/lib/rails/commands.rb:64:in `<top (required)>'
    from bin/rails:4:in `require'
    from bin/rails:4:in `<main>'
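
A minimal sketch of the proposed fallback, not the app's actual code: when the crawl is refused, use the attributes the bookmarklet already sent instead of letting the error propagate. Everything named here (`CrawlDenied`, `DENIED_CODES`, `page_attributes`, the `fetcher` callable, the shape of the bookmarklet hash) is hypothetical. `CrawlDenied` is a stand-in so the sketch runs without the mechanize gem; in the app, the error to rescue would presumably be the `Mechanize::ResponseCodeError` shown in the trace above.

```ruby
# Hypothetical stand-in for the crawler's HTTP error, so this sketch runs
# without the mechanize gem. It carries the response code as a String.
class CrawlDenied < StandardError
  attr_reader :response_code

  def initialize(response_code)
    @response_code = response_code
    super("#{response_code} -- unhandled response")
  end
end

# Response codes treated as "the site refused the crawler" (an assumption).
DENIED_CODES = %w[401 403].freeze

# url:              the page the user bookmarked
# bookmarklet_data: Hash of attributes sent by the bookmarklet (assumed shape)
# fetcher:          any callable that takes a URL and returns a Hash of
#                   crawled attributes, raising on HTTP errors
def page_attributes(url, bookmarklet_data, fetcher)
  fetcher.call(url)
rescue StandardError => e
  # Only swallow errors that look like a denied crawl; re-raise the rest.
  denied = e.respond_to?(:response_code) &&
           DENIED_CODES.include?(e.response_code.to_s)
  raise unless denied

  # Crawl denied: fall back to what the user's browser already saw.
  { url: url }.merge(bookmarklet_data)
end

# Example: a fetcher that always gets a 403, as Pinterest returns here.
denied_fetcher = ->(_url) { raise CrawlDenied.new("403") }
attrs = page_attributes(
  "http://pinterest.com/pin/16818198578652743/",
  { title: "Title sent by the bookmarklet" },
  denied_fetcher
)
p attrs
```

Passing the fetcher in as a callable keeps the fallback logic testable without network access. Other errors (a 500, a timeout) still propagate, so only a refused crawl triggers the fallback.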