everypolitician / scraped_page_archive

Create an archive of HTML pages scraped by a Ruby scraper
MIT License

Error when using GitStorage #48

Closed: chrismytton closed this issue 8 years ago

chrismytton commented 8 years ago

Problem

I'm seeing the following error when trying to use this gem:

/app/vendor/bundle/ruby/2.0.0/bundler/gems/scraped_page_archive-9d8a6347d122/lib/scraped_page_archive/git_storage.rb:32:in `save': undefined local variable or method `ret' for #<ScrapedPageArchive::GitStorage:0x007fa0f35c4450> (NameError)
    from /app/vendor/bundle/ruby/2.0.0/bundler/gems/scraped_page_archive-9d8a6347d122/lib/scraped_page_archive.rb:37:in `record'
    from /app/vendor/bundle/ruby/2.0.0/bundler/gems/scraped_page_archive-9d8a6347d122/lib/scraped_page_archive.rb:18:in `record'
    from /app/vendor/bundle/ruby/2.0.0/bundler/gems/scraped_page_archive-9d8a6347d122/lib/scraped_page_archive/capybara.rb:57:in `command'
    from /app/vendor/bundle/ruby/2.0.0/gems/poltergeist-1.9.0/lib/capybara/poltergeist/browser.rb:75:in `visible_text'
    from /app/vendor/bundle/ruby/2.0.0/gems/poltergeist-1.9.0/lib/capybara/poltergeist/node.rb:17:in `command'
    from /app/vendor/bundle/ruby/2.0.0/gems/poltergeist-1.9.0/lib/capybara/poltergeist/node.rb:50:in `visible_text'
    from /app/vendor/bundle/ruby/2.0.0/gems/capybara-2.6.2/lib/capybara/node/element.rb:61:in `block in text'
    from /app/vendor/bundle/ruby/2.0.0/gems/capybara-2.6.2/lib/capybara/node/base.rb:84:in `synchronize'
    from /app/vendor/bundle/ruby/2.0.0/gems/capybara-2.6.2/lib/capybara/node/element.rb:57:in `text'
    from scraper.rb:145:in `block in scrape_people'
    from scraper.rb:144:in `scrape_people'
    from scraper.rb:289:in `<main>'

That error came from running the scraper at https://github.com/everypolitician-scrapers/spain_congreso_es/tree/750cf4576626570b1b86156e866e65c0ffefc5f7.

Proposed solution

It looks like GitStorage#save (git_storage.rb:32) references a stray ret variable that is never defined, introduced in https://github.com/everypolitician/scraped_page_archive/pull/42. So the fix is to remove that reference (and possibly add a regression test!).
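
For illustration, here is a minimal sketch of the failure mode and the kind of check a regression test could make. The classes BrokenStorage and FixedStorage and the helper raises_name_error? are hypothetical stand-ins written for this example; they are not the gem's actual GitStorage code, and the real save method may take different arguments.

```ruby
# Hypothetical stand-in for a storage class with a leftover `ret` reference.
# This is NOT the gem's real GitStorage implementation.
class BrokenStorage
  def save(path, contents)
    store[path] = contents
    ret # nothing named `ret` is defined in this scope, so Ruby raises NameError
  end

  def store
    @store ||= {}
  end
end

# The same class with the stray reference removed.
class FixedStorage
  def save(path, contents)
    store[path] = contents
  end

  def store
    @store ||= {}
  end
end

# Returns true if calling save raises NameError, false otherwise.
def raises_name_error?(storage)
  storage.save('page.html', '<html></html>')
  false
rescue NameError
  true
end

puts raises_name_error?(BrokenStorage.new)
puts raises_name_error?(FixedStorage.new)
```

A regression test for the gem could follow the same shape: call save on a GitStorage instance pointed at a throwaway repository and assert that no NameError is raised.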