edgi-govdata-archiving / versionista-outputter

ARCHIVED: A Ruby script that scrapes Versionista's web interface to generate a CSV summarizing which websites and pages have had recent changes.

Outputter failed for 96 hour run #7

Closed: mayaad closed this issue 7 years ago

mayaad commented 7 years ago

I ran the script to pull down the past 96h of changes and got the following error. I am able to run the script successfully for the past 1h of changes (verified against Versionista).

/Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/finders.rb:214:in `block in all': expected to find frame nil at least 1 time but there were no matches (Capybara::ExpectationNotMet) 
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/base.rb:85:in `synchronize'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/finders.rb:212:in `all'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/session.rb:780:in `block (2 levels) in <class:Session>'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/session.rb:400:in `block in within_frame' 
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/session.rb:310:in `within'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/session.rb:390:in `within_frame'
    from capybara_script.rb:191:in `comparison_diff' 
    from capybara_script.rb:145:in `block in scrape_archived_page_data'
    from capybara_script.rb:136:in `map'
    from capybara_script.rb:136:in `scrape_archived_page_data'
    from capybara_script.rb:50:in `block in scrape_each_page_version'
    from capybara_script.rb:48:in `map'
    from capybara_script.rb:48:in `scrape_each_page_version'
    from capybara_script.rb:234:in `<main>'

mayaad commented 7 years ago

I tried a 10h run, and this completed successfully in 21 minutes.

mayaad commented 7 years ago

A 50h run just failed with the following message:

/Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/finders.rb:44:in `block in find': Unable to find option "source: changes only" (Capybara::ElementNotFound)
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/base.rb:85:in `synchronize'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/finders.rb:33:in `find'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/node/actions.rb:186:in `select'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/session.rb:780:in `block (2 levels) in <class:Session>'
    from capybara_script.rb:193:in `block in comparison_diff'
    from /Users/mayaanjurdietrich/.rvm/gems/ruby-2.2.3/gems/capybara-2.12.0/lib/capybara/session.rb:409:in `within_frame'
    from capybara_script.rb:191:in `comparison_diff'
    from capybara_script.rb:145:in `block in scrape_archived_page_data'
    from capybara_script.rb:136:in `map'
    from capybara_script.rb:136:in `scrape_archived_page_data'
    from capybara_script.rb:50:in `block in scrape_each_page_version'
    from capybara_script.rb:48:in `map'
    from capybara_script.rb:48:in `scrape_each_page_version'

KrishnaKulkarni commented 7 years ago

@mayaad @trinberg Pushed up some work that should fix this

mayaad commented 7 years ago

Looks like it's working! Running well so far, and I'll do some manual checking when it's done to verify.

New output from the script shows that it's using the rescue:

Visiting the comparison url: https://versionista.com/74286/6216597/9972729:0/
-- Successful visit!
__Error getting diff from: https://versionista.com/74286/6216597/9972729:0/
--------------------------------------------------------------------------------
Visiting https://versionista.com/74286/6215867/
-- Successful visit!

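
For readers following along, here is a rough sketch of what a rescue like the one visible in this output might look like. It is not the code that was actually pushed. The helper name `diff_or_nil`, the placeholder URL `https://example.com/ok/`, and the two error classes are assumptions made for illustration; the classes are local stand-ins for `Capybara::ElementNotFound` and `Capybara::ExpectationNotMet` so the snippet runs without a browser session.

```ruby
# Stand-ins for Capybara::ElementNotFound and Capybara::ExpectationNotMet.
# Assumption: in Capybara 2.x, ExpectationNotMet subclasses ElementNotFound,
# so rescuing the parent covers both failures reported in this issue.
class ElementNotFound < StandardError; end
class ExpectationNotMet < ElementNotFound; end

# Hypothetical helper: runs the scraping block for one comparison URL and
# returns its result, or logs the failure and returns nil so that one bad
# page does not abort a long run.
def diff_or_nil(url, log: $stdout)
  log.puts "Visiting the comparison url: #{url}"
  yield url
rescue ElementNotFound
  log.puts "__Error getting diff from: #{url}"
  nil
end

urls = [
  "https://versionista.com/74286/6216597/9972729:0/", # URL from the log above
  "https://example.com/ok/",                          # placeholder, not a real page
]

diffs = urls.map do |url|
  diff_or_nil(url) do |u|
    # Simulate the missing-frame failure for the first URL only.
    raise ExpectationNotMet, "expected to find frame nil" if u.include?("9972729")
    "diff for #{u}"
  end
end
```

With this shape, a page whose diff frame or "source: changes only" option is missing yields `nil` in `diffs` instead of raising out of the enclosing `map`, which would be consistent with the 96h run now completing.
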
mayaad commented 7 years ago

Ran successfully for both accounts, and the sheets I've looked at have checked out so far. Thanks!

titaniumbones commented 7 years ago

Now you just have to teach everyone else how to use it!

On February 19, 2017 12:11:49 PM EST, mayaad notifications@github.com wrote:

Ran successfully for both accounts, and the sheets I've looked at have checked out so far. Thanks!
