bbc / wraith

Wraith — A responsive screenshot comparison tool
http://bbc-news.github.io/wraith/
Apache License 2.0

Loading over 500 urls within paths section from a text file #445

Open rayferns opened 8 years ago

rayferns commented 8 years ago

I have a config file to test a website with JavaScript and cookies disabled. The config file has 565 paths to screengrab. Here are the challenges I'm facing:

  1. Rather than hardcoding all the URL paths to compare, I would prefer to have them read from a CSV or text file.
  2. Capturing screenshots of the 565 URLs takes over 5 hours. Is there a way to speed this up? Also, PhantomJS freezes after the screengrabs and the compare process does not start.

Would appreciate your answers to the above questions.



Issue checklist:

C:\VisualRegression>wraith capture configs\vw08_test_0107.yaml
DEBUG: #################################################
DEBUG:   Command run:        capture configs\vw08_test_0107.yaml
DEBUG:   Wraith version:     3.2.0
DEBUG:   Ruby version:       ruby 2.3.0p0 (2015-12-25 revision 53290) [x64-mingw32]

DEBUG:   ImageMagick:        Version: ImageMagick 7.0.2-1 Q16 x64 2016-06-23 http://www.imagemagick.org

DEBUG:   PhantomJS version:  1.9.6

DEBUG:   CasperJS version:   1.1.1

DEBUG: #################################################
DEBUG:
Config validated. No serious issues found.
no paths defined in config, crawling from site root
C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/spider.rb:64:in `spider': undefined local variable or method `wraith' for #<Wraith::Crawler:0x00000003e48df8> (NameError)
Did you mean?  @wraith
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/spider.rb:36:in `determine_paths'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/spider.rb:24:in `check_for_paths'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/cli.rb:36:in `check_for_paths'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/cli.rb:134:in `block in capture'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/cli.rb:28:in `within_acceptable_limits'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/lib/wraith/cli.rb:131:in `capture'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor/command.rb:27:in `run'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor/invocation.rb:126:in `invoke_command'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor.rb:359:in `dispatch'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor/base.rb:440:in `start'
        from C:/Ruby23-x64/lib/ruby/gems/2.3.0/gems/wraith-3.2.0/bin/wraith:5:in `<top (required)>'
        from C:/Ruby23-x64/bin/wraith:23:in `load'
        from C:/Ruby23-x64/bin/wraith:23:in `<main>'
Config file:
#debug mode
verbose: true
#Headless browser option
browser:
  phantomjs: "phantomjs"
  # slimerjs: "slimerjs"

before_capture: 'javascript/disable_cookies.js'

#If you want to have multiple snapping files, set the file name here
#snap_file: "javascript/snap.js"

# Type the name of the directory that shots will be stored in
directory: 'shots'

#As per https://github.com/BBC-News/wraith/issues/308 - to read in https urls
phantomjs_options: --ssl-protocol=tlsv1 --ignore-ssl-errors=yes

# Add only 2 domains, key will act as a label
domains:
  Live: "http://www.volkswagen.co.uk"
  VW08: "http://origin-vw08.vwcloud.co.uk"

#Type screen widths below, here are a couple of examples
screen_widths:
    - 600x800
    - 1024x600
    - 1280x960
    - 2560x1600
    - 768x1024
    - 1280x800
    - 1080x1920
    - 320x533
    - 750x1334

#Type page URL paths below, here are a couple of examples
paths:
    # Home: /
    # Owners-My: /owners/my
imports: 'Files/test1.txt'

#Amount of fuzz ImageMagick will use
fuzz: '20%'

#Set the number of days to keep the site spider file
spider_days:
  - 10

#Choose how results are displayed, by default alphanumeric. Different screen widths are always grouped.
#alphanumeric - all paths (with, and without, a difference) are shown, sorted by path
#diffs_first - all paths (with, and without, a difference) are shown, sorted by difference size (largest first)
#diffs_only - only paths with a difference are shown, sorted by difference size (largest first)
mode: diffs_first

threshold: 5

# Examples: 'basic_template' (default), 'slideshow_template'
# gallery:
  # template: 'slideshow_template'
    # thumb_width:  200
    # thumb_height: 200
----------------------------
Text file has following input:
 Home: /
 Owners-My: /owners/my

wjanoti commented 8 years ago

Hi, I also have a config file with a large number of URLs to test, and one thing that really sped up the process was putting resize_or_reload: reload in the config file.

balaspace commented 8 years ago

@wjanoti how do you import a file into the config.yaml?

wjanoti commented 8 years ago

Hi @balaspace, I created a simple script to update the config.yaml file with the URLs I needed.
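
The script itself is not posted in this thread. As a rough illustration only, here is a minimal Ruby sketch of what such a script might look like, assuming the text file holds one `Label: /path` entry per line as in the example earlier in the issue. The names `read_paths` and `update_config` are hypothetical and not part of Wraith.

```ruby
require "yaml"

# Parse "Label: /path" lines into a Hash of label => path.
# Blank lines and lines starting with '#' are skipped.
def read_paths(text)
  text.each_line.with_object({}) do |line, paths|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    label, path = line.split(":", 2).map(&:strip)
    next if path.nil? || path.empty?

    paths[label] = path # labels stay Strings, even ones like "404"
  end
end

# Hypothetical helper: rewrite the "paths" section of a Wraith config file.
# Note: dumping YAML this way drops any comments in the original file.
def update_config(config_file, paths_file)
  config = YAML.load_file(config_file) || {}
  config["paths"] = read_paths(File.read(paths_file))
  File.write(config_file, config.to_yaml)
  config
end
```

With the files from this issue it could be called as `update_config("configs/vw08_test_0107.yaml", "Files/test1.txt")` before running `wraith capture`. Because the config is re-serialized, it may be safer to run it on a copy of the config first.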

balaspace commented 8 years ago

cheers, thanks

mramitanand commented 7 years ago

I am doing a capture with 1000 URLs and I get the following error at the end, when it is trying to create the gallery. If I use a small number of URLs, like 5-10, it works fine, so I don't think it's my Ruby version. Does anybody have any ideas? Thanks

GENERATING THUMBNAILS
GENERATING GALLERY
/usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:71:in `get_path': undefined method `[]' for nil:NilClass (NoMethodError)
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:65:in `figure_out_url'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:53:in `matcher'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:40:in `block (2 levels) in match'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:38:in `foreach'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:38:in `block in match'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:35:in `each'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:35:in `match'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:31:in `parse_directories'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/gallery.rb:142:in `generate_gallery'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/cli.rb:114:in `block in generate_gallery'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/helpers/utilities.rb:4:in `within_acceptable_limits'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/cli.rb:111:in `generate_gallery'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/cli.rb:128:in `block in capture'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/helpers/utilities.rb:4:in `within_acceptable_limits'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/lib/wraith/cli.rb:120:in `capture'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/thor-0.19.4/lib/thor/command.rb:27:in `run'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/thor-0.19.4/lib/thor/invocation.rb:126:in `invoke_command'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/thor-0.19.4/lib/thor.rb:369:in `dispatch'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/thor-0.19.4/lib/thor/base.rb:444:in `start'
        from /usr/local/rvm/gems/ruby-2.3.3/gems/wraith-4.0.1/bin/wraith:5:in `<top (required)>'
        from /usr/local/rvm/gems/ruby-2.3.3/bin/wraith:22:in `load'
        from /usr/local/rvm/gems/ruby-2.3.3/bin/wraith:22:in `<main>'
        from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `eval'
        from /usr/local/rvm/gems/ruby-2.3.3/bin/ruby_executable_hooks:15:in `<main>'

mramitanand commented 7 years ago

Actually, I discovered what was wrong: in my paths section I had a few entries like 404: 404.html. For some reason, it looks like it won't accept numeric labels.
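
One plausible explanation, which nobody in this thread has confirmed against the Wraith source, is that Ruby's YAML parser reads an unquoted `404` key as an Integer rather than a String, so code that expects String labels may fail to match it. The short sketch below only demonstrates the parsing difference; quoting the label is an assumed workaround, not a documented Wraith fix.

```ruby
require "yaml"

# Unquoted numeric label: the key is parsed as an Integer.
unquoted = YAML.load("paths:\n  404: 404.html\n")

# Quoted label: the key stays a String.
quoted = YAML.load("paths:\n  \"404\": 404.html\n")

p unquoted["paths"].keys.first.class
p quoted["paths"].keys.first.class
```

If this is indeed the cause, writing the entry as `"404": 404.html`, or using a non-numeric label such as `not-found`, would likely avoid the error.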