bbc / wraith

Wraith — A responsive screenshot comparison tool
http://bbc-news.github.io/wraith/
Apache License 2.0

unable to spider - fails on imported config check #498

Closed elmofromok closed 7 years ago

elmofromok commented 7 years ago

I am trying to run wraith spider and have it build my paths file, but it fails because it cannot find the spider_paths.yaml file.

I see this error:

ERROR: unable to find referenced imported config "spider_paths.yaml"

It should create this file if it does not exist, correct?


Issue checklist:

>wraith info                                                                    299ms  Thu Dec 15 09:51:42 2016
DEBUG: #################################################
DEBUG:   Command run:        info
DEBUG:   Wraith version:     4.0.0
DEBUG:   Ruby version:       ruby 2.0.0p648 (2015-12-16 revision 53162) [universal.x86_64-darwin16]
DEBUG:   ImageMagick:        Version: ImageMagick 6.9.6-6 Q16 x86_64 2016-11-25 http://www.imagemagick.org
DEBUG:   PhantomJS version:  2.1.1
DEBUG:   CasperJS version:   CasperJS not installed
DEBUG: #################################################
>wraith spider capture.yaml                                                 365ms  Thu Dec 15 09:50:08 2016
DEBUG: #################################################
DEBUG:   Command run:        spider capture.yaml
DEBUG:   Wraith version:     4.0.0
DEBUG:   Ruby version:       ruby 2.0.0p648 (2015-12-16 revision 53162) [universal.x86_64-darwin16]
DEBUG:   ImageMagick:        Version: ImageMagick 6.9.6-6 Q16 x86_64 2016-11-25 http://www.imagemagick.org
DEBUG:   PhantomJS version:  2.1.1
DEBUG:   CasperJS version:   CasperJS not installed
DEBUG: #################################################
Config validated. No serious issues found.
ERROR: unable to find referenced imported config "spider_paths.yaml"
##############################################################
##############################################################
# This is an example configuration provided by Wraith.
# Feel free to amend for your own requirements.
# ---
# This particular config is intended to demonstrate how
# to use Wraith in 'capture' mode, which is best suited to
# comparing a test and live version of the same website.
#
# `wraith capture capture.yaml`
#
##############################################################
##############################################################

imports: "spider_paths.yaml"

# (required) The engine to run Wraith with. Examples: 'phantomjs', 'casperjs', 'slimerjs'
browser: "phantomjs"

# (required) The domains to take screenshots of.
domains:
  Local:  "http://www.chadhender.carlislefsp.com"
  Production:      "https://www.carlislefsp.com"

# (required) The paths to capture. All paths should exist for both of the domains specified above.
# paths:
#   home:     /
#   product-page:    /cash-and-carry/san-stackable-tumbler-shrink-wrap-packs/5506-807

# (required) Screen widths (and optional height) to resize the browser to before taking the screenshot.
screen_widths:
  - 320
  - 600x768
  - 768
  - 1024
  - 1280

# (optional) JavaScript file to execute before taking screenshot of every path. Default: nil
# before_capture: 'javascript/disable_javascript--phantom.js'

# (required) The directory that your screenshots will be stored in
directory: 'shots'

# (required) Amount of fuzz ImageMagick will use when comparing images. A higher fuzz makes the comparison less strict.
fuzz: '20%'

# (optional) The maximum acceptable level of difference (in %) between two images before Wraith reports a failure. Default: 0
threshold: 5

# (optional) Specify the template (and generated thumbnail sizes) for the gallery output.
gallery:
  template: 'slideshow_template' # Examples: 'basic_template' (default), 'slideshow_template'
  thumb_width:  200
  thumb_height: 200

# (optional) Choose which results are displayed in the gallery, and in what order. Default: alphanumeric
# Options:
#   alphanumeric - all paths (with or without a difference) are shown, sorted by path
#   diffs_first - all paths (with or without a difference) are shown, sorted by difference size (largest first)
#   diffs_only - only paths with a difference are shown, sorted by difference size (largest first)
# Note: different screen widths are always grouped together.
mode: diffs_first

verbose: true
mramitanand commented 7 years ago

The exact same error is happening for me, and I am only running against the current production domain. I tried both the capture.yaml and spider.yaml config files, with the same result.

gbeveridge commented 7 years ago

I am having this same issue with the default spider.yaml config file.

DEBUG: #################################################
DEBUG:   Command run:        spider configs/spider.yaml
DEBUG:   Wraith version:     4.0.0
DEBUG:   Ruby version:       ruby 2.3.3p222 (2016-11-21 revision 56859) [x86_64-darwin16]
DEBUG:   ImageMagick:        Version: ImageMagick 6.9.6-8 Q16 x86_64 2016-12-12 http://www.imagemagick.org
DEBUG:   PhantomJS version:  2.1.1
DEBUG:   CasperJS version:   1.1.2
DEBUG: #################################################
pcambra commented 7 years ago

Same here

wraith spider configs/spider.yaml 
DEBUG: #################################################
DEBUG:   Command run:        spider configs/spider.yaml
DEBUG:   Wraith version:     4.0.0
DEBUG:   Ruby version:       ruby 2.0.0p648 (2015-12-16 revision 53162) [universal.x86_64-darwin15]
DEBUG:   ImageMagick:        Version: ImageMagick 6.9.5-4 Q16 x86_64 2016-07-30 http://www.imagemagick.org
DEBUG:   PhantomJS version:  2.1.1
DEBUG:   CasperJS version:   CasperJS not installed
DEBUG: #################################################
Config validated. No serious issues found.
ERROR: unable to find referenced imported config "spider_paths.yaml"

It would appear that this commit was intended to fix this very same issue: https://github.com/BBC-News/wraith/commit/a8c968a830d15f60e0044819266700f18fb42a20

Adding a spider_paths.yaml file in the configs folder with the following content solves the issue:

paths:
  home: /
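
The workaround above can be scripted so the placeholder file is created before wraith spider runs. This is only a minimal sketch: the helper name ensure_spider_paths and its default filename are hypothetical and not part of Wraith, and it assumes the imported file is looked up in the same folder as the config.

```ruby
require "yaml"
require "fileutils"

# Hypothetical helper (not part of Wraith): write a placeholder paths file
# so the `imports:` check passes before `wraith spider` is run.
def ensure_spider_paths(config_dir, filename = "spider_paths.yaml")
  path = File.join(config_dir, filename)
  return path if File.exist?(path) # never overwrite an existing paths file

  FileUtils.mkdir_p(config_dir)
  # Same content as the manual workaround: a single "home" path.
  File.write(path, { "paths" => { "home" => "/" } }.to_yaml)
  path
end

# Usage: ensure_spider_paths("configs"), then run `wraith spider configs/spider.yaml`
```

The helper returns early when the file already exists, so it never clobbers a paths file written by an earlier spider run.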
preda-bogdan commented 7 years ago

Thank you, @pcambra. It solved the issue for me.

mramitanand commented 7 years ago

It fails for me with the following:

wraith spider spider.yaml

DEBUG: #################################################
DEBUG:   Command run:        spider spider.yaml
DEBUG:   Wraith version:     4.0.0
DEBUG:   Ruby version:       ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
DEBUG:   ImageMagick:        Version: ImageMagick 6.7.7-10 2016-11-29 Q16 http://www.imagemagick.org
DEBUG:   PhantomJS version:  1.9.0
DEBUG:   CasperJS version:   CasperJS not installed
DEBUG: #################################################
Config validated. No serious issues found.
Crawling https://www.fdic.gov
/var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `=~': type mismatch: String given (TypeError)
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `block in skip_link?'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `each'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `any?'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:298:in `skip_link?'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:256:in `visit_link?'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `block in run'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `delete_if'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:151:in `run'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:92:in `block in crawl'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:83:in `initialize'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:90:in `new'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:90:in `crawl'
    from /var/lib/gems/1.9.1/gems/anemone-0.7.2/lib/anemone/core.rb:18:in `crawl'
    from /var/lib/gems/1.9.1/gems/wraith-4.0.0/lib/wraith/spider.rb:25:in `crawl'
    from /var/lib/gems/1.9.1/gems/wraith-4.0.0/lib/wraith/cli.rb:45:in `block in spider'
    from /var/lib/gems/1.9.1/gems/wraith-4.0.0/lib/wraith/helpers/utilities.rb:4:in `within_acceptable_limits'
    from /var/lib/gems/1.9.1/gems/wraith-4.0.0/lib/wraith/cli.rb:42:in `spider'
    from /var/lib/gems/1.9.1/gems/thor-0.19.4/lib/thor/command.rb:27:in `run'
    from /var/lib/gems/1.9.1/gems/thor-0.19.4/lib/thor/invocation.rb:126:in `invoke_command'
    from /var/lib/gems/1.9.1/gems/thor-0.19.4/lib/thor.rb:369:in `dispatch'
    from /var/lib/gems/1.9.1/gems/thor-0.19.4/lib/thor/base.rb:444:in `start'
    from /var/lib/gems/1.9.1/gems/wraith-4.0.0/bin/wraith:5:in `<top (required)>'
    from /usr/local/bin/wraith:23:in `load'
    from /usr/local/bin/wraith:23:in `<main>'

mramitanand commented 7 years ago

I commented this section out of spider.yaml and now it works. For future reference, I noticed that issue #401 describes a similar problem:

spider_skips:
  - /foo/bar.html # Matches /foo/bar.html explicitly
  - !ruby/regexp /^\/baz\// # Matches any URLs that start with /baz
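
One plausible reading of the trace above is that a plain string entry in spider_skips reaches a `=~` comparison against the link path, which Ruby rejects when both sides are strings. The sketch below reproduces that behaviour in isolation. Both skip_link? and normalize_skips are hypothetical stand-ins written for illustration, not Anemone's or Wraith's actual code, and converting strings to exact-match regexps is an assumed remedy rather than the fix that shipped.

```ruby
# Hypothetical stand-in for the crawler's skip check, modelled on the
# `any?` / `=~` frames in the trace; not Anemone's actual implementation.
def skip_link?(path, patterns)
  patterns.any? { |pattern| path =~ pattern }
end

# Assumed remedy: turn plain strings into exact-match regexps and
# leave entries that are already Regexp objects untouched.
def normalize_skips(skips)
  skips.map do |skip|
    skip.is_a?(Regexp) ? skip : /\A#{Regexp.escape(skip)}\z/
  end
end

skips = ["/foo/bar.html", %r{^/baz/}]

begin
  # String =~ String raises TypeError in Ruby.
  skip_link?("/foo/bar.html", skips)
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

safe = normalize_skips(skips)
puts skip_link?("/foo/bar.html", safe) # explicit path is skipped
puts skip_link?("/baz/page", safe)     # regexp entry still matches
puts skip_link?("/other", safe)        # unrelated path is not skipped
```

Escaping the string before building the regexp keeps the "matches explicitly" meaning from the config comment, so the dot in bar.html does not act as a wildcard.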

ChrisBAshton commented 7 years ago

This should now be fixed in v4.0.1.

micheal-cooper commented 7 years ago

@ChrisBAshton, I just installed version 4.0.1 using brew, but to get spider working, I had to add config/spider_paths.yml (thank you, @pcambra) and comment out the spider_skips (thank you, @mramitanand), so this issue does not seem to be fixed.