simonw / shot-scraper

A command-line utility for taking automated screenshots of websites
https://shot-scraper.datasette.io
Apache License 2.0
1.72k stars 78 forks

Ability to pass CLI options to `shot-scraper multi` #98

Open simonw opened 2 years ago

simonw commented 2 years ago

https://discord.com/channels/823971286308356157/1034259126076833873/1034303037914746942 @jefftriplett

I was curious about the multi subcommand and whether it made sense to expose the screenshot API options (width, height, quality) as CLI options. It's not a deal breaker, but it felt like it should be a pass-through, since I might have a file of 50 URLs and want them all to use the same options. It's slightly more maintainable to keep track of one list of settings, and it's nice because I can grab a list for the full-width images and then do another for a different size. But it's doable without.

I like the idea that you can pass options to shot-scraper multi, which will be used for YAML items that don't override them.
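A minimal sketch of how that fallback could work, assuming each YAML item has already been loaded into a Python dict. The helper name `apply_cli_defaults` and the sample data are hypothetical, not shot-scraper's actual implementation:

```python
# Hypothetical helper: illustrates the proposed behaviour only,
# it is not taken from shot-scraper's source code.
def apply_cli_defaults(shots, cli_defaults):
    """Return new shot dicts where CLI values fill in keys the YAML item omits."""
    # Ignore CLI options the user did not pass (represented here as None).
    defaults = {key: value for key, value in cli_defaults.items() if value is not None}
    # Later keys win in a dict merge, so the YAML item overrides the CLI default.
    return [{**defaults, **shot} for shot in shots]


# Sample data standing in for a parsed YAML file (assumed shape).
shots = [
    {"url": "https://example.com/", "output": "example.png"},
    {"url": "https://example.org/", "output": "narrow.png", "width": 400},
]
# Values as they might arrive from the command line; quality was not passed.
cli_defaults = {"width": 1280, "height": 800, "quality": None}

for shot in apply_cli_defaults(shots, cli_defaults):
    print(shot)
```

The first item picks up both the width and the height from the command line, while the second keeps its own width of 400 and only inherits the height.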

simonw commented 2 years ago

Current multi options:

Options:
  -a, --auth FILENAME             Path to JSON authentication context file
  --retina                        Use device scale factor of 2
  --timeout INTEGER               Wait this many milliseconds before failing
  --fail-on-error                 Fail noisily on error
  -n, --no-clobber                Skip images that already exist
  -o, --output TEXT               Just take shots matching these output files
  -b, --browser [chromium|firefox|webkit|chrome|chrome-beta]
                                  Which browser to use
  --user-agent TEXT               User-Agent header to use
  --reduced-motion                Emulate 'prefers-reduced-motion' media feature
  --help                          Show this message and exit.

simonw commented 2 years ago

Here are the shot options:

Options:
  -a, --auth FILENAME             Path to JSON authentication context file
  -w, --width INTEGER             Width of browser window, defaults to 1280
  -h, --height INTEGER            Height of browser window and shot - defaults
                                  to the full height of the page
  -o, --output FILE
  -s, --selector TEXT             Take shot of first element matching this CSS
                                  selector
  --selector-all TEXT             Take shot of all elements matching this CSS
                                  selector
  --js-selector TEXT              Take shot of first element matching this JS
                                  (el) expression
  --js-selector-all TEXT          Take shot of all elements matching this JS
                                  (el) expression
  -p, --padding INTEGER           When using selectors, add this much padding in
                                  pixels
  -j, --javascript TEXT           Execute this JS prior to taking the shot
  --retina                        Use device scale factor of 2
  --quality INTEGER               Save as JPEG with this quality, e.g. 80
  --wait INTEGER                  Wait this many milliseconds before taking the
                                  screenshot
  --wait-for TEXT                 Wait until this JS expression returns true
  --timeout INTEGER               Wait this many milliseconds before failing
  -i, --interactive               Interact with the page in a browser before
                                  taking the shot
  --devtools                      Interact mode with developer tools
  --log-requests FILENAME         Log details of all requests to this file
  -b, --browser [chromium|firefox|webkit|chrome|chrome-beta]
                                  Which browser to use
  --user-agent TEXT               User-Agent header to use
  --reduced-motion                Emulate 'prefers-reduced-motion' media feature
  --help                          Show this message and exit.

Most of those would make sense as options to multi as well.
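To see which options are candidates, here is a quick sketch that compares the long option names from the two help listings above. The names are copied by hand from those listings, so this reflects the help output quoted in this thread rather than any later release:

```python
# Long option names copied from the "shot-scraper multi" help output above.
MULTI_OPTIONS = {
    "auth", "retina", "timeout", "fail-on-error", "no-clobber", "output",
    "browser", "user-agent", "reduced-motion", "help",
}

# Long option names copied from the "shot-scraper shot" help output above.
SHOT_OPTIONS = {
    "auth", "width", "height", "output", "selector", "selector-all",
    "js-selector", "js-selector-all", "padding", "javascript", "retina",
    "quality", "wait", "wait-for", "timeout", "interactive", "devtools",
    "log-requests", "browser", "user-agent", "reduced-motion", "help",
}

# Options that shot accepts but multi does not: the candidates for pass-through.
missing_from_multi = sorted(SHOT_OPTIONS - MULTI_OPTIONS)
# Options that only make sense for multi.
multi_only = sorted(MULTI_OPTIONS - SHOT_OPTIONS)

print("Only in shot:", ", ".join(missing_from_multi))
print("Only in multi:", ", ".join(multi_only))
```

Note that --output appears in both sets but means different things: for shot it names the output file, while for multi it filters which shots to take.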

emteelb commented 1 year ago

Just discovered this tool and was going to create a new issue (still can if you would like) but then saw this issue.

Wanted to make a case for adding the ability to use --interactive with the multi command. This would be particularly helpful to people like me whose heads begin to spin a bit when confronted with JavaScript and CSS in a web-based GUI, where the screen I want to grab might be a drop-down menu inside a dialog box that pops up after a button click.

For this, the --interactive option works perfectly for a single screen grab. I'm just not savvy enough yet to know what to put inside selectors_all, js_selector, et al., or even when to use, e.g., javascript or js_selector, in a multi command YAML file.

What would be a really convenient feature for cave people like me is if there could be an interactive label (value?) that could be put inside a multi command YAML file for a particular URL or URLs. I could then use comment lines inside of the YAML file to indicate what actions to take after the interactive browser window pops up. Maybe not the way you might have intended the tool to be used... but could bridge a gap for less sophisticated users like me.
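As a rough sketch of that idea, assuming the YAML file has been loaded into a list of dicts and that an item could carry an `interactive: true` key (a hypothetical key, which multi does not support today), the shots could be split into automated and hands-on groups like this:

```python
# Hypothetical: "interactive" is the key proposed in this comment,
# not an existing shot-scraper multi setting.
def split_by_interactive(shots):
    """Separate shots that need a human in the browser from fully automated ones."""
    automated = [shot for shot in shots if not shot.get("interactive")]
    manual = [shot for shot in shots if shot.get("interactive")]
    return automated, manual


# Sample data standing in for a parsed YAML file (assumed shape).
shots = [
    {"url": "https://example.com/", "output": "home.png"},
    {"url": "https://example.com/app", "output": "dialog.png", "interactive": True},
]

automated, manual = split_by_interactive(shots)
print("Automated:", [shot["output"] for shot in automated])
print("Needs interaction:", [shot["output"] for shot in manual])
```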

Anyway, thanks for writing and open sourcing this very exciting and already useful tool!