simonw / shot-scraper

A command-line utility for taking automated screenshots of websites
https://shot-scraper.datasette.io
Apache License 2.0

Add option to bypass Content-Security-Policy when executing Javascript #116

Closed: sesh closed this 11 months ago

sesh commented 1 year ago

Refs: #114

Adds a --bypass-csp option to the commands that allow JavaScript to be executed.
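For readers unfamiliar with the mechanism, here is a minimal Python sketch of what such a flag could do underneath. It assumes the flag is passed through to Playwright's bypass_csp browser-context option, which this description does not spell out. The helper names context_kwargs and evaluate_js are hypothetical and are not part of shot-scraper.

```python
def context_kwargs(bypass_csp: bool = False) -> dict:
    """Build keyword arguments for Playwright's browser.new_context()."""
    # Only set the option when requested, so the default context is unchanged.
    return {"bypass_csp": True} if bypass_csp else {}


def evaluate_js(url: str, javascript: str, bypass_csp: bool = False):
    """Open url, evaluate javascript in the page, and return the result."""
    # Imported lazily so context_kwargs() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            # Assumption: the CLI flag corresponds to this context option.
            context = browser.new_context(**context_kwargs(bypass_csp))
            page = context.new_page()
            page.goto(url)
            # A string that evaluates to a function is called by Playwright,
            # and a returned promise is awaited before the value comes back.
            return page.evaluate(javascript)
        finally:
            browser.close()
```

Unlike the CLI, page.goto() needs a full URL, so a call would look like evaluate_js("https://github.com", js, bypass_csp=True).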

The new test case loads GitHub and attempts to import an external module. With the --bypass-csp flag the import succeeds. You can run the following on the current version of shot-scraper to see it fail:

shot-scraper javascript github.com "async () => { await import('https://cdn.jsdelivr.net/npm/left-pad/+esm'); return 'content-security-policy ignored' }"

With this change applied, the command above still fails unless --bypass-csp is passed.
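The with-and-without comparison can also be scripted. The sketch below is one way to do that from Python, assuming shot-scraper is installed and on the PATH. The helper names build_command and run are hypothetical; the command-line arguments are the ones used in this PR.

```python
import subprocess

# The same snippet as in the command above: import a module from a CDN
# that GitHub's Content-Security-Policy would normally block.
JS = """
async () => {
  await import('https://cdn.jsdelivr.net/npm/left-pad/+esm');
  return 'content-security-policy ignored';
}
"""


def build_command(url: str, javascript: str, bypass_csp: bool = False) -> list[str]:
    """Return the argv list for `shot-scraper javascript`."""
    cmd = ["shot-scraper", "javascript", url, javascript]
    if bypass_csp:
        cmd.append("--bypass-csp")
    return cmd


def run(url: str, javascript: str, bypass_csp: bool = False) -> subprocess.CompletedProcess:
    """Run the command and capture stdout and stderr as text."""
    return subprocess.run(
        build_command(url, javascript, bypass_csp),
        capture_output=True,
        text=True,
    )
```

Calling run("github.com", JS) and run("github.com", JS, bypass_csp=True) and comparing the captured stdout and stderr should show the import failing in the first case and succeeding in the second.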

I have added the flag to the documentation but have not yet added a new documentation block to the JavaScript page for it. I'm happy to write up an example if you're keen to accept this PR.

I'm also interested in feedback on how the help text should be phrased. I went with the simplest possible phrasing, but it does assume that the user knows what a CSP is.


Documentation preview: https://shot-scraper--116.org.readthedocs.build/en/116/

simonw commented 11 months ago

Thanks for this - I'm going to land it as-is and then update the documentation. It's a really good implementation.

simonw commented 11 months ago

Manually tested this like so. First, without the flag:

shot-scraper javascript github.com "
  async () => {
    await import('https://cdn.jsdelivr.net/npm/left-pad/+esm');
    return 'content-security-policy ignored';
  }
"

Error: TypeError: Failed to fetch dynamically imported module: https://cdn.jsdelivr.net/npm/left-pad/+esm

Then with the flag:

shot-scraper javascript github.com "
  async () => {
    await import('https://cdn.jsdelivr.net/npm/left-pad/+esm');
    return 'content-security-policy ignored';
  }
" --bypass-csp

"content-security-policy ignored"

simonw commented 11 months ago

Documentation: https://shot-scraper.datasette.io/en/stable/javascript.html#bypassing-content-security-policy-headers

jamesking commented 11 months ago

@sesh @simonw thank you for this!