gildas-lormeau / single-file-cli

CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
GNU Affero General Public License v3.0

Stacktrace when running SingleFile CLI for a particular website #12

Closed andrewdbate closed 1 year ago

andrewdbate commented 1 year ago

I installed SingleFile from the Docker image in the usual way:

docker pull capsulecode/singlefile
docker tag capsulecode/singlefile singlefile

I can then save a webpage as usual and everything works as expected:

docker run -v $(pwd):/usr/src/app/out singlefile "https://www.wikipedia.org" --dump-content=false

However, when I try to save the webpage https://cdn.docbook.org/release/xsl-nons/1.79.2/webhelp/docs/ch03s02.html I get a stacktrace:

$ docker run -v $(pwd):/usr/src/app/out singlefile "https://cdn.docbook.org/release/xsl-nons/1.79.2/webhelp/docs/ch03s02.html" --dump-content=false
Evaluation failed: SyntaxError: Invalid regular expression: /^var(--/: Unterminated group
    at String.match (<anonymous>)
    at String.startsWith (https://cdn.docbook.org/release/xsl-nons/1.79.2/webhelp/docs/search/nwSearchFnt.js:871:15)
    at <anonymous>:1:217815
    at Array.find (<anonymous>)
    at <anonymous>:1:217804
    at Array.find (<anonymous>)
    at tp (<anonymous>:1:217793)
    at Object.removeUnusedFonts (<anonymous>:1:300431)
    at jm.removeUnusedFonts (<anonymous>:1:267447)
    at Nm.executeTask (<anonymous>:1:243712) URL: https://cdn.docbook.org/release/xsl-nons/1.79.2/webhelp/docs/ch03s02.html
Stack: Error: Evaluation failed: SyntaxError: Invalid regular expression: /^var(--/: Unterminated group
    at String.match (<anonymous>)
    at String.startsWith (https://cdn.docbook.org/release/xsl-nons/1.79.2/webhelp/docs/search/nwSearchFnt.js:871:15)
    at <anonymous>:1:217815
    at Array.find (<anonymous>)
    at <anonymous>:1:217804
    at Array.find (<anonymous>)
    at tp (<anonymous>:1:217793)
    at Object.removeUnusedFonts (<anonymous>:1:300431)
    at jm.removeUnusedFonts (<anonymous>:1:267447)
    at Nm.executeTask (<anonymous>:1:243712)
    at ExecutionContext._evaluateInternal (/usr/src/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:221:19)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async ExecutionContext.evaluate (/usr/src/app/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:110:16)
    at async getPageData (/usr/src/app/node_modules/single-file-cli/back-ends/puppeteer.js:139:10)
    at async Object.exports.getPageData (/usr/src/app/node_modules/single-file-cli/back-ends/puppeteer.js:51:10)
    at async capturePage (/usr/src/app/node_modules/single-file-cli/single-file-cli-api.js:254:20)
    at async runNextTask (/usr/src/app/node_modules/single-file-cli/single-file-cli-api.js:175:20)
    at async Promise.all (index 0)
    at async capture (/usr/src/app/node_modules/single-file-cli/single-file-cli-api.js:126:2)
    at async run (/usr/src/app/node_modules/single-file-cli/single-file:54:2)

I do not get an error when I try to save the same page using the SingleFile extension for Firefox. Thanks!

gildas-lormeau commented 1 year ago

Thank you, I confirm that I was able to reproduce the issue. The page's nwSearchFnt.js overrides String#startsWith with a buggy implementation: judging by the stack trace, it hands the prefix to String#match without escaping it, so "var(--" ends up as the invalid regular expression /^var(--/. To work around this, I now use String#match where the error was thrown, and it seems to be OK. Note that this bug does not occur in the extensions because their scripts run in a separate context where the page cannot override built-in APIs.
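
For readers who want to see the failure mode in isolation, here is a minimal sketch. It assumes the page's override builds a regular expression from the unescaped prefix, which is what the stack trace suggests. `buggyStartsWith` is a hypothetical reconstruction, not the actual code from nwSearchFnt.js, and `safeStartsWith` is a hypothetical helper shown only for illustration, not the change made in SingleFile.

```javascript
// Hypothetical reconstruction of a page-supplied override (assumption:
// the real nwSearchFnt.js code may differ in detail).
function buggyStartsWith(prefix) {
  // A string passed to String#match is compiled as a regular expression,
  // so the "(" in "var(--" opens a group that is never closed.
  return this.match("^" + prefix) !== null;
}

// Hypothetical helper that compares the leading characters directly and
// therefore does not call String.prototype.startsWith at all.
function safeStartsWith(value, prefix) {
  return value.slice(0, prefix.length) === prefix;
}

function demo() {
  const original = String.prototype.startsWith;
  // Simulate what the page does to the shared String prototype.
  String.prototype.startsWith = buggyStartsWith;
  const result = { overrideThrew: false, safe: false };
  try {
    try {
      "var(--main-font)".startsWith("var(--");
    } catch (error) {
      result.overrideThrew = error instanceof SyntaxError;
    }
    result.safe = safeStartsWith("var(--main-font)", "var(--");
  } finally {
    // Always restore the native method so later code is unaffected.
    String.prototype.startsWith = original;
  }
  return result;
}

console.log(demo());
```

A helper like this only sidesteps the one overridden method. Code injected into the page still shares every other built-in with the page's scripts, which is why running in a separate context, as the extensions do, is the more robust protection.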