webrecorder / wabac.js

wabac.js - Web Archive Browsing Augmentation Client
https://replayweb.page
GNU Affero General Public License v3.0

When a JavaScript is UTF-8 encoded with a BOM (Byte Order Marker), the acorn parser in 'jsrewriter.js' throws an exception #154

Closed ARiedijk closed 5 months ago

ARiedijk commented 6 months ago

When a JavaScript file is UTF-8 encoded with a BOM (byte order mark), the acorn parser in 'jsrewriter.js' throws an exception.

The exception is 'acorn parsing failed on: const globalConst = "const"; let globalLet = "let"; let globalLetChanged = "x"; ...'

A possible solution is to remove this BOM from the text variable.

parseLetConstGlobals(text) {
  const res = acorn.parse(this.removeBOM(text), {ecmaVersion: "latest"});
  //                      ^^^^^^^^^^^^^^^^^^^^

For illustration, here is a removeBOM function that does nothing more than replace a leading BOM ('\uFEFF') with an empty string.

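function removeBOM(text) {
  // Strip a leading U+FEFF, or the UTF-8 BOM bytes (EF BB BF) mis-decoded as Latin-1.
  return text.replace(/^\uFEFF/, "").replace(/^\u00EF?\u00BB\u00BF/, "");
}

const test = "\uFEFFThis is a test";
removeBOM(test) === "This is a test"; // true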
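
To show where the two patterns above come from, here is a minimal sketch that builds the same script text from raw bytes in two ways. The helper name stripBOM and the sample script are made up for this illustration and are not part of wabac.js; the sketch assumes an environment with TextEncoder and TextDecoder (modern browsers, recent Node.js).

```javascript
// Hypothetical helper for illustration only; not part of wabac.js.
function stripBOM(text) {
  // U+FEFF: what the BOM looks like after correct UTF-8 decoding.
  if (text.charCodeAt(0) === 0xfeff) {
    return text.slice(1);
  }
  // "\u00EF\u00BB\u00BF": the three UTF-8 BOM bytes read one byte per character.
  if (text.startsWith("\u00EF\u00BB\u00BF")) {
    return text.slice(3);
  }
  return text;
}

// UTF-8 bytes of a small script with a BOM (EF BB BF) in front.
const source = "let a = 1;";
const bytes = new Uint8Array([0xef, 0xbb, 0xbf, ...new TextEncoder().encode(source)]);

// Decoded as UTF-8 with the BOM deliberately kept (ignoreBOM: true).
const asUtf8 = new TextDecoder("utf-8", { ignoreBOM: true }).decode(bytes);

// Decoded one byte per character, as if the text were Latin-1.
const asLatin1 = String.fromCharCode(...bytes);

// Both variants reduce to the original script text.
const cleanedUtf8 = stripBOM(asUtf8);
const cleanedLatin1 = stripBOM(asLatin1);
```

Note that a TextDecoder created without ignoreBOM: true drops the BOM on its own, so the BOM only reaches the parser when the text was decoded some other way or the BOM was kept on purpose.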


When JavaScript with a UTF-8 BOM cannot be parsed, its global let and const variables do not get the self. addition that wabac's parseLetConstGlobals function makes.

ikreymer commented 5 months ago

The approach in #160 should solve this more generally.