ben-sb / javascript-deobfuscator

General purpose JavaScript deobfuscator
https://deobfuscate.io
Apache License 2.0

The use of optional chaining operator (`?.`) breaks when parsing #16

Closed sgkoishi closed 2 years ago

sgkoishi commented 2 years ago
let a = null;
if (a?.toString() > 5) {
    console.log(1);
} else {
    console.log(2);
}

Error: [line:column]: Unexpected token "."
    at new JsError (path/javascript-deobfuscator/node_modules/shift-parser/dist/tokenizer.js:166:104)
    at GenericParser.createError (path/javascript-deobfuscator/node_modules/shift-parser/dist/tokenizer.js:297:14)
    at GenericParser.createUnexpected (path/javascript-deobfuscator/node_modules/shift-parser/dist/tokenizer.js:276:23)
    at GenericParser.parsePrimaryExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:2012:22)
    at GenericParser.parseLeftHandSideExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:1779:21)
    at GenericParser.parseUpdateExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:1656:26)
    at GenericParser.parseUnaryExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:1628:21)
    at GenericParser.parseExponentiationExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:1598:23)
    at GenericParser.parseBinaryExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:1547:23)
    at GenericParser.parseConditionalExpression (path/javascript-deobfuscator/node_modules/shift-parser/dist/parser.js:1494:23) {
  index: 1,
  line: 1,
  column: 1,
  parseErrorLine: 1,
  parseErrorColumn: 1,
  description: 'Unexpected token "."'
}

The same snippet also fails to parse at https://shift-ast.org/parser.html

ben-sb commented 2 years ago

Yeah that's the downside of using the Shift parser. Unfortunately there isn't any good way to fix it aside from adding support for newer JS features to Shift (likely complicated and not worth it) or rewriting the project to use Babel (which I may do at some point, but don't have the time currently).

sgkoishi commented 2 years ago

The patterns in my sample look like `""?.["toLowerCase"]()`, so I'm going to simply remove every `?.` here.
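For anyone wanting to try the same workaround, here is a minimal sketch of a pre-processing step that strips `?.` before the source is handed to the deobfuscator. The helper name `stripOptionalChaining` is hypothetical and not part of javascript-deobfuscator. It is a plain text replacement, so it assumes the chained values are never null or undefined (otherwise the rewritten code throws where the original would not), and it will also rewrite any `?.` that happens to appear inside a string literal or comment.

```javascript
// Hypothetical helper for illustration; not part of javascript-deobfuscator.
// Naive text-based removal of optional chaining so that a parser without
// support for `?.` can read the source.
function stripOptionalChaining(source) {
    return source
        // a?.[b] -> a[b] and a?.(b) -> a(b): drop the operator entirely.
        .replace(/\?\.(?=[\[(])/g, '')
        // a?.b -> a.b. The (?!\d) lookahead leaves ternaries such as
        // `x?.5:1` untouched, since `?.` followed by a digit is not
        // an optional chain.
        .replace(/\?\.(?!\d)/g, '.');
}

console.log(stripOptionalChaining('""?.["toLowerCase"]()'));
console.log(stripOptionalChaining('if (a?.toString() > 5) {}'));
```

With the snippet from the original report, `a` is `null`, so the stripped version would throw at runtime; that only matters if the output is executed rather than just read.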

sgkoishi commented 2 years ago

Not a regression, but a separate problem: string simplification does not work when the strings contain \r or \n:

console.log(("b" + "a" + + "a" + "a").toLowerCase())
console.log("b" + "a" + ' ' + "a" + "a")
console.log("b" + "a" + '"' + "a" + "a")
console.log("\n" + "" + '"' + "a" + "a")

Result

console.log("baNaNa".toLowerCase());
console.log("ba aa");
console.log('ba"aa');
console.log("\n" + "" + '"' + "a" + "a");

Most other characters, such as \t or regular \a\b\c\d\e\f\g, work correctly.
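To make the expected result concrete, here is a rough sketch of folding a list of literal parts into one string literal. `foldStringParts` is a hypothetical name, and using `JSON.stringify` to re-escape the result is my assumption for illustration; it is not how the deobfuscator is implemented.

```javascript
// Hypothetical helper for illustration; not the deobfuscator's actual code.
function foldStringParts(parts) {
    // Concatenate with normal JavaScript semantics, so non-string parts
    // (for example the NaN produced by +"a") are coerced to strings.
    const value = parts.reduce((acc, part) => acc + part, '');
    // JSON.stringify emits a double-quoted literal and escapes newline,
    // carriage return, tab and the double quote.
    return JSON.stringify(value);
}

// The fourth case above: the newline should survive as an escape sequence.
console.log(foldStringParts(["\n", "", '"', "a", "a"]));
// The first case above: the unary plus turns "a" into NaN.
console.log(foldStringParts(["b", "a", +"a", "a"]));
```

Under these assumptions, the fourth line would fold to a single literal containing `\n` followed by an escaped quote and `aa`, instead of being left as a concatenation.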

ben-sb commented 2 years ago

Newline and return characters should be fixed in 4f57a5c424ffc340883ceb4ef67ab8d3313278b3