Open kinnison opened 9 years ago
This is one of those real world regexp leniency issues: the issue is probably [^\2] which is not allowed in standard Ecmascript: \0 evaluates to a NUL character, while all non-zero decimal escapes evaluate to an integer (E5.1 Section 15.10.2.11). Inside a character class, it is a SyntaxError if the result is not a character (E5.1 Section 15.10.2.19). In other words, anything other than \0 evaluates to an integer which causes a SyntaxError inside a class.
Longer term I don't think it's feasible for Duktape's regexp engine to simultaneously be low footprint and support non-standard regexp idioms, so the two alternatives are:
I'm planning to start with the second approach, as soon as I get some time to work on it :-)
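The decimal-escape rule described above can be probed from script. The following is only a minimal sketch: `compiles` is a hypothetical helper (not part of Duktape or any library), and the result for the plain `[^\2]` case depends on the engine. Engines that implement the web-compatibility extensions (Chrome, Node.js) are expected to accept it, while a strict E5.1 engine is expected to reject it. The `u` flag case assumes an ES2015+ engine.

```javascript
// Hypothetical helper: returns true if the engine accepts the pattern,
// false if constructing it throws a SyntaxError.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected, so re-throw
  }
}

// \0 inside a class is a NUL character, which is valid in standard Ecmascript.
console.log('[^\\0]      ->', compiles('[^\\0]'));

// \2 inside a class is a non-zero decimal escape: a SyntaxError per E5.1,
// but accepted by lenient engines as a legacy escape (engine-dependent).
console.log('[^\\2]      ->', compiles('[^\\2]'));

// Assumption: ES2015+ engine. With the 'u' flag the lenient behaviour is
// switched off, so the same class is rejected there as well.
console.log('[^\\2] (u)  ->', compiles('[^\\2]', 'u'));
```

Running this on both a browser engine and Duktape is one way to confirm whether a failing pattern is a leniency issue rather than a genuine bug.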
With the number of real world regexps I've seen Duktape reject, it actually amazes me that something like NetSurf is feasible. It makes me think the leniency issue isn't as big a deal as it seems at first.
@fatcerberus As it stands, our efforts are not resulting in something compatible with what's out there; but we're working hard to make it go. As @svaarala builds more and more possibility into duktape, so our efforts go further :-)
I was about to open a new issue, but I think it's better to just comment on this one.
I have a project built with WebPack that uses the XRegExp npm package. Duktape had problems with these two regular expressions:
/\[(\^?)]/
and
/(\()(?!\?)|\\([1-9]\d*)|\\[\s\S]|\[(?:[^\\\]]|\\[\s\S])*]/g
In both cases the issue is that the regex wants to match a literal ], but it doesn't escape it. So technically the regex is invalid.
When I change the first one to
/\[(\^?)\]/
and the second one to
/(\()(?!\?)|\\([1-9]\d*)|\\[\s\S]|\[(?:[^\\\]]|\\[\s\S])*\]/g
Duktape is happy :)
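As a quick sanity check, the escaped forms can be exercised against sample input. This is a small sketch only: the variable names and the sample strings are made up for illustration, and the unescaped variant is built from a string inside a try/catch so that a strict engine does not reject the whole script at parse time.

```javascript
// Escaped versions of the two XRegExp patterns quoted above; the names are
// hypothetical and chosen only for this example.
var bracketWithCaret = /\[(\^?)\]/;
var tokenPattern = /(\()(?!\?)|\\([1-9]\d*)|\\[\s\S]|\[(?:[^\\\]]|\\[\s\S])*\]/g;

// The first pattern captures an optional ^ between literal brackets.
var caret = bracketWithCaret.exec('[^]')[1];
console.log(JSON.stringify(caret));

// The second pattern finds character classes and capturing-group openers.
var tokens = 'a[bc](d'.match(tokenPattern);
console.log(JSON.stringify(tokens));

// The unescaped form is only accepted by lenient engines, so compile it
// defensively instead of writing it as a literal.
var unescapedAccepted;
try {
  new RegExp('\\[(\\^?)]');
  unescapedAccepted = true;
} catch (e) {
  unescapedAccepted = false;
}
console.log('unescaped ] accepted:', unescapedAccepted);
```

Since the escaped forms are valid in both strict and lenient engines, escaping the closing bracket is the portable choice.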
@jfahrenkrug Support for literal curlies was added a while back - I could look into adding support for literal brackets too. They don't appear in unescaped form as commonly as the curly braces but still pop up from time to time :)
@jfahrenkrug #871; the [ matching is more complicated because it needs actual lookahead, but accepting a literal closing bracket ] should be trivial.
@svaarala That's great news, thank you! Meanwhile I've opened a PR for XRegExp: https://github.com/slevithan/xregexp/pull/141
Hi, when we use the selfDefending option of javascript-obfuscator, the regexp engine also chokes on the generated code:
function outputMyNumber(showThis) {
  console.log(showThis);
}
function hello() {
  var getNumber = function (b) { return b + 3; }
  var thenumber = getNumber(2);
  outputMyNumber(thenumber);
}
hello()
obfuscator: { options: { compact: true, controlFlowFlattening: false, deadCodeInjection: false, debugProtection: false, debugProtectionInterval: false, disableConsoleOutput: false, identifierNamesGenerator: 'hexadecimal', log: true, renameGlobals: false, rotateStringArray: true, selfDefending: true, stringArray: true, stringArrayEncoding: 'base64', stringArrayThreshold: 0.75, unicodeEscapeSequence: false, target: 'browser' },
function outputMyNumber(_0x393fc5){console'log';}function hello(){var _0x246e30=function(){var _0xb2ee37=!![];return function(_0x2a1c07,_0x39a818){var _0x47081b=_0xb2ee37?function(){if(_0x39a818){var _0x461e24=_0x39a818'apply';_0x39a818=null;return _0x461e24;}}:function(){};_0xb2ee37=![];return _0x47081b;};}();var _0x78fdcf=function(_0x408fe3){var _0x1a1f63=_0x246e30(this,function(){var _0x49075f=function(){return'\x64\x65\x76';},_0x57969d=function(){return'\x77\x69\x6e\x64\x6f\x77';};var _0x23f862=function(){var _0x45fb57=new RegExp('\w+\s(){\w+\s['|"].+['|"];?\s}');return!_0x45fb57'\x74\x65\x73\x74';};var _0x532ab5=function(){var _0x130be7=new RegExp('\x28\x5c\x5c\x5b\x78\x7c\x75\x5d\x28\x5c\x77\x29\x7b\x32\x2c\x34\x7d\x29\x2b');return _0x130be7'\x74\x65\x73\x74';};var _0x4e7fde=function(_0x47bb06){var _0x35aeea=-0x1>>0x1+0xff%0x0;if(_0x47bb06'\x69\x6e\x64\x65\x78\x4f\x66'){_0x54673c(_0x47bb06);}};var _0x54673c=function(_0x192be0){var _0x2eb20c=-0x4>>0x1+0xff%0x0;if(_0x192be0'\x69\x6e\x64\x65\x78\x4f\x66'!==_0x2eb20c){_0x4e7fde(_0x192be0);}};if(!_0x23f862()){if(!_0x532ab5()){_0x4e7fde('\x69\x6e\x64\u0435\x78\x4f\x66');}else{_0x4e7fde('\x69\x6e\x64\x65\x78\x4f\x66');}}else{_0x4e7fde('\x69\x6e\x64\u0435\x78\x4f\x66');}});_0x1a1f63();return _0x408fe3+0x3;};var _0x333e8b=_0x78fdcf(0x2);outputMyNumber(_0x333e8b);}hello();
Embedded in some JS which deals with CSS, we have encountered the following regular expression:
Chrome accepts this happily, but duktape complains of an invalid decimal escape.
This prevents some sites from loading their javascript frameworks.
Any help would be gratefully received.