chakra-core / ChakraCore

ChakraCore is an open source Javascript engine with a C API.
MIT License
9.12k stars 1.2k forks source link

Spec deviation in RegExp.prototype[@@search]() #6612

Open MadProbe opened 3 years ago

MadProbe commented 3 years ago

Version

1.12.0-master

Flags

-ES6Experimental

Test code

{
    let i = 0;
    const proxy = new Proxy({ exec() { return null }, lastIndex: 0 }, {
        set(target, p) {
            print(`Should not be called`);
            print(`Property:`, p, `value:`, value);
            i++;
        }
    })
    RegExp.prototype[Symbol.search].call(proxy);
    print(i);
}

Excepted result

0

Actual result

Should not be called
Property: lastIndex value: 0
Should not be called
Property: lastIndex value: 0
2

Possible solution

It seems like that JavascriptRegExp::EntrySymbolSearch doesn't perform some checks needed by spec. https://github.com/chakra-core/ChakraCore/blob/e6c8f78f505cb332538425b7a9309aee217869ef/lib/Runtime/Library/JavascriptRegularExpression.cpp#L812-L819 Possibly i will open a pr after #6610 is merged and it will also fix #5388

rhuanjl commented 3 years ago

I've been looking at totally replacing our implementation of RegExp prototype methods - frankly I think they're all a mess.

There are 4 interrelated problems:

  1. The underlying RegExp matcher has never been updated to properly support unicode mode matching - so with the u flag lone surrogates get inappropriately matched.
  2. the ES6 reform to the RegExp prototype is behind a flag and not entirely finished flag is -ES6RegexPrototypeProperties
  3. the ES6 RegExp symbols are behind a flag and not entirely finished flag is -ES6RegexSymbols
  4. The original ES5 RegExp system was heavily optimised in various ways that make the logic kinda hard to follow - also various actions have jit fast paths so you can think you've fixed a spec deviation only to find that when the code is jitted it breaks from spec again.

I opened issue #6390 to track all work to solve these points.

But I found the more I looked at it the bigger the issue it was. I was considering re-implementing all of the RegExp methods as self-hosted javascript to make them simpler to maintain (this is probably still my preference) so we can just replace all of the mess but it's slipped down my priorities list - if you want to give it a stab feel free (and feel free to ask me for any help you need) but just be warned it's big and messy.