chakra-core / ChakraCore

ChakraCore is an open source Javascript engine with a C API.
MIT License
9.07k stars 1.19k forks source link

Poor error message for unicode escapes in keywords #2872

Open dilijev opened 7 years ago

dilijev commented 7 years ago

Replacing the a in var with a unicode escape of ASCII a (\u0061) yields v\u0061r in place of this keyword.

## Source
print((function () { v\u0061r a = 1; return a; })())

┌─────────────┬──────────────────────────────────────────────────────────┐
│ d8          │                                                          │
│ node        │ SyntaxError: Keyword must not contain escaped characters │
├─────────────┼──────────────────────────────────────────────────────────┤
│ sm          │                                                          │
│             │ SyntaxError: var is an invalid identifier:               │
├─────────────┼──────────────────────────────────────────────────────────┤
│ node-ch     │                                                          │
│ ch-2.0      │ SyntaxError: Syntax error                                │
│ ch-master   │                                                          │
└─────────────┴──────────────────────────────────────────────────────────┘

IMO: SM's error is confusing, and our current error is completely useless. v8 has a good error here.

/cc @curtisman @bterlson


Some info to help investigation (line numbers as of commit 3656c42cd750fcd7883e10159a919309abed0046):

With the above repro, try putting a breakpoint in lib\Parser\kwd-swtch.h line 395 (code follows):

     case 'v':
        if (identifyKwds)

Step through from here and find out how the error is decided, and then determine how we would emit a better error message without hurting performance.

It appears we ultimately return JsErrorScriptCompile from lib\Jsrt\Jsrt.cpp line 2993:

        if (scriptFunction == nullptr)
        {
            PERFORM_JSRT_TTD_RECORD_ACTION_NOT_IMPLEMENTED(scriptContext);

            HandleScriptCompileError(scriptContext, &se);
            return JsErrorScriptCompile;
        }

Basically we know at that point that the script failed to compile, and that's all the information we have. We would need to return a more specific JsError to improve the error message.

mintunitish commented 5 years ago

Is this issue still open?

dilijev commented 5 years ago

@mintunitish This issue is still up for grabs if you'd like to work on it. @sharmasuraj0123 was looking into it but has been focused on other work items.

rhuanjl commented 5 years ago

My recent PR #5761 has changed the behaviour for this specific case but it could likely do with being better, the above case now prints:

SyntaxError: Unexpected invalid identifier 'var' after '{'

So it now correctly highlights that there is an invalid identifier BUT it doesn't point out why (i.e. the unicode escape).

The error for this case is created here: https://github.com/Microsoft/ChakraCore/blob/42ba7b92c5ad3db50af9a362fdda5f1ab776b583/lib/Parser/Parse.cpp#L3591-L3594

There may be other cases where an error arrises from an invalid identifier - if so they will likely still print an even less helpful message.

@mintunitish are you going to take a look at improving this? (If not I may look at it further)

euler16 commented 5 years ago

Hi, is anyone working on this issue? If not then can I work on this? I am getting started with open source , so might need some help on this

dilijev commented 5 years ago

AFAIK no one is working on this at the moment. Feel free!

euler16 commented 5 years ago

Thanks @dilijev. could you give me pointers to start?

rhuanjl commented 5 years ago

@euler16 I'm not a member of the team but as I've contributed on a related point recently here's a bit on where to start:

  1. Make sure you can build and test ChakraCore master locally before you start

  2. Make a Js file with this case in and run it with ch to confirm current output - check my comment above it's changed since the issue was opened, it would also be good to try using unicode escapes in other JS constructions and see what errors they make

  3. start looking at the C++ that produces this - I've pointed out the line the error message for this case comes from in my other comment above - you probably should look further at what input can give tokens of type tkNone also take a look at kwd-swtch.h - this is where the token type is set.

  4. once you've got that far hopefully you'll have an idea where to go, I'd be happy to offer more pointers at that stage if not.

euler16 commented 5 years ago

Thank you @rhuanjl .

dilijev commented 5 years ago

@sharmasuraj0123 Could you provide some links to source code that came up during your investigation?

sharmasuraj0123 commented 5 years ago

@euler16 I didn't get a whole lot of time to completely research into the issue, but it seems that @rhuanjl 's comments on this issue are accurate. Narrowing down the exactly where and why this specific error is thrown would be the right approach. Also look at how different error messages are being constructed and thrown. You may also want to look into rterrors.h - this is where the error message strings are stored

euler16 commented 5 years ago

@rhuanjl I didn't get what you meant by "Make a Js file with this case in and run it with ch to confirm current output". By 'ch' do you mean this ?

euler16 commented 5 years ago

Got it, you meant ChakraCore CLI! Sorry for that question. This is the first time I am working with ChakraCore. But I am not able to find instructions to install ch .

rhuanjl commented 5 years ago

If you build ChakraCore from source it will build ch.

On windows it will be in the build output folder as ch.exe on Linux or macOS it will be in the build output folder as just ch

Zee0227 commented 2 weeks ago

console.log((function () { var a = 1; return a; })()); To fix this issue, simply use the var keyword without any escape sequences, The above code hopefully correctly declares the variable a and returns its value without any syntax errors.