svaarala / duktape

Duktape - embeddable Javascript engine with a focus on portability and compact footprint
MIT License
5.94k stars 514 forks source link

Syntax errors include line number in message #460

Open fatcerberus opened 8 years ago

fatcerberus commented 8 years ago

This is just a minor cosmetic issue, but since there is ongoing work on error messages for 1.4.0 already, I figured now would be a good time to address it. A syntax error currently includes a line number as part of the error text:

SyntaxError: empty expression not allowed (line 30)

Since the SyntaxError object already includes the relevant filename and line number in its trace data or directly as own properties if those are disabled, this seems redundant.

svaarala commented 8 years ago

On the other hand sometimes people will just print a string coerced version of the error which only includes the name and the message (.fileName and .lineNumber are non-standard):

((o) Duktape 1.3.99 (v1.3.0-301-g9a26d9d)
duk> try { eval('foo=') } catch (e) { print(e); }
SyntaxError: parse error (line 1)

This is one reason why syntax error line and JSON error offset were added to the message, so that they'd be trivial to spot:

duk> JSON.parse('{"foo" "bar"}')
SyntaxError: invalid json (at offset 8)
    duk_bi_json.c:213
    parse  native strict preventsyield
    global input:1 preventsyield

These have both been requested explicitly in the past, to be contained directly within the error message (one liner).

I guess one could make the same argument for other kinds of errors, so current behavior is not maybe very consistent. Going more into that direction, it'd be relatively simple to either:

Both are non-compliant to some extent however:

The current behavior is compliant because Duktape can freely decide the messages of its own errors (syntax errors in this case).

fatcerberus commented 8 years ago

For what it's worth I think having the line number in JSON errors does make sense. For code though it's a little jarring since runtime errors don't include it.

On the other hand having the line number for syntax errors in eval() code does make sense, so... I guess keep the current behavior?

Whatever the case I don't think line numbers should be added to other errors. The properties are non-standard but available in most extant engines, so it's reasonable I think for a user to treat them as a given.

svaarala commented 8 years ago

Yeah, I'm not sure what the best behavior would be - there are competing concerns. Just quickly checking other engines for SyntaxError behavior:

V8 doesn't seem to blame syntaxerror on the source code (maybe I'm doing something wrong?):

> try { eval('\n\nfoo='); } catch (e) { console.log(e); console.log(e.fileName, e.lineNumber); console.log(e.stack); }
[SyntaxError: Unexpected end of input]
undefined undefined
SyntaxError: Unexpected end of input
    at repl:1:7
    at REPLServer.defaultEval (repl.js:132:27)
    at bound (domain.js:254:14)
    at REPLServer.runBound [as eval] (domain.js:267:12)
    at REPLServer.<anonymous> (repl.js:279:12)
    at REPLServer.emit (events.js:107:17)
    at REPLServer.Interface._onLine (readline.js:214:10)
    at REPLServer.Interface._line (readline.js:553:8)
    at REPLServer.Interface._ttyWrite (readline.js:830:14)
    at ReadStream.onkeypress (readline.js:109:10)

Rhino uses an interesting .fileName (no .stack):

js> try { eval('\n\nfoo='); } catch (e) { print(e); print(e.fileName, e.lineNumber); print(e.stack); }
SyntaxError: Unexpected end of file
<stdin>#2(eval) 3

I guess I'm inclined to keep the current behavior until a clearly better approach is evident :)

svaarala commented 8 years ago

JSON.parse() syntax error on V8:

> try { JSON.parse('{ "foo": 123') } catch (e) { console.log(e); console.log(e.fileName, e.lineNumber); console.log(e.stack); }
[SyntaxError: Unexpected end of input]
undefined undefined
SyntaxError: Unexpected end of input
    at Object.parse (native)
    at repl:1:12
    at REPLServer.defaultEval (repl.js:132:27)
    at bound (domain.js:254:14)
    at REPLServer.runBound [as eval] (domain.js:267:12)
    at REPLServer.<anonymous> (repl.js:279:12)
    at REPLServer.emit (events.js:107:17)
    at REPLServer.Interface._onLine (readline.js:214:10)
    at REPLServer.Interface._line (readline.js:553:8)
    at REPLServer.Interface._ttyWrite (readline.js:830:14)

Rhino:

js> try { JSON.parse('{ "foo": 123') } catch (e) { print(e); print(e.fileName, e.lineNumber); print(e.stack); }
SyntaxError: Unterminated object literal
 0
fatcerberus commented 8 years ago

By the way I found a bug related to this - when syntax errors are rethrown (I think...) the line number is stripped. I discovered this by deliberately causing an error in an eval() and letting the debugger intercept it - the Throw notify is missing the line number in the message.

svaarala commented 8 years ago

I think what happens is that the Throw notify shows the Error before the line number is added to the message. Right now that happens in the compiler: it catches the error (after it's initially thrown), appends the line number, and rethrows it.

Now that the error augmentation process understands syntax errors, I could move that into the augmentation code, so it'd happen before the throw notify. It would also be simpler.

svaarala commented 8 years ago

Ok, so #462 should have that change and you should see the modified message in a Throw notify.

fatcerberus commented 8 years ago

Somewhat related from a code-structure standpoint: Compilation errors can be rethrown multiple times on their way out of the compiler, leading to sometimes 3-4 identical "fatal" Throw notifys in a row. How hard would that be to fix so that only the last one halts execution?

svaarala commented 8 years ago

I guess I see catching and rethrowing an ordinary part of error handling (like we discussed re: the Throw notify) so it's very difficult for the Throw notify to provide a meaningful indication of whether an error will be ultimately caught or not - what does that mean if there are e.g. conditional rethrows along the catch stack?

One would need to classify all catch sites, both C and Ecmascript, as either truly catching the error or just "rethrowing" the error (or a related error). Ultimately that information would need to be given by the code setting up the catch point because it depends on what the catching code really does. For C code it'd be possible to provide such a flag when doing a protected call, but Ecmascript catchers would need some custom syntax which seems like an overkill.

Given all that, I think trying to fix any one case might be possible but the problem will never go away because the "will not be caught permanently" (or something similar) information is simply not always available.

But regading the compilation errors: eliminating all rethrow sites would be fixable by moving all the stuff those rethrow sites do either into the original throw site or into the shared error throw handling. Both would at least initially seem awkward from a modularity standpoint. I'm also not very excited about reducing rethrows here and there because they're useful for modularity and the process will never be complete because the problem is unsolvable :-)

Another approach entirely would be to avoid trying to make that "will this be caught" judgement call, and instead provide more information about the active catchers to the debug client so that it could make that call with whatever heuristics are useful. There isn't always much information about a catch site though if it's in C code, so I'm not sure how well that would work.

fatcerberus commented 8 years ago

The root of the problem in this case is that protected call sites don't count as a catch clause, which is the criteria used to determine whether the error is fatal.

I didn't mean to suggest restructuring the whole compiler to minimize rethrows, that would be insane. And I certainly appreciate modularity. :) A good general-purpose fix I think would be to add a mechanism where certain protected sections could be marked as "catching" so that the debugger doesn't intercept them. You'd still get 3-4 Throw notifys this way, but only the final one, the one that propagates out of the compiler, would cause execution to stop,

fatcerberus commented 8 years ago

To clarify, the complaint in this case isn't about the duplicate notifys - the same would be true even in Ecmascript code with catch-and-rethrow logic. I'm fine with that. The issue is that, unlike with an Ecmascript catch, the compiler-internal catches aren't seen by the Throw notify check and so it is sometimes necessary to send several Resume commands to power through all the rethrows.

As mentioned in a code comment, the fact that protected calls are ignored by the catch check is usually desirable, but in certain cases (like here), it would be useful to optionally be able to make a protected call which is considered "catching".

svaarala commented 8 years ago

Ah ok, so the problem is that execution pauses on a supposedly uncaught error multiple times.

We could add an internal (?) duk_rethrow() which would indicate that you're conceptually rethrowing a caught exception and such a rethrow shouldn't be caught. That would make the error propagate to an actual error site without throw notify / pause handling.

svaarala commented 8 years ago

Oops, "shouldn't be caught" above should be "shouldn't cause pausing".

fatcerberus commented 8 years ago

That would probably work. We could also use that when rethrowing an error at the end of a finally clause, to prevent a second pause on the same (uncaught) error.