we-like-parsers / cpython

Here we work on integrating pegen into CPython; use branch 'pegen'

Other

1 stars 0 forks source link

Labeled Exceptions for pegen - Better error-reporting in CPython #123

Closed lysnikolaou closed 4 years ago

lysnikolaou commented 4 years ago

This proposal is about implementing labeled exceptions as a mechanism for better SyntaxError reporting in CPython. I first heard about labeled exception by @pablogsal in https://github.com/we-like-parsers/cpython/issues/43#issuecomment-610116884, where he includes a link to https://arxiv.org/pdf/1405.6646.pdf, which is a paper that completely describes these features.

Labeled exceptions in Pegen

Labeled exceptions are the closest thing to an exception-handling mechanism that a PEG parser can support. At the moment, a rule can either succeed (the input can be parsed) or fail (the input can either not be parsed or there was an error in helper code). The labeled exceptions mechanism introduces some grammar constructs, which allow to fail with a certain label, that can later be used to denote what kind of failure caused the parser to abort and output a corresponding error message, thus giving a better description of the error to the user.

In pegen we could introduce an additional C struct that would be part of the parser state, similar to the perrdetail struct, that the current parser uses. It would look something like this:

typedef struct {
    int error_set;
    int label;
    int lineno;
    int offset;
    int token;
    int expected;
} PegenErrorDetails;

This struct could later be used to find out what error occurred. For this, a mapping between labels and error messages would need to be implemented as well, probably in the form of an array of LabelMessageMapping structs. BUT, first things first!

New grammar syntax

New `throw` operator

A new throw operator (spelled '^') has to be implemented, which accepts a label as a parameter and causes the rule to fail, while setting an exception and the corresponding label. An example in python.gram could be:

statement[asdl_seq*]:
    | a=compound_stmt { _PyPegen_singleton_seq(p, a) }
    | simple_stmt
    | ^ InvalidStatement

If the parser were to exit with the InvalidStatement label set, it would denote that a statement could not be parsed, but the error was not caught by one of the rules in the call tree of either compound_stmt_rule or simple_stmt_rule. This could then be used (together with the location information) for the error message.

Parametrized `bar` operator

We also need to introduce a new parametrized bar operator (spelled | ^ (label1, label2) alt). This accepts a set of labels as input. If the parametrized bar operator is used, the alternative following it should only be tried, in case one of the prior alternatives failed with one of the listed labels. For example:

assignment:
    | a=target b=augassign c=(yield_expr | star_expressions) {
         _Py_AugAssign(a, b->kind, c, EXTRA) }
    | ^ (InvalidTarget) invalid_assignment

target[expr_ty] (memo):
    | a=t_primary '.' b=NAME !t_lookahead { _Py_Attribute(a, b->v.Name.id, Store, EXTRA) }
    | a=t_primary '[' b=slices ']' !t_lookahead { _Py_Subscript(a, b, Store, EXTRA) }
    | t_atom
    | ^ InvalidTarget

Here, we know that the rule invalid_assignment only checks for invalid targets and thus should only be tried out, in case the first alternative of assignment failed with the InvalidTarget label. Other possible labels could be InvalidAugassign, in case the target was parsed, but the operator following it wasn't a valid augassign operator.

Parametrized terminal nodes

Terminal nodes should optionally accept a label as an argument, which is the label that needs to be set, in case the terminal node could not be parsed. Take the following rule for the if statements for example:

if_stmt[stmt_ty]:
    | 'if' a=named_expression ':'^NoColonAfterIf b=block c=elif_stmt { _Py_If(a, b, CHECK(_PyPegen_singleton_seq(p, c)), EXTRA) }
    | 'if' a=named_expression ':'^NoColorAfterIf b=block c=[else_block] { _Py_If(a, b, c, EXTRA) }

In case a colon fails to be parsed after an if and a named_expression have both been successfully parsed, it should raise an error and set the NoColorAfterIf label, which would need to be associated with an error message like SyntaxError: expected ':' after an if expression.

Implementation details

I think that the easiest way to do all of this is to implement code that raises exceptions in Parser/pegen/pegen.c, which would be called by the parser for the throw operator and when a terminal node cannot be parsed. The helper code would need to:

Compute which token failed, in case the error was a parsing failure (and the corresponding location)
Find out what token was expected at this position
Set p->errors.error_set to 1.
Set p->errors.label to the correct label (which would probably be a parameter to the function)

Later, if the parser exits with a NULL node, the helper code in _PyPegen_run_parser will check to see if p->errors.error_set is equal to 1 and, if it is, it will call code that generates the correct SyntaxError.

Also needed is a helper function that checks to see if p->errors.label is one of the labels in a given set. The parser would call this function for the parametrised bar operator, in order to determine whether it should go on with the current alternative or not.

Conclusion and things to consider

I think that if we implement all this, we should be able to make the new PEG-based parser much better in reporting correct (location-wise) and user-friendly error messages. But there is still one thing to consider. Even if all proves to be successful, I don't think this could ever be better than a hand-written parser, which is essentially what Python/ast.c is. There is going to be some error messages we cannot reproduce using these mechanisms, so we will soon have to answer this: Do we want to write additional helper code to check for corner cases and reproduce those error messages or do we want to go with the generic SyntaxError: invalid syntax message, in case we cannot do this in the grammar itself?

gvanrossum commented 4 years ago

Hi @lysnikolaou, I've slept on it and I think I understand better now. I'm changing terminology on you to make things clearer to myself (and hopefully to other Pythonistas not familiar with the terminology from the paper).

We introduce a new class of objects called exceptions (you called them labels) that can be thrown and caught during the parsing process. A rule can catch an exception that was thrown by any of its alternatives. An exception that isn't caught is propagated out (like exceptions in Python). By convention, exceptions are identifiers starting with a capital letter.
The throw operator -- spelled | ^Exception -- must be the last alternative of its rule. It makes the rule fail as usual, and this can be recovered by a calling rule finding another alternative that works (presumably?), or by a calling rule with a matching catch operator:
The catch operator -- spelled | ^(Exception, ...) error_rule -- must also be the last alternative of its rule (not sure?). When it is reached and the most recently thrown exception matches one of the given exceptions, it will attempt to match error_rule which could produce some kind of error message as its action (or it could fail to match anything; what happens then?). I suppose in the general case there could be multiple catch operators on a single rule, to catch different exceptions with different error_rules.
The "parameterized terminal node" seems to be a shorthand for a throw operator. In particular, I believe that dadada ':'^Exception bababa is a shorthand for dadada (':' | ^Exception) bababa which is just an anonymous rule whose last alternative is a throw. I'm not sure how important this is, or why it should only be allowed for terminal nodes. Anyway, its semantics seem to follow from this equivalence.
If an exception is thrown but never caught, the top-level code of the parser should emit an error message specific to the thrown exception. But if a exception is caught successfully (meaning the error_rule after the catch operator that caught it succeeded) I suppose the exception is cleared.
In the future this mechanism could be extended to include error recovery, where an error_rule produces an error message but still succeeds, and parsing continues after whatever the error_rule matched. The error message and location are then appended to a list of errors, and at the end we can report all errors together, like "real" compilers do.

How much does this match your proposal? It seems I have some open questions regarding whether a thrown exception can by cleared by a successful match (without a catch operator) at an outer level. I also am not sure whether it ever makes sense to have a regular alternative following a catch operator.

Finally, I worry that the proposed spelling of the operators may be confusing. We use ^ as the cut operator already, and while cut cannot occur at the start of an alternative, it still feels confusing to overload it so heavily. The "parameterized terminal node" seems an outright ambiguity, since I can't see how the metagrammar could distinguish dadada ^ foo bababa (where foo is an exception) from dadada ^ foo bababa where foo is another rule name.

lysnikolaou commented 4 years ago

Hey hey @gvanrossum,

first of all thanks for taking the time to go through this and making all the improvements in terminology. Indeed, most of the operators are easier to digest using your names.

Most of what you say is right, except for a few cases. Let me go through all the points, where my understanding of things differs from yours:

The catch operator -- spelled | ^(Exception, ...) error_rule -- must also be the last alternative of its rule (not sure?). When it is reached and the most recently thrown exception matches one of the given exceptions, it will attempt to match error_rule which could produce some kind of error message as its action (or it could fail to match anything; what happens then?). I suppose in the general case there could be multiple catch operators on a single rule, to catch different exceptions with different error_rules.

The analogy to a catch is right. I feel that there are some things that need further explanation though. Most importantly, this does not need to be used for error_rules only. It can also be used for alternatives that only need to be expanded, in case some specific exception was thrown. This also means that the catch operator doesn't need to be used in the last alternative. In fact, there can also be multiple catch operators in a rule. I feel that I unintentionally "trapped" you in believing that this can only be used with error_rules with my example in the first post, so let me try to give a better one:

First of all, remember that the rule for import_from targets is the following:

import_from_targets[asdl_seq*]:
    | '(' import_from_as_names [','] ')'
    | import_from_as_names
    | '*'

Let me re-order the alternatives (for the purposes of this example):

import_from_targets[asdl_seq*]:
    | import_from_as_names
    | '(' import_from_as_names [','] ')'
    | '*'

Now we annotate the import_from_as_names rule and all its children with exceptions. The grammar looks like this:

import_from_as_names[asdl_seq*]:
    | ','.import_from_as_name+
import_from_as_name[alias_ty]:
    | (NAME | ^InvalidImportFromTarget) ['as' (NAME | ^InvalidImportFromAlias)]

Going back to the import_from_targets rule we would need to catch the exception InvalidImportFromTarget, so that we expand the later alternatives, but not the exception InvalidImportFromAlias, because that would mean that the other alternatives will always fail, due to the first NAME and the as keyword successfully parsing. Thus we would need to do something like this:

import_from_targets[asdl_seq*]:
    | import_from_as_names
    | ^(InvalidImportFromTarget) '(' import_from_as_names [','] ')'
    | ^(InvalidImportFromTarget) '*'

I hope this example makes the whole thing a bit more clear.

We use ^ as the cut operator already, and while cut cannot occur at the start of an alternative, it still feels confusing to overload it so heavily.

I'm not sure if I understand you correctly, but we currently use ~ for the cut operator and not ^, so I don't think there's a problem there.

Some more remarks:

I believe that dadada ':'^Exception bababa is a shorthand for dadada (':' | ^Exception) bababa which is just an anonymous rule whose last alternative is a throw. I'm not sure how important this is, or why it should only be allowed for terminal nodes.

Exactly right! terminal^InvalidTerminal is just syntactic sugar for (terminal | ^InvalidTerminal). The idea is to only allow it for terminal nodes, because rules can use the throw operator as their last alternative. What I mean is that we don't really need this:

A: B C^Label
B: 'a'
C: 'b'

because we can always do this, which would be the same thing:

A: B C
B: 'a'
C: 'b' | ^Label

Anyway, I don't think this is particularly important, so we can just disregard it altogether.

In the future this mechanism could be extended to include error recovery, where an error_rule produces an error message but still succeeds, and parsing continues after whatever the error_rule matched. The error message and location are then appended to a list of errors, and at the end we can report all errors together, like "real" compilers do.

That is a great idea! We can indeed use these mechanisms to implement error recovery as well.

gvanrossum commented 4 years ago

Whoops, you're right, I misremembered our cut notation. I don't know what I was thinking.

Your example has clarified things a lot, but now I have new questions.

It still seems somewhat confusing that the same character (^) is used for both throw and catch.
It looks like the example doesn't really need the exception machinery, because it works just fine without it. Would adding exceptions like you suggest improve the user experience? I could also imagine just adding cut operators to the example, e.g. after 'as' and after '('.
If we were to add the throw operators as you indicate in the example but not the catch operators, what would happen? If the first alternative (| import_from_as_names) threw an exception, would the next alternative (| '(' import_from_as_names [','] ')') still be tried? Or does throw act similar to cut here, causing the subsequent non-catch alternatives to be skipped altogether? If the latter, I would presume that the exception "bubbles up" out of the rule that didn't catch it, and it would keep bubbling up without trying any further non-catch alternatives until it reaches a rule with a matching catch, or until it exits from the start rule. Right?
If the second alternative (| ^(InvalidImportFromTarget) '(' import_from_as_names [','] ')') is taken but that throws InvalidImportFromTarget from import_from_as_names, does the next alternative (|^(InvalidImportFromTarget) '*') get tried (because it catches the same exception)? This would seem wrong for this example, since at that point we've already parsed ( so there is no way that * can match.
I wonder if we can show the semantics more clearly using Python code equivalent to the parsing rules (like I did in class ToyParser in my blog post, but for your import_from_targets example), by manually adding try/except blocks representing the throw and catch operators?

Fascinating stuff!

lysnikolaou commented 4 years ago

You're raising some very interesting points that are mostly up for discussion.

It still seems somewhat confusing that the same character (^) is used for both throw and catch.

I did some thinking about this, before writing up the proposal. I also considered using < for throw, because in western left-to-right reading it to the back, thus to the top (analogous to why I chose ^), and > for catch, bacause it's the opposite of <. For some reason, which I can't really explain, I went with the same operator for both, but we could certainly revisit this part of the proposal.

It looks like the example doesn't really need the exception machinery, because it works just fine without it. Would adding exceptions like you suggest improve the user experience? I could also imagine just adding cut operators to the example, e.g. after 'as' and after '('.

The idea was that we could catch the InvalidImportFromAlias in the top-level code and output a better error message for this, instead of the generic invalid syntax that is currently shown. Maybe something like:

>>> from a import b as c.d
>>> from a import b as c.d
  File "<stdin>", line 1
    from a import b as c.d
                        ^
SyntaxError: invalid alias for import target

If we were to add the throw operators as you indicate in the example but not the catch operators, what would happen? If the first alternative (| import_from_as_names) threw an exception, would the next alternative (| '(' import_from_as_names [','] ')') still be tried? Or does throw act similar to cut here, causing the subsequent non-catch alternatives to be skipped altogether? If the latter, I would presume that the exception "bubbles up" out of the rule that didn't catch it, and it would keep bubbling up without trying any further non-catch alternatives until it reaches a rule with a matching catch, or until it exits from the start rule. Right?

That's a very very good question, which I haven't given much though, though I should. I'd really like to hear your opinions on this and also @pablogsal's. The way I see it, we really ought to go with the first of your alternatives (alternative with no caught exceptions is still being tried out). This seems like the most non-intrusive thing to do. The | operator will just continue to work like it does now. It won't care, if an exception is set or not. It will just try to parse the alternative and fail or succeed based on that. I guess we could say that this is some kind of catch-everything but do not clear the exception. I feel that, if we were to go with the second alternative, we would need to really poison the grammar with catch operators all over the place, while the other one allows us to only use it wherever we need to.

If the second alternative (| ^(InvalidImportFromTarget) '(' import_from_as_names [','] ')') is taken but that throws InvalidImportFromTarget from import_from_as_names, does the next alternative ( |^(InvalidImportFromTarget) '') get tried (because it catches the same exception)? This would seem wrong for this example, since at that point we've already parsed ( so there is no way that can match.

Good point! We would have to use exceptions together with the cut operator in this case, in order to completely skip the third alternative:

import_from_targets[asdl_seq*]:
    | import_from_as_names
    | ^(InvalidImportFromTarget) '(' ~ import_from_as_names [','] ')'
    | ^(InvalidImportFromTarget) '*'

I wonder if we can show the semantics more clearly using Python code equivalent to the parsing rules (like I did in class ToyParser in my blog post, but for your import_from_targets example), by manually adding try/except blocks representing the throw and catch operators?

Great idea! These blog posts and the way you were presenting PEG features using Python code made it easy to follow and understand everything. I could try and do something like that, but probably not before early next week. I could even try to write a small blog post for a change, which I haven't done and seems like a nice challenge.

Fascinating stuff!

Indeed! I'm really excited about this!

lysnikolaou commented 4 years ago

I've been thinking more and more about this and I came to realise that allowing throw operators to exist in any alternative and even after other atoms, that need to be parsed for the throw to be reached, would actually be useful. It would very closely resemble the way we currently report errors, but with the very important distinction, that we would be able to catch and unset the exception.

Let's consider the following rule we use to distinguish valid and invalid targets:

star_target[expr_ty] (memo):
    | '*' a=(!'*' star_target) {
        _Py_Starred(CHECK(_PyPegen_set_expr_context(p, a, Store)), Store, EXTRA) }
    | a=t_primary '.' b=NAME !t_lookahead { _Py_Attribute(a, b->v.Name.id, Store, EXTRA) }
    | a=t_primary '[' b=slices ']' !t_lookahead { _Py_Subscript(a, b, Store, EXTRA) }
    | star_atom

This currently accepts valid targets and disregards invalid ones, but we would really like it to be able to report the type of the invalid target, in order for the error message to be more descriptive. With the current design, we would do something like this:

star_target[expr_ty] (memo):
    | '*' a=(!'*' star_target) {
        _Py_Starred(CHECK(_PyPegen_set_expr_context(p, a, Store)), Store, EXTRA) }
    | a=t_primary '.' b=NAME !t_lookahead { _Py_Attribute(a, b->v.Name.id, Store, EXTRA) }
    | a=t_primary '[' b=slices ']' !t_lookahead { _Py_Subscript(a, b, Store, EXTRA) }
    | star_atom !t_lookahead
    | a=star_expression { RAISE_SYNTAX_ERROR("cannot assign to %s", _PyPegen_get_expr_type(a)) }

That is currently not possible though, because star_targets get parsed in a (star_targets '=') loop in the assignment rule so that chained assignments get correctly handled, which means that this would set an error for a = False as well, since False would first be tried out as a star_target. This would fail and it would set a SyntaxError, which we wouldn't be able to unset later.

But if we were able to do something like this (I'm omitting the actions, which would need a bit more work):

assignment[stmt_ty]:
    | chained_assignment

chained_assignment:
    | a=star_targets '=' b=chained_assignment
    | ^(InvalidStarTarget) (yield_expr | star_expressions) !'='

star_target[expr_ty] (memo):
    | '*' a=(!'*' star_target) {
        _Py_Starred(CHECK(_PyPegen_set_expr_context(p, a, Store)), Store, EXTRA) }
    | a=t_primary '.' b=NAME !t_lookahead { _Py_Attribute(a, b->v.Name.id, Store, EXTRA) }
    | a=t_primary '[' b=slices ']' !t_lookahead { _Py_Subscript(a, b, Store, EXTRA) }
    | star_atom !t_lookahead
    | a=expression ^InvalidStarTarget(a)

we would be able to unset the exception InvalidStarTarget and parse False as a star_expression in the last alternative, but we would also be able to get the type of the invalid target and output a descriptive error message, in the case an invalid target was indeed parsed.

Another thought that's been on my mind, since I started working on this, is that it would somewhat significantly raise the bar for new contributors. I don't really know how much of a concern this should be (I don't think there are very many people that aspire to contribute to the parser, when they're starting out), but I'd like to hear your opinion on this.

Another update: I opened lysnikolaou/lepegen, which serves as a "playground" where I'm trying stuff out with a Python hand-written parser, like @gvanrossum suggested above. Take a look at it, if you want, and feel free to push anything you want to it.

gvanrossum commented 4 years ago

Yeah, it makes sense to be able to throw exceptions from any point. Consider an alternative (the rule of which it is part doesn't matter):

    | foo bar baz {action}

We translate roughly that into code like this (skipping mark operations):

    if self.foo() and self.bar() and self.baz():
        return action

Now we add a throw operator; there's no action:

    | foo bar baz ^SomeError

I think the translation will be:

    if self.foo() and self.bar() and self.baz():
        self.throw(SomeError)
        return None

I'm assuming that we're not allowing to catch the error in the same rule, although maybe it doesn't even matter.

Now some other rule (that directly or indirectly calls the rule containing the above alternative) can catch the exception. That rule's grammar has an alternative:

    | ^(SomeError) more stuff {action}

This could translate into something like:

    if self.catch(SomeError) and self.more() and self.stuff():
        return action

I notice in your toy implementation (which I can't git-push to?) that throwing an exception basically erases any other exception that had previously been thrown, and catching an exception clears the pending exception state completely. I would like to propose to use a stack here, where throwing an exception pushes it onto the stack, and catching it removes it from the stack. Oh, and catching something that's present deeper in the stack should pop anything that's on top of it. Something like this (omitting the OO stuff):

    if error in stack:
        while True:
            if stack.pop() == error:
                break
        return True
    return False

What that should do when catching two errors that are both in the stack I don't know -- probably pop until a match on either one, but we should decide by considering an example and seeing what's the most useful. There's also the question of what to do if the same error occurs multiple times on the stack -- the above code only pops until the one nearest the stack top is revealed, but we could also consider popping until it's no longer found anywhere on the stack.

An implication of my proposed behavior is that if catch() returns True but the subsequent more() and stuff() fails, the exception is still cleared. So this would work like a Python except clause. (Hm, would we need re-throw syntax? :-) That's different though from your code which only clears the exception when the rest of the alternative is successfully recognized. I'm not sure what's best, we should consider examples. (Oh how I miss a whiteboard!)

One thought: If catch() unconditionally clears the exception, you get only one chance to catch it. Of course, we could always write | ^(SomeError) (more | stuff), which would catch the error and then succeed if either more or stuff succeeds. And to re-throw if both fail, we could write | ^(SomeError) (more | stuff | ^SomeError).

FWIW, I'm sticking to your notation for now, but I'm itching to write it more verbosely, e.g.

    | foo bar a=baz THROW<SomeError, a>

and

    | CATCH<SomeError> more stuff {action}

just so it's clear what's what. (Another possible set of terminology: fail and recover?)

We also need to look into what we should report if multiple exceptions are pending on the stack. I guess we should report the error found at the very bottom of the stack? Because in the stack model that's the oldest (unrecovered) one.

Maybe I should try to read that paper. :-)

gvanrossum commented 4 years ago

PS. Could throw just be a function called in the action? There can't be any other action when we throw.

gvanrossum commented 4 years ago

Huh, skimming the paper I realized that our current strategy of reporting an error at the farthest point the tokenizer has reached is the original approach to error reporting by Ford. I still have to understand the semantics the paper proposes for throw and catch though...

lysnikolaou commented 4 years ago

which I can't git-push to

Whoops, forgot to invite you and @pablogsal as collaborators. It's done now.

gvanrossum commented 4 years ago

I have an important question (I think).

If we have a grammar like this:

start:
    | foo foo
    | bar bar
foo:
    | NAME
    | NUMBER THROW<Error>
bar:
    | NAME
    | NUMBER

And we parse this input:

abc 123

When this is parsed, the first alternative (foo foo) matches but on the second token (123) the foo rule throws.

Question: does the thrown error cause the second alternative for start (bar bar) to be tried or not? It would succeed, so this matters.

If the answer is yes, I have a follow-up question. Suppose the bar rule is different, for example:

# Rules 'start' and 'foo' are the same as above
bar:
    | NUMBER

This would fail on the input abc 123. Now the question is, at that point, does the thrown error prevail (causing start to appear to throw the error that was thrown by foo), or is the thrown error discarded and do we end up with a "standard" failure?

If the answer to the first question is no (i.e., if foo foo throws an error, bar bar is not tried), it seems that the only way to stop an error, once thrown, is an explicit catch that matches the error.

There's a minor follow-up question in this case: can an explicit throw be caught by a catch it the same rule? E.g.

start:
    | foo THROW<Error>
    | CATCH<Error> bar
# The following don't throw:
foo: ...
bar: ...

@lysnikolaou Can you find the answers to my questions in the paper?

There's a final edge case that I really hope is ruled out by the answers. Suppose we have this grammar:

start:
    | foo
    | bar
foo:
    | NAME THROW<Error>
bar:
    | NUMBER
    | CATCH<Error> NAME

And let's try this on the input

abc

Suppose the answer to my first question was yes -- then the error thrown by foo causes bar to be attempted, which reaches the catch operator. My question is then: Would this catch operator catch the error? If so, that would feel disturbing: It would mean that a catch operator could be reached by an error thrown somewhere before the rule containing the catch was even entered.

All in all I am hoping the answer to my first question is "no" -- it would make it much simpler to understand the semantics.

lysnikolaou commented 4 years ago

These are all very good question I've been pondering with in the last days.

@lysnikolaou Can you find the answers to my questions in the paper?

First of all, if I understand everything correctly, the paper answers none of these questions. And that's because the paper does not propose the catch operator to be optional, but for it to accompany every ordered choice operator in the grammar. I don't think that's really something we can do at this point, since we would have to change every single | operator in `python.gram.

What we will have to do is decide on what the default behaviour should be, if an ordered choice operator is not followed by a catch. The way I see it, we have three different possibilities here:

The | operator catches all exceptions by default, which means that the alternative is tried out and the set exception is either cleared or maintained.
The | operator catches no labels by default, which means that the ordered alternative immediately fails, if an exception is set.
We pre-define a subset of all the exceptions and let the ordered choice operator only catch this specific subset of exceptions by default.

If we choose 1, then the answer to your question is yes. If we choose 2, it's no. I agree that 2 would have the clearest semantics of the three, but that would also include much more work for everything to work correctly, since we would have to include more catch operators. Generally, I think that this is maybe the most important design decision we will have to make and I hope that trying stuff out in lepegen will give us a clearer picture of what's the most sensible thing to do.

gvanrossum commented 4 years ago

I think I've figured out what the paper does. Labels are propagated unless caught. A "default" failure throws a special label fail which is caught by the "default" / operator (i.e. without catch clause). Once caught, a label stays caught even if the alternative picked by the catch clause fails -- whatever that alternative produces (success, fail or label) holds. But if that's not success, following alternatives will still be tried (allowed to catch it). In a sense there's nothing special about the fail label except that it's thrown by a failing terminal and it's caught by any | that doesn't have an explicit catch clause.

So now I have my answers:

In my foo foo | bar bar example, if the second foo throws, bar bar is not tried because there's no catch clause.
In my "minor follow-up" question, a throw can certainly be caught by a catch in a later alternative in the same rule.
In my "final edge case", bar will not be tried so it won't have to handle the error thrown by foo.

This makes me happy.

gvanrossum commented 4 years ago

There are some other details I glanced from the paper as well:

Repetitions only succeed when the repeated part throws the default fail label -- if it throws some other label, that prevails.
A negative lookahead succeeds if the attempted rule throws the default fail label only. If the attempted rule succeeds, the negative lookahead throws the default fail label, and if the attempted rule throws some other label, that gets thrown.
I think we have a resolution for my question regarding your original import_from_targets example:
```
import_from_targets[asdl_seq*]:
| import_from_as_names
| ^(InvalidImportFromTarget) '(' import_from_as_names [','] ')'
| ^(InvalidImportFromTarget) '*'
```
Here, if InvalidImportFromTarget is thrown by the parenthesized import_from_as_names in the second alternative, it will certainly be caught by the third alternative's catch clause (and then the attempted match on '*' will fail "normally", i.e. throw fail).

gvanrossum commented 4 years ago

One comment from the paper worries me.

In the case of Typed Lua grammar, we defined almost 50 different labels

Assuming that even Typed Lua has a simpler grammar than Python, this worries me. And IIUC they don't catch any labels -- they only *throw** them. At least the example in the paper doesn't. And it seems that catching labels is only useful once you want to do error recovery.

I just looked at https://github.com/python/cpython/pull/19911 ("produce specialised errors for del") and ISTM that we probably would have to do the same amount of work -- the only potential saving would be defining the invalid_del_target rule and its action ({ RAISE_SYNTAX_ERROR(...) }), but instead of that we'd end up having to use some other way to map the label to the error message, so even that's marginal.

Which is why I reviewed PR 19911 and said I'd approve and merge it once the conflict is resolved.

lysnikolaou commented 4 years ago

Well, it seems that my understanding of the paper wasn't deep enough, although I had read it multiple times. Thanks for reading it through and clarifying all this stuff. It is very useful.

Assuming that even Typed Lua has a simpler grammar than Python, this worries me.

Indeed, this is somewhat worrying. It is certainly going to be lots of work and every time we discuss about it, it is even more apparent to me that it will need an exceptional amount of beforehand thinking on what labels to introduce, where to throw and where to catch them.

just looked at python#19911 ("produce specialised errors for del") and ISTM that we probably would have to do the same amount of work -- the only potential saving would be defining the invalid_del_target rule and its action ({ RAISE_SYNTAX_ERROR(...) }), but instead of that we'd end up having to use some other way to map the label to the error message, so even that's marginal.

That is correct.

I guess the most important question that we need to ask ourselves is this: Do we want to go through with it, in case we want to also implement error recovery in the future and maybe target Python 3.10? If not, we would need to do something similar to python#19911 for star_targets. One minor problem with this is that currently the error location of SyntaxErrors thrown in the grammar is often not correct, because the whole expression needs to be parsed, before the action calls RAISE_SYNTAX_ERROR.

pablogsal commented 4 years ago

Assuming that even Typed Lua has a simpler grammar than Python, this worries me.

In this paper they mention something on those lines:

An anonymous reviewer pointed out that manually inserted cut points and labeled failures make the grammar completely unreadable. One has to find a way of conveying the information in another way. This can, for example, be done in semantic procedures, which in some parser generators (such as the author’s ”Mouse”) are separated from the grammar

Seems that with actions, lookaheads, cuts and potentially with labels the grammar can be more challenging to read :S

One minor problem with this is that currently the error location of SyntaxErrors thrown in the grammar is often not correct, because the whole expression needs to be parsed, before the action calls RAISE_SYNTAX_ERROR.

Yeah, this is what worries me the most. Some times it is somehow acceptable, but some others is completely off :(

This also connects with the problem of displaying the grammar in https://github.com/python/cpython/pull/19969.

swhobbit commented 4 years ago

I'm not a language jockey, merely a moderately experienced user. Here's my take writing code interpreted by Python. Call it the 10,000 foot user view.

do we want to go with the generic SyntaxError: invalid syntax message

NO, you don't. You need to hunt down and kill every occurrence of the SyntaxError: invalid syntax message, replacing each with a coherent detailed error message that says what the interpreter finds invalid.

Every time that message pops up, you have the person at the terminal having to parse the line and area around it (lest the error was caused by previous line) from scratch. Consider:

   >>> if a == b
  File "<stdin>", line 1
    if a == b
            ^
SyntaxError: invalid syntax

Pardon me, but b (indeed everything on actually on the line) is valid syntax. That the semicolon is missing is the problem, and in a helpful world python have announced missing colon on if statement.

In short, don't say there is a problem, say what the problem is. Anything less costs the the developer (your end user) time.

pablogsal commented 4 years ago

That the semicolon is missing is the problem

Sadly, is not that simple. The parser does not know that that is the thing you know is missing and the parser indeed can expect many other things there. For instance, take a look at this issue:

https://bugs.python.org/issue40599

In there we explore some way for the parser to communicate what it was expecting and as you can see the errors don't get much better and some times it can be misleading.

swhobbit commented 4 years ago

Sadly, is not that simple.

This is not news. 90% of programs are often error checking/reporting.

I did scan https://bugs.python.org/issue40599. I'll note those more verbose messages, even the ones out in left field, give me more of a hint than the bare syntax error message.

Quoting that issue:

The beauty of Python's detail-free syntax error is that it doesn't tell you what it expects ... and it requires the user to understand how the parser works to interpret the error message.

That's only a good attitude if the interpreter authors are also the only end users. And they are not. To be plain: There is no beauty in that error message for an end user.

When my spouse is working in Python as she fools around with the Google Voice Kit on a Raspberry Pi, she does not know or care how the parser works. (She's MIT Alum, but Mat Sci, not Comp Sci.) And the interpreter must generate informative messages both for her and for high freshmen.

Here's another take on that error from a different interpreter:

 2 +++  if rc = r
Error 14 running SAMPLE EXEC, line 2: Incomplete DO/SELECT/IF

That's from REXX running under IBM VM/SP 5, which was released in the mid-1980s. (It's literally running on a Museum piece, https://www.livingcomputers.org/)

The Python team needs stop saying it "is not that simple". What needs to said is:

Even though the language is more complex today, the Interpreter *will* issue messages as good (if not far better) as code from 35 years ago.

pablogsal commented 4 years ago

The Python team needs stop saying it "is not that simple".

Unfortunately, we are all volunteers here doing the best we can in our free time with no sponsoring of any kind. REXX was developed at IBM by an IBM employee paid by IBM.

Python, on the other hand, is maintained by unpaid volunteers and the source is not only open but anyone can contribute. If someone thinks is fundamental that Python has much better error messages, is welcomed to contribute so we can have much better error messages. Telling unpaid volunteers that they need to do something is sadly not very encouraging and certainly not the best way of convincing.

swhobbit commented 4 years ago

Paid versus OSS doesn't always make it better. REXX is from same people who gave us:

IEC161I 062-086,APPC,APPC,SYS00003,,,SYS1.APPCSI,SYS1.APPCSI.DATA, IEF196I IEC161I CATALOG.OS390.MASTER

In any case, the topic of this defect is not software for hire versus OSS software, but rather Better error-reporting in CPython. I've offered the input of an experienced user. Do with it what you will.

pablogsal commented 4 years ago

Thanks, we certainly will take into account.

srkunze commented 4 years ago

Even though I can understand the intention behind @swhobbit’s comment, I fear that it is unrealistic.

The simple reason is that in order to help in case of a syntax error, the parser would need to know exactly what a coder wanted to achieve in terms of the AST.

One way to do this would be to find common resolutions for syntax errors. So, I am thinking more in terms of a spelling correction/machine learning that tries to anticipate what the coder wanted in specific circumstances. I doubt, though, that there’s real data available to train some ML models.

Another way to enrich the parsing would be the proposed exception handling for grammars but they are based on the gut feeling of the language designers only and quite verbose.

swhobbit commented 4 years ago

The simple reason is that in order to help in case of a syntax error, the parser would need to know exactly what a coder wanted to achieve in terms of the AST.

This is incorrect. Zero information is worse than limited information. Don't let perfect be the enemy of good.

Even unique error numbers would be an improvement over what is printed now.

"Mr. Scott cannot give me exact figures, Admiral, so... I will make a guess." -- Mr. Spock (Star Trek IV)

gvanrossum commented 4 years ago

This issue has become toxic. Closing. We'll continue the discussion elsewhere.

we-like-parsers / cpython

Labeled Exceptions for pegen - Better error-reporting in CPython #123

Labeled exceptions in Pegen

New grammar syntax

New throw operator

Parametrized bar operator

Parametrized terminal nodes

Implementation details

Conclusion and things to consider

New `throw` operator

Parametrized `bar` operator