Closed MatthieuDartiailh closed 2 years ago
I actually figured the issue. I was not checking for the number of args in the invalid_arguments rule when we matched a generator expr. I will do my best to update the PR ASAP.
I have one issue remaining on which I would appreciate some insight @pablogsal @lysnikolaou which relates to the rule real_number
.
This rule is basically invoked when we have the following:
x = 1 - 1j
match x:
case 1 - 1j:
y = 0
In this case we do not have a simple real number which would be handled by signed_number
, and we will use the complex_number
rule which will call the real_number
rule. (line 8573 of https://raw.githubusercontent.com/python/cpython/main/Parser/parser.c). My issue is that currently we check that the real part of the literal is a complex number (through _PyPegen_ensure_real which calls PyComplex_CheckExact) which to me should never be true since I expect the real part of the literal to be parsed as either an int or a float. However it does work in CPython so I must be missing something.
I have one issue remaining on which I would appreciate some insight @pablogsal @lysnikolaou which relates to the rule
real_number
.
Will try to investigate today or tomorrow if @lysnikolaou doesn't do it before. We are currently releasing 3.10.4 and 3.9.12 as we had some problems with the latest release and it may take a while.
I'm not sure what the intent of that part of the grammar is, and sadly, I wasn't around when this was introduced. What I can say though, is that NUMBER
tokens and number expr_ty
s, as a result, contain the whole number, imaginary or real.
Even in the real_number
rule, the NUMBER
that gets parsed and will be passed to _PyPegen_ensure_real
will be the complex number in its entirety, and that's why PyComplex_CheckExact
returns true. You can check parsenumber_raw
for how this is done.
Does that answer your question?
Sorry for the noise.... I missed the absence of a leading ! in _PyPegen_ensure_real. I am not used enough to C. All good now.
In case it was not clear from the status change. This is now ready for review. I would like to get it in before picking up again #41
This attempts to mimic CPython behavior by calling invalid rule only in a second pass to speed up parsing on the happy path. It also includes changes https://github.com/python/cpython/commit/390459de6db1e68b79c0897cc88c0d562693ec5c to avoid calling invalid rules from within without invalid rules.
~~With those changes, I thought I would be able to get rid of having both immediate and deferred raising of exceptions (store vs raise) since looking at CPython it looks to me as if as soon as an exception is reported the parser exits. This fast exit is controlled by the error_indicator attribute on the parser state. However I hit one issue when enabling direct raising in invalid rules (so in CPython terms when looking for details about a syntax error):
f(i for i in range(10))
fails to parse because in t_primary the alternative a=t_primary b=genexp &t_lookahead fails and then the alternative a=t_primary '(' b=[arguments] ')' &t_lookahead trips through the invalid argument rule. This should parse and would mask any actual error.~~I would very much appreciate if somebody could shed some light on this issue.This will need some extra documentation and tests once the above has been addressed.