we-like-parsers / cpython

Here we work on integrating pegen into CPython; use branch 'pegen'
https://github.com/gvanrossum/pegen
Other
1 stars 0 forks source link

f-string parser: Check comments in f-strings #157

Open pablogsal opened 3 years ago

pablogsal commented 3 years ago

In test_fstring the test test_comments fails. We need to check if is due to regular tokenization problems and the test needs updating or we need to fix something regarding comments in f-strings

FFY00 commented 3 years ago

Old:

$ ./python -c 'f"{)#}"'
  File "<string>", line 1
    f"{)#}"
           ^
SyntaxError: f-string: unmatched ')'

New:

$ ./python -c 'f"{)#}"'
  File "<string>", line 1
    f"{)#}"
       ^
SyntaxError: closing parenthesis ')' does not match opening parenthesis '{'

I don't think the new error is good, and seems like a regression, right?

pablogsal commented 3 years ago

Yeah, ideally we should preserve the old error message: the new expression inside should be like a new parse completely regarding matchings.

isidentical commented 3 years ago

We already handle the comments inside the f-strings (it just raises an error), so this seems more of an error message discussion. I'd say the new error message is no worse than the old one, since it sort of makes sense. You probably wouldn't put # inside an expression, so the closing paranthesis is probably meant for the {.

lysnikolaou commented 2 years ago

I think that this would be solved by moving the parenstack into the mode struct which gets pushed/popped for every new tokenizer mode. In general, I think a bigger discussion is needed here for what stays in the top-level tok_state and what needs to be moved into _tokenizer_mode.

pablogsal commented 1 year ago

a bigger discussion is needed here for what stays in the top-level tok_state and what needs to be moved into _tokenizer_mode.

We can probably delay this since at the end this is about the error message and that doesn't necessarily need to be preserved

pablogsal commented 1 year ago

There is an actual problem here as we are allowing comments inside any f-string (multi-lined or not). We may want to change the PEP or the implementation.

CC: @isidentical @lysnikolaou what do you think?

isidentical commented 1 year ago

There is an actual problem here as we are allowing comments inside any f-string (multi-lined or not).

Can you give an example @pablogsal? If someone tries to use a comment inside an f-string expression, than anything after the comment (including the quote character itself) is considered invalid so I don't understand how it is possible to have a valid single-line f-string that also has a comment.

>>> f"{ 1 + 2 # comment }"
  File "<stdin>", line 1
    f"{ 1 + 2 # comment }"
              ^
SyntaxError: f-string expression part cannot include '#'
pablogsal commented 1 year ago

Ah, I forgot to specify how we are allowing them:

>>> f"{ 1 + 2 = # comment }"
  File "<stdin>", line 1
    f"{ 1 + 2 = # comment }"
                            ^
SyntaxError: f-string: expecting '}'

In this case, no error message is being triggered and the comment works. Before we were getting this:

>>> f"{ 1 + 2 = # comment }"
  File "<stdin>", line 1
    f"{ 1 + 2 = # comment }"
                ^
SyntaxError: f-string expression part cannot include '#'

I am using the fstring-grammar-rebased-after-sprint branch.

pablogsal commented 1 year ago

Hummmmm, I apologize, but I think I was using the wrong branch for some reason. I am checking again and seems that this doesn't happen in the last fstring-grammar-rebased-after-sprint