Closed lysnikolaou closed 4 years ago
I hope Pablo can review this... I'm not at all confident I understand the f-string code.
There's one thing that concerns me though -- why focus all attention on the column offset? Isn't the line number also subject to movement? E.g. in
f"""blah
{
expr)
}
"""
Don't we want the error to point to the )
rather than to the start of the f-string?
I hope Pablo can review this... I'm not at all confident I understand the f-string code.
There's one thing that concerns me though -- why focus all attention on the column offset? Isn't the line number also subject to movement? E.g. in
f"""blah { expr) } """
Don't we want the error to point to the
)
rather than to the start of the f-string?
That's pretty much impossible at the moment. All the shifting around only happens after the complete AST tree has been generated. Thus, if the parsing fails, these function don't even get called.
I think, that our best shot at getting this 100% right is rewriting the whole thing, which would be quite a big project. In my imagination, we could have a grammar for f-strings and in concatenate_strings
we would just concatenate the string literals together and then call the f-string parser on the big concatenated string if needed. In case we decide this is important, I could slowly start trying things out in that direction.
Thoughts, @gvanrossum @pablogsal?
why focus all attention on the column offset? Isn't the line number also subject to movement?
Because it seems that line number movement is ok, at least in comparison to what CPython does.
That's pretty much impossible at the moment. All the shifting around only happens after the complete AST tree has been generated. Thus, if the parsing fails, these function don't even get called.
Check out the WIP in: https://github.com/gvanrossum/pegen/pull/208. I have tried to provide better error messages for f-strings there. Even if we don't merge it in its entirety, that PR fixes some errors when parsing strings and not files (the filename is <string>
) that have syntax errors. This PR still needs some work to handle multi-line f-strings because now it does not point correctly to the line inside the f-string but it may be interesting.
I quickly checked the PR and the SHIFT_EXPR
and SHIFT_ARG
looks :+1: under a preliminary inspection but I need to certainly dedicate more time to understand what's going on here:
https://github.com/gvanrossum/pegen/pull/207/files#diff-e2161640949755a51e055a22cf4af0fdR489-R505
Could you maybe add some comments there for context meanwhile?
I think we should strive for parity with CPython first.
I agree that in the long run we may have to rewrite f-string parsing from scratch -- and Eric V Smith seems to agree (see https://bugs.python.org/issue39564), but that's going to require changes to the tokenizer. If this gets done in 3.9 we'll have to track that, but my money is on 3.10 or later, given the complexity of the edge cases (things like ...{x:4d}...
and ...{x = :.5g}...
-- and note that currently the unparenthesized walrus doesn't work, e.g. ...{x := 1}...
).
I think we should strive for parity with CPython first.
Good point.
but that's going to require changes to the tokenizer
I think it will also likely require us to maintain a stack of tokenizers (so you can alternate between parsing f-strings and "regular Python").
I quickly checked the PR and the
SHIFT_EXPR
andSHIFT_ARG
looks 👍 under a preliminary inspection but I need to certainly dedicate more time to understand what's going on here:https://github.com/gvanrossum/pegen/pull/207/files#diff-e2161640949755a51e055a22cf4af0fdR489-R505
Could you maybe add some comments there for context meanwhile?
Done.
Thanks a lot for the review @pablogsal!
I mark this as WIP, because I'm not confident enough that this covers all cases. Any ideas for more tests?
Fixes #203.