Open tomasr8 opened 1 month ago
Excuse me.
I am reading the source code of parse.c
ane notice the function assignment_rule
.
I am confused about the annotation of
// assignment:
// | NAME ':' expression ['=' annotated_rhs]
// | ('(' single_target ')' | single_subscript_attribute_target) ':' expression ['=' annotated_rhs]
// | ((star_targets '='))+ (yield_expr | star_expressions) !'=' TYPE_COMMENT?
// | single_target augassign ~ (yield_expr | star_expressions)
// | invalid_assignment
What's the meaning of these patterns, for example, ((star_targets '='))+ (yield_expr | star_expressions) !'=' TYPE_COMMENT?
What is star_targets
and star_expressions
? Could you provide some help to understand it?
See https://docs.python.org/3/reference/grammar.html#full-grammar-specification to understand the grammar. The star_targets
and star_expressions
are defined later in the document. CPython mixes EBNF and PEG syntax just for their own convenience.
As for the issue, I think you can replace it. On the other hand, I wonder whether this could cause a slight performance change where the parser needs to make another pass to subsitute the rule or not.
EDIT: nvm the performance shouldn't really be the bottleneck here (in the relevant PR, the occurrences were just outright replaced)
I reviewed several functions in parse.c
, like atom_rule
. I find it recursively checks the type of atom tokens, such as NAME
, True
, False
, None
, Number
, etc. It seem like really time-consuming.
Is it really nessesary to be designed like this?
Is it really nessesary to be designed like this?
If you want a more in-depth rationale, read PEP 617. If you can find a way to make an equivalent parser but simpler, we'll gladly accept it. Long story short, we moved from an LL(1) parser to a PEG one because left recursion is not supported by LL(k) parsers (actually, it can cause an infinite recursion on PEG parsers but LL(k) parsers do not allow them in their grammar). Another reason why we use a PEG parser is that a CFG parser would have ambiguous cases which a PEG does not by essence.
Thx for your explanation. I will take a deeper look at the parser.
I used to assume the parser works when executing python program. But when I checked CPython source codes, I recognized parsing Python codes only occurs before generating .pyc
. Based on this, the speed of parsing Python codes does not make a big deal. Right?
Depends on the use I'd say but here it's probably negligible, though I'm not very familiar with how the pieces are generated. Nonetheless, simplfying how the grammar looks for the user is a good thing IMO.
Feature or enhancement
Proposal:
I don't know all the nuances of the new grammar but I think that the
assignment
rule can be simplified in the following way:That is, just reusing
annotated_rhs
instead of writing out(yield_expr | star_expressions)
.There was a similar issue before that fixed some cases, but this one remained. I can submit a PR if this is considered worthwhile.
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
https://github.com/python/cpython/issues/116988