antlr / grammars-v4

Grammars written for ANTLR v4; expectation that the grammars are free of actions.
MIT License

Fix case *_ error #4280

Closed by Willie169 2 weeks ago

Willie169 commented 1 month ago

Fixes the case where "case [a, *_] if a == 0:" throws the error: rule soft_kw__not__wildcard failed predicate: {this.isnotEqualToCurrentTokenText("_")}?

Willie169 commented 1 month ago

Wait a minute. I forgot to test targets other than Java and to add a test for the fixed error.

kaby76 commented 1 month ago

None of the examples/*.py files tested as_pattern, closed_pattern, star_pattern, or double_star_pattern before. Yes, please add these so I can test the ambiguity. (The grammar before your changes has ambiguities.)

Willie169 commented 1 month ago

Ok, I will write examples for them.

Willie169 commented 1 month ago

I've added examples from the Python 3.12 standard library, examples written by me and ChatGPT, and a Cpp target. I am not sure whether all targets work correctly.

Willie169 commented 1 month ago

Tests not passed: _bootstrap_external.py, _test_eintr.py, dump.py, testinterpchannels.py, testinterpreters.py, test_array.py, test_builtin.py, test_clinic.py, test_compile.py, test_exceptions.py, test_frame.py, test_fstring.py, test_imaplib.py, test_logging.py, test_opcache.py, test_os.py, test_posix.py, test_regrtest.py, test_str.py, test_subprocess.py, test_sys.py, test_tabnanny.py, test_traceback.py, test_type_params.py, test_venv.py

RobEin commented 1 month ago

Fix that case [a, *_] if a == 0: throws error rule soft_kw__not__wildcard failed predicate: {this.isnotEqualToCurrentTokenText("_")}?

I was able to reproduce the error. This looks like a target independent bug. I'm working on it.

RobEin commented 1 month ago

Adding a CPP port is fine, although I am working on such a port as well.

The problem is that you are trying to overwrite the newer version with an older one, which causes a regression in many files. Your PR would replace the Python 3.12.6 version with the older Python 3.12.1 one (e.g. in README.md).

I don't see where you fixed the rule soft_kw__not__wildcard failed predicate error.

Please explain the many changes in PythonParser.g4.

Willie169 commented 1 month ago

The real change that solves the problem is:

star_pattern
    : STAR NAME;

The other changes are unnecessary and you can revert them. I think the error may occur because, once the token has been recognized as star_pattern and then as pattern_capture_target, the parser can't return to star_pattern when it finds out the token is not a soft_kw__not__wildcard.

Separately, escaped quote characters in f-strings can't be recognized. To reproduce:

f"\"", f'\''

I haven't solved it yet.
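
As a sanity check on the expected behaviour, here is a small sketch showing what CPython makes of these literals; the variable names are illustrative only:

```python
# Escaped quotes in the literal part of an f-string are ordinary escape sequences.
double_quoted = f"\""          # a single double-quote character
single_quoted = f'\''          # a single single-quote character

name = "x"
mixed = f"\"{name}\""          # escaped quotes around a replacement field

if __name__ == "__main__":
    print(double_quoted, single_quoted, mixed)
```

A lexer for f-strings therefore has to treat a backslash followed by the quote character as part of the string body rather than as the end of the literal.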
Btw, I haven't checked whether the Cpp port works, and I will be busy with exams from today's midterm until probably the 2nd GSAT mock test, so feel free to take it over or write another one.

kaby76 commented 1 month ago

Some fixes are in order.

RobEin commented 1 month ago

Thanks for the feedback about the two errors. I tested your modified star_pattern rule and it really does fix the "soft_kw__not__wildcard failed predicate" error:

star_pattern
 : '*' NAME;
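
A toy Python sketch of why the one-line rule can accept the same inputs. It assumes the earlier rule followed the shape of CPython's PEG grammar ('*' pattern_capture_target | '*' wildcard_pattern), which is an assumption and not something shown in this thread. The helper names are hypothetical, the lookahead that CPython attaches to capture targets is ignored, and this is not how ANTLR evaluates predicates:

```python
import keyword


def is_name(tok):
    # Rough stand-in for a NAME token: an identifier that is not a hard keyword.
    return tok.isidentifier() and not keyword.iskeyword(tok)


def star_pattern_two_alternatives(tokens):
    # Assumed earlier shape: '*' followed by a non-"_" name, or '*' followed by "_".
    if len(tokens) != 2 or tokens[0] != "*":
        return False
    tok = tokens[1]
    capture = is_name(tok) and tok != "_"
    wildcard = tok == "_"
    return capture or wildcard


def star_pattern_simplified(tokens):
    # Shape from the thread: '*' NAME, with no predicate on the name.
    return len(tokens) == 2 and tokens[0] == "*" and is_name(tokens[1])


SAMPLES = [["*", "_"], ["*", "rest"], ["*", "if"], ["*", "1"], ["rest"], ["*"]]
AGREE = all(
    star_pattern_two_alternatives(s) == star_pattern_simplified(s) for s in SAMPLES
)

if __name__ == "__main__":
    print("both formulations agree on the samples:", AGREE)
```

Under that assumption the two alternatives only partition NAME into "_" and everything else, so merging them removes the predicate from this rule without changing what is accepted.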

I'll also look into your further simplifications of this rule. Unfortunately, semantic predicates can cause many problems. I will try to remove the predicate from the parser grammar and handle it in the lexer grammar in some way.

I am also working on the "escape quotes character in f-string" error: f"\"", f'\''

RobEin commented 1 month ago

Looks like I managed to find and fix the "escape quotes character in f-string" bug. I'll test it for a few more days.

In the new PR, please change only the things in PythonParser.g4 that relate to the "soft_kw__not__wildcard failed predicate" error. I did not use token names in the parser grammar so that it resembles the official python.peg grammar as much as possible, and I think the grammar is also more readable that way. Please base the new PR on the latest grammars-v4 repo.

Take PythonLexerBase.cpp out of the current PR for the time being, both because you haven't tested it and because the Cpp runtime must be completed first. For example, the following methods are missing from Lexer.cpp:

Once you have the runtime modification, you can test PythonLexerBase.cpp. And you can only make a PR about PythonLexerBase.cpp if the modified Cpp runtime has already been published. Please let me know if you can do it.

Willie169 commented 4 weeks ago

I will reduce this PR to only the star_pattern change and remove everything else for now. I will work on the Cpp port after the exams if you haven't started on it by then. Thanks. Btw, the PR will be part of my Learning Portfolio, a mandatory, graded submission that is handed in to universities when applying here.

teverett commented 2 weeks ago

@Willie169 thanks!