neogeny / TatSu

竜 TatSu generates Python parsers from grammars in a variation of EBNF
https://tatsu.readthedocs.io/
Other
408 stars 48 forks source link

Generated `main()` passes the wrong arguments to `parser.parse()` #263

Closed dnicolodi closed 2 years ago

dnicolodi commented 2 years ago

The main() of a generated parser calls the parser parse() method like this:

                    return parser.parse(
                        text,
                        rule_name=start,
                        filename=filename,
                        **kwargs
                    )

however, the parse() method, however the right parameter name is start (or start_rule, one is kept for backward compatibility, but I don't know which one) and not rule_name.

This breaks using the generated parser module as a script, at least in TatSu 5.7.3, but it seems that the current git has the same problem.

In general, it would be nice if the parameter names were validated somewhere.

apalala commented 2 years ago

I think this issue is fixed on the master branch. Could you check?

dnicolodi commented 2 years ago

The code in master does not seem different. This is what is generated parser.parse https://github.com/neogeny/TatSu/blob/master/tatsu/codegen/python.py#L587 and after a track load of indirection, the parameter is passed to ParserConfig which does not know anything about a rule_name parameter https://github.com/neogeny/TatSu/blob/master/tatsu/infos.py#L23. Alternatively, just grep for rule_name in the code and you will see that there is just one occurrence of it.

apalala commented 2 years ago

There are mentions of rule_name all over the place, including the docs and tests.

I think that the solution is to change everything to start, and add make PaserConfig recognize rule_name for backwards compatibility.

dnicolodi commented 2 years ago

Sorry, I was looking at the sources checked out a the v5.7.3 tag... The bug is indeed not there on git master. Given that TatSu 5.8 and later do not support Python releases older than 3.10, will the 5.7 branch see bugfix releases?

apalala commented 2 years ago

The 5.7 branch may have bugfix releases if someone posts the pull request :-)

Yet Python 3.9 just reached its end of bugfix cycle. Why not use Python 3.10?

As main maintainer for TatSu, and with Python 3.11 being published in October this year, I can't find enough time to maintain compatibility with deprecated versions of Python.

The bug is there in the current master, but it's mostly a deprecation bug:

$ ag -w rule_name
test/parsing_test.py
76:        ast = tatsu.parse(grammar, "test", rule_name='start')
87:        ref_lowercase_result = tatsu.parse(grammar.format(rulename='reflowercase'), test_string, rule_name='start')
88:        ref_uppercase_result = tatsu.parse(grammar.format(rulename='Refuppercase'), test_string, rule_name='start')
90:            result = tatsu.parse(grammar.format(rulename=rulename), test_string, rule_name='start')
93:            result = tatsu.parse(grammar.format(rulename=rulename), test_string, rule_name='start')

tatsu/bootstrap.py
1437:        rule_name=start,

tatsu/codegen/python.py
587:                        rule_name=start,

docs/use.rst
248:    ast = parser.parse('text to parse', rule_name='start')
261:    model = parser.parse(text, rule_name='start', semantics=MySemantics())
273:    model = parser.parse(tokenizer, rule_name='start', semantics=MySemantics())
(tatsu) [master]
dnicolodi commented 2 years ago

Far from me to question your decisions or ask you to put in extra work, but the most relevant Python libraries aim at supporting all security supported Python releases. In this sense, Python 3.7 is supported till June next year and is far from deprecated. I follow the same principle for my "serious" projects. Supporting the Python version contained in the latest Debian stable and the last two Ubuntu LTS releases seems reasonable.

dnicolodi commented 2 years ago

The bug is there in the current master, but it's mostly a deprecation bug:

Do you mean that the remaining mentions of rule_name should be removed? I can prepare a patch for this.

dnicolodi commented 2 years ago

By the way, which features supported by Python 3.10 only are used in the codebase and where? I just ran the tests on master with Python 3.9 and they all pass.

apalala commented 2 years ago

Python 3.9 just had its last bugfix release.

I don't think that there is any >=3.10 only code in TatSu, but it makes sense, as a maintainer, to synchronize with the Python release cycle.

Come October this year, TatSu should be free to use all Python 3.11 features, so the docs should state that compatibility with previous versions of Python is not maintained.

dnicolodi commented 2 years ago

Your project and your rules, but requiring python 3.10 when there is no actual dependency to any python 3.10 feature seems to put unnecessary requirements to the users of otherwise a quite fine tool. Also, your definition of currently supported Python versions is different from the one used by most widely used python libraries.

apalala commented 2 years ago

Python 3.9 Release Schedule:

https://peps.python.org/pep-0596/

dnicolodi commented 2 years ago

Where does it say that Python 3.9 is deprecated? It says that security support is provided till October 2025. However, it is evident that you have your opinion on the support status of Python releases and it will not change. Maybe we can approach this from another perspective: what problem did it solve to restrict the Python version supported by TatSu to Python 3.10 only?

dnicolodi commented 2 years ago

The bug is there in the current master, but it's mostly a deprecation bug:

Do you mean that the remaining mentions of rule_name should be removed? I can prepare a patch for this.

I would also like to go back to the topic of this ticket. What should be the solution for the start vs rule_name argument name inconsistency?

apalala commented 2 years ago

We should settle for start and leave backwards compatibility for rule_name in PaserConfig.

apalala commented 2 years ago

Because most software written under Python 3.7+ will run unaltered on Python 3.10, there are almost no reasons for libraries to maintain backwards-compatibility by avoiding mostly the new, interesting and useful syntax and features.

Since Python.org decided on the yearly release schedule, even OS releases have been working to keep up.

As main maintainer for TatSu I don't want to forgo the new language and library features. Yet, I will release branch versions for backwards compatibility if the pull requests are provided.

dnicolodi commented 2 years ago

Please have a look at #273 for fixes around the handling of start vs rule_name and to #270 for a first contribution toward supporting more Python versions.

dnicolodi commented 2 years ago

Fixed by #273