pygettext: use an AST parser instead of a tokenizer

python / cpython


pygettext: use an AST parser instead of a tokenizer #104400

Open tomasr8 opened 1 year ago

tomasr8 commented 1 year ago

Follow-up to this forum discussion.

This is part 1/X of improving pygettext. Replacing the tokenizer that powers message extraction with an AST parser will simplify the code (no more counting brackets or f-string madness) and make it much easier to add new features down the road.
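To illustrate the idea, here is a minimal sketch of AST-based extraction. It is not the code from the linked PRs: the `KEYWORDS` set, the `MessageVisitor` class, and the `extract_messages` helper are hypothetical names made up for this example, and it only handles a literal string as the first argument.

```python
import ast

# Hypothetical keyword set for this sketch; the real tool's defaults may differ.
KEYWORDS = {"_", "gettext"}


class MessageVisitor(ast.NodeVisitor):
    """Collect (lineno, msgid) pairs from calls such as _("...")."""

    def __init__(self):
        self.messages = []

    def visit_Call(self, node):
        func = node.func
        # Accept both plain names (_("x")) and attributes (t.gettext("x")).
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = None

        if name in KEYWORDS and node.args:
            first = node.args[0]
            # Only literal strings are extractable; variables are skipped.
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                self.messages.append((node.lineno, first.value))

        # Keep walking so nested calls (including those inside f-strings)
        # are found without any manual bracket counting.
        self.generic_visit(node)


def extract_messages(source):
    visitor = MessageVisitor()
    visitor.visit(ast.parse(source))
    return visitor.messages


if __name__ == "__main__":
    sample = (
        'print(_("Hello"))\n'
        "x = f\"{_('nested')}\"\n"
        "y = _(name)\n"
        'z = _("implicit " "concatenation")\n'
    )
    for lineno, msgid in extract_messages(sample):
        print(lineno, repr(msgid))
```

The parser already joins adjacent string literals and exposes calls nested inside f-strings as ordinary `ast.Call` nodes, which is where most of the simplification would come from.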

This change should also come with a healthy dose of new tests to verify the implementation.

PR coming shortly ;)

Linked PRs

gh-104402
gh-108173

terryjreedy commented 1 year ago

We replaced the custom parser in pyclbr with an ast visitor a couple of years ago. The result was shorter and clearer, and it was agreed to be a definite improvement. For this type of application, any possible slowdown is irrelevant.
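For readers unfamiliar with the pattern, a rough sketch of what an ast visitor for a class browser can look like is shown below. This is not the actual pyclbr code; `OutlineVisitor` and `outline` are hypothetical names used only for illustration.

```python
import ast


class OutlineVisitor(ast.NodeVisitor):
    """Record classes and functions with their dotted names and line numbers."""

    def __init__(self):
        self.stack = []    # names of the enclosing classes/functions
        self.outline = []  # (kind, qualified name, lineno)

    def _record(self, kind, node):
        qualname = ".".join(self.stack + [node.name])
        self.outline.append((kind, qualname, node.lineno))
        # Descend with the current name on the stack to qualify nested items.
        self.stack.append(node.name)
        self.generic_visit(node)
        self.stack.pop()

    def visit_ClassDef(self, node):
        self._record("class", node)

    def visit_FunctionDef(self, node):
        self._record("def", node)

    # Treat "async def" the same way as "def".
    visit_AsyncFunctionDef = visit_FunctionDef


def outline(source):
    visitor = OutlineVisitor()
    visitor.visit(ast.parse(source))
    return visitor.outline


if __name__ == "__main__":
    code = "class A:\n    def f(self):\n        pass\n\ndef g():\n    pass\n"
    for entry in outline(code):
        print(entry)
```

The nesting is tracked by the visitor's own stack, so there is no hand-written indentation or token bookkeeping to maintain.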