python-babel / babel

The official repository for Babel, the Python Internationalization Library
http://babel.pocoo.org/
BSD 3-Clause "New" or "Revised" License

Check pofile string delimiters #1151

Open rtobar opened 2 weeks ago

rtobar commented 2 weeks ago

This PR adds checks to the pofile parser code to validate that message strings are correctly delimited by double quotes. In keeping with the current design, an error is raised only if requested; otherwise a warning is printed, the faulty lines are corrected, and parsing continues.
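For illustration only, here is a minimal sketch of the behaviour described above. It is not the code in this PR: `check_delimiters` and `DelimiterError` are hypothetical names made up for the example, and the `abort_invalid` flag is simply named after the existing `read_po` option.

```python
import warnings


class DelimiterError(ValueError):
    """Hypothetical error type for this sketch; not part of Babel's API."""


def _ends_with_unescaped_quote(text: str) -> bool:
    # A lone '"' cannot be both the opening and the closing delimiter.
    if len(text) < 2 or not text.endswith('"'):
        return False
    body = text[:-1]
    # An odd number of trailing backslashes means the final quote is escaped.
    trailing_backslashes = len(body) - len(body.rstrip("\\"))
    return trailing_backslashes % 2 == 0


def check_delimiters(line: str, lineno: int, abort_invalid: bool = False) -> str:
    """Return the stripped line with both double-quote delimiters present.

    Raise DelimiterError if abort_invalid is true; otherwise warn and
    return a corrected line so that parsing can go on.
    """
    text = line.strip()
    has_open = text.startswith('"')
    has_close = _ends_with_unescaped_quote(text)
    if has_open and has_close:
        return text

    message = f"line {lineno}: string is not delimited by double quotes: {text!r}"
    if abort_invalid:
        raise DelimiterError(message)

    warnings.warn(message)
    # Add only the delimiter(s) that are actually missing.
    return ("" if has_open else '"') + text + ("" if has_close else '"')
```

With the default `abort_invalid=False`, a line such as `"hello` triggers a warning and is returned with the missing closing quote added; with `abort_invalid=True` the same line raises instead.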

I found this issue while processing a pofile used in the Spanish translation of the CPython documentation. One of our files was incorrectly written, and of all our tooling, only the msgcat tool from GNU's gettext package complained; babel, polib and others didn't. See https://github.com/python/python-docs-es/pull/2873, https://github.com/izimobil/polib/pull/161 and https://git.afpy.org/AFPy/powrap/pulls/4 for further reference.

While implementing this change I found that the _NormalizedString class was not only used to hold message lines, but also participated in the parsing process (and hid some of the parsing as well). I thus broke down my changes into three separate commits:

Along the way I also implemented three small quality-of-life changes. They are included as the first three commits of this PR; I'm happy to submit them separately if required:

rtobar commented 1 week ago

Gentle ping, at least to kick off CI and check if there are any obvious mistakes to be fixed.