Open savannahostrowski opened 2 months ago
Continuing the discussion here since the other issue was closed.
Quoting Serhiy:
I thought that we should run the parser with different options to produce different error messages with installed special translator that transforms the original text in simply recognizable way. But there may be better ways.
I was originally thinking the same, but to me that seems like a lot of effort compared to the benefit we'd get. Instead, I think that using a message extractor (e.g. pygettext) we can assert both the presence of a specific message and the line number it appears at. This should give us pretty much the same confidence as directly checking the argparse output but with an order of magnitude less work. What do you think?
and the line number it appears at
Would not this make the tests extremely fragile?
Would not this make the tests extremely fragile?
Does it matter that much if you don't need to fix the tests manually? You'd have a snapshot of the PO file and a simple command to regenerate it so all you'd need to do is double-check the diff
You would need to regenerate the snapshot after every change in argparse
(and it was changed a lot in last few days), and the diff would be so large that it would be hard to notice a real regression in the noise.
But this is a good first step. Before #27667 was merged, Tools/i18n/pygettext.py
produced a warning for _()
used with non-literals. They should be treated as errors in the test. Also, line numbers should be ignored in comparison and only used to report errors. We should just collect a set of translated strings, compare it with the expected result and report about missing or additional strings.
True, if the module is changing a lot, so would the snapshots. I'll try to come up with a PR based on your suggestions :)
We should also add translation tests for getopt
and optparse
, and maybe for other modules (this is separate issues). I am not a fan of converting many tests into packages just to add a data file. This may break history, add conflicts with existing PRs and make future work with tests less convenient. Maybe add a common directory for translation data?
We should also add translation tests for getopt and optparse, and maybe for other modules (this is separate issues)
Agreed, I can take care of that once we're done with argparse.
I am not a fan of converting many tests into packages just to add a data file. This may break history, add conflicts with existing PRs and make future work with tests less convenient. Maybe add a common directory for translation data?
A common folder for all translation data sounds like a good idea!
@serhiy-storchaka The PR also separates the translation tests into its own files which necessitates turning argparse tests into a package. Would you prefer that I merge the translation tests with test_argparse
to avoid that?
Yes, of course.
Thank you for your contribution @tomasr8. If you wish, you can create new issues and PRs for getopt
and optparse
, or just PRs.
I'll look into that :)
It does not catch ngettext
strings.
IIRC pygettext currently does not know how to handle *gettext()
functions which take more than one argument.
There is a way to specify additional keywords via --keyword=XXX
but I don't see any way to tell it how many arguments
it takes.
For example invoking
./Tools/i18n/pygettext.py -o messages.pot --keyword=ngettext Lib/argparse.py
Produces several warnings (and does not extract any ngettext strings):
*** Lib/argparse.py:1639: Seen unexpected token ","
*** Lib/argparse.py:2326: Seen unexpected token ","
We could fix pygettext to allow functions like ngettext. We could probably use the same syntax that GNU's xgettext and pybabel use to specify keywords and their arguments: https://www.gnu.org/software/gettext/manual/html_node/xgettext-Invocation.html#index-_002dk_002c-xgettext-option
For beginning, it should support all gettext
functions by default. Later we can make this customizable.
While #12711 was merged, it seems like there's a separate issue worth tracking around adding translation tests for argparse (see https://github.com/python/cpython/pull/12711#issuecomment-1964754284)
Linked PRs