izimobil / polib

Pure python library to manipulate, create, modify gettext files (pot, po and mo files).
MIT License
100 stars 28 forks

Please add support for extracted comments #146

Closed (arekm closed this issue 12 months ago)

arekm commented 12 months ago

There is support for translator comments (tcomment), but I do not see support for extracted comments.

According to https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html, comments starting with "#." are extracted comments.

I am not sure whether some nice structure (a list, for example) could be provided for these when an entry has several of them.

(One usage scenario for extracted comments is to use polib to validate translations against values stored in those comments, such as a maximum allowed translation length.)

Example:

% cat b.c
int main() {
    // TR_MAX_LENGTH: 30
    printf(_("something"));
}
% xgettext --keyword=_ --add-comments=TR_MAX_LENGTH b.c -o x.po
% tail -n 7 x.po
"Content-Transfer-Encoding: 8bit\n"

#. TR_MAX_LENGTH: 30
#: b.c:3
#, c-format
msgid "something"
msgstr ""
# check to see if extracted comments are accessible as tcomment
% python3 -c "import polib; p = polib.pofile('x.po'); print(p[0].msgid); print(p[0].tcomment)"
something

%
# nope

Another example:

% cat b.c
int main() {
    // TR_MAX_LENGTH: 30
    // TR_MAX_LENGTH_BLEBLE costam
    printf(_("something"));
}

$ xgettext --keyword=_ --add-comments=TR_MAX_LENGTH b.c -o x.po
...
izimobil commented 12 months ago

Extracted comments are already supported via the comment property of POEntry. With your code example:

$ python3 -c "import polib; p = polib.pofile('x.po'); print(p[0].msgid); print(p[0].comment)"
something
TR_MAX_LENGTH: 30
arekm commented 12 months ago

Oh, nice!

(It seems to be a single concatenated string of all the extracted comments, so there is no nice structure, but that is good enough.)

izimobil commented 12 months ago

You can just split the string; something like this would do the trick:

comments = entry.comment.split("\n") if entry.comment else []
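
For the validation scenario mentioned earlier in the thread, the split can be combined with a small parser. This is only a rough sketch, not part of polib: the helper names `max_length` and `too_long` are made up for illustration, the `TR_MAX_LENGTH: N` format is assumed from the example above, and `SimpleNamespace` objects stand in for real entries so the snippet runs without a .po file.

```python
import re
from types import SimpleNamespace

# Assumed comment format, taken from the example in this thread.
LIMIT_RE = re.compile(r"^TR_MAX_LENGTH:\s*(\d+)\s*$")


def max_length(comment):
    """Return the limit from a 'TR_MAX_LENGTH: N' line, or None if absent."""
    lines = comment.split("\n") if comment else []
    for line in lines:
        match = LIMIT_RE.match(line.strip())
        if match:
            return int(match.group(1))
    return None


def too_long(entry):
    """True if entry.msgstr exceeds the limit found in entry.comment."""
    limit = max_length(entry.comment)
    return limit is not None and len(entry.msgstr) > limit


# Stand-ins for POEntry objects; only .comment and .msgstr are read.
ok = SimpleNamespace(comment="TR_MAX_LENGTH: 30", msgstr="something else")
bad = SimpleNamespace(comment="TR_MAX_LENGTH: 5\nother note", msgstr="too long text")
print(too_long(ok), too_long(bad))
```

With a real file, something like `[e for e in polib.pofile('x.po') if too_long(e)]` should collect the offending entries, since the helpers only read `comment` and `msgstr`.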
arekm commented 12 months ago

Yeah, just noticed that xgettext concatenates multiline comments when extracting, so that should work reliably even for such comments. Thanks.