izimobil / polib

Pure python library to manipulate, create, modify gettext files (pot, po and mo files).
MIT License

List of occurrences wrapped differently than xgettext #137

Open gamboz opened 1 year ago

gamboz commented 1 year ago

I think that xgettext --no-wrap does not apply to the list of occurrences, so polib and xgettext will format the #: lines differently.

If you have a structure like the following:

.
├── f
│   └── c.py
├── folder_medium_length
│   └── b.py
├── folder_name_with_a_very_long_name_and_in_any_case_longer_than_seventy_eight_characters
│   └── a.py

where all the files contain the same string to translate, polib with wrapwidth=0 gives something like

#: /tmp/x/folder_name_with_a_very_long_name_and_in_any_case_longer_than_seventy_eight_characters/a.py:4 /tmp/x/folder_medium_length/b.py:4 /tmp/x/f/c.py:4

while xgettext --no-wrap gives something like the following (which was somewhat unexpected for me :)

#: /tmp/x/folder_name_with_a_very_long_name_and_in_any_case_longer_than_seventy_eight_characters/a.py:4
#: /tmp/x/folder_medium_length/b.py:4
#: /tmp/x/f/c.py:4

I think this can be changed by editing polib.py l.1037 from

            if wrapwidth > 0 and len(filestr) + 3 > wrapwidth:

to

            if wrapwidth > 1 and len(filestr) + 3 > wrapwidth:

I'm not yet opening a PR, because these are my first steps in the "translations" world and I might be making stupid mistakes (so please be patient :slightly_smiling_face: )
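
For reference, the three-line layout shown in the xgettext output above can be sketched as a tiny formatter that emits each occurrence on its own `#:` line. This is only an illustration of that layout, not xgettext's actual algorithm and not polib's code: the function name `format_occurrences_one_per_line` is hypothetical, and the input is assumed to be a list of `(path, lineno)` pairs like the ones polib keeps for an entry.

```python
def format_occurrences_one_per_line(occurrences):
    """Return one '#: path:line' comment line per occurrence.

    Hypothetical helper: `occurrences` is assumed to be a list of
    (path, lineno) pairs; it is not part of polib's API.
    """
    lines = []
    for path, lineno in occurrences:
        # Assumption: an occurrence with no line number is written as the bare path.
        ref = f"{path}:{lineno}" if lineno else path
        lines.append("#: " + ref)
    return lines


if __name__ == "__main__":
    long_dir = ("folder_name_with_a_very_long_name_and_in_any_case_"
                "longer_than_seventy_eight_characters")
    occurrences = [
        (f"/tmp/x/{long_dir}/a.py", "4"),
        ("/tmp/x/folder_medium_length/b.py", "4"),
        ("/tmp/x/f/c.py", "4"),
    ]
    print("\n".join(format_occurrences_one_per_line(occurrences)))
```

No width is involved here at all, so short references are never merged onto a shared line.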

armisael commented 3 weeks ago

Let me add a bit more information: we use Django, which uses xgettext under the hood. I tried the fix suggested by @gamboz, but it didn't work. It is correct that xgettext still wraps the reference "comments" (the original string locations) even with --no-wrap, but what worked for us was not changing the condition to wrapwidth > 1 but hardcoding a width of 79 characters in polib.py#L1037:

1037,14c1,11
<             if wrapwidth > 0 and len(filestr) + 3 > wrapwidth:
<                 # textwrap split words that contain hyphen, this is not
<                 # what we want for filenames, so the dirty hack is to
<                 # temporally replace hyphens with a char that a file cannot
<                 # contain, like "*"
<                 ret += [line.replace('*', '-') for line in textwrap.wrap(
<                     filestr.replace('-', '*'),
<                     wrapwidth,
<                     initial_indent='#: ',
<                     subsequent_indent='#: ',
<                     break_long_words=False
<                 )]
<             else:
<                 ret.append('#: ' + filestr)
---
>             # textwrap split words that contain hyphen, this is not
>             # what we want for filenames, so the dirty hack is to
>             # temporally replace hyphens with a char that a file cannot
>             # contain, like "*"
>             ret += [line.replace('*', '-') for line in textwrap.wrap(
>                 filestr.replace('-', '*'),
>                 width=79,
>                 initial_indent='#: ',
>                 subsequent_indent='#: ',
>                 break_long_words=False
>             )]

I'm not sure this works in all use cases, though; with Django it worked.
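
To try the fixed-width variant outside polib, the wrapping call from the diff above can be pulled into a standalone function. This is a minimal sketch using only the standard library `textwrap` module. The name `wrap_occurrences` and the `width` parameter are hypothetical, and `filestr` is assumed to be the space-joined list of `path:line` references, as in the quoted polib code.

```python
import textwrap


def wrap_occurrences(filestr, width=79):
    """Wrap a space-joined occurrence string into '#: ' comment lines.

    Hypothetical standalone version of the patched block above.
    """
    # textwrap may split words at hyphens, which is unwanted for file
    # names, so hyphens are swapped for '*' before wrapping and restored
    # afterwards. Caveat: any literal '*' in a path also becomes '-'.
    wrapped = textwrap.wrap(
        filestr.replace("-", "*"),
        width=width,
        initial_indent="#: ",
        subsequent_indent="#: ",
        break_long_words=False,  # an over-long path stays whole on its own line
    )
    return [line.replace("*", "-") for line in wrapped]


if __name__ == "__main__":
    long_dir = ("folder_name_with_a_very_long_name_and_in_any_case_"
                "longer_than_seventy_eight_characters")
    filestr = (f"/tmp/x/{long_dir}/a.py:4 "
               "/tmp/x/folder_medium_length/b.py:4 /tmp/x/f/c.py:4")
    print("\n".join(wrap_occurrences(filestr)))
```

On the example paths from the first comment, this sketch yields two `#:` lines rather than three: the long path gets its own line, while the two shorter references fit together within 79 columns and share the second line. So a fixed width of 79 may not reproduce the one-reference-per-line output shown earlier in every case.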