Open armijnhemel opened 3 months ago
When fixing this, please keep in mind that `:` can appear in a file name as well. An easy test case: I moved `lineedit.c` to `lineedit:834.c` and then reran `xgettext`:
#: lineedit:834.c:834 lineedit:834.c:890 lineedit:834.c:893
msgid "."
msgstr ""
so just splitting on `:` might not be the right approach.
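One possible way to handle this, as a rough sketch rather than a drop-in fix: split the `#:` line on whitespace first to get the individual references, then split each reference on its last `:` only. The function name `parse_reference_line` is made up for illustration and is not part of the existing code. It also assumes that file names contain no whitespace, which would need separate handling.

```python
def parse_reference_line(line):
    """Parse one PO reference comment such as "#: file.c:12 file.c:34".

    Returns a list of (file_name, line_number) tuples. Hypothetical helper,
    assuming file names contain no whitespace.
    """
    prefix = "#:"
    if not line.startswith(prefix):
        return []

    references = []
    # A single comment line may carry several whitespace-separated references.
    for entry in line[len(prefix):].split():
        # Split on the LAST colon only, so colons inside the file name survive.
        file_name, sep, line_number = entry.rpartition(":")
        if sep and line_number.isdecimal():
            references.append((file_name, int(line_number)))
        else:
            # No trailing line number: keep the whole entry as the file name.
            references.append((entry, None))
    return references


print(parse_reference_line(
    "#: lineedit:834.c:834 lineedit:834.c:890 lineedit:834.c:893"
))
```

With the test case above this yields three references, each with `lineedit:834.c` as the file name, instead of one truncated entry.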
Another option would be to use the `--strict` option, but that would require a (slight) rewrite of the code, plus it is discouraged:
Write out a strict Uniforum conforming PO file. Note that this Uniforum format should be avoided because it doesn’t support the GNU extensions.
Note: if the goal is only to report each string found in a source code file, and you don't necessarily need to report duplicates, then the current code is of course completely fine.
In the current xgettext implementation I can see at line 115 https://github.com/nexB/source-inspector/blob/9511f56b44ac7c5644b34d413146d58dd9fa7ea0/src/source_inpector/strings_xgettext.py#L115 the following:
This is likely leading to the wrong results, as a line can have multiple instances of `start_line`, which you aren't catching. As an example, I used `xgettext` with the same parameters as you did on `libbb/lineedit.c` from BusyBox:
Some of the result lines:
As you can see there are multiple file/line number entries there. It seems that at some point the authors of `xgettext` decided to combine these. Your code does not correctly process these lines: