smhg / gettext-parser

Parse and compile gettext po and mo files, nothing more, nothing less
MIT License

Sorting of entries with the same msgid isn't consistent #41

Closed probertson closed 6 years ago

probertson commented 6 years ago

A project I'm working on has translation messages that share the same msgid but have different msgctxt values. We use a script that crawls our code and extracts the strings, then uses this library's po.compile() to turn them into PO file content before writing them to disk. We pass the sortByMsgid: true option to minimize the changes to the PO files between runs.

Apparently the way our script reads the files and constructs objects to make the PO file isn't consistent, because I've noticed that when there are pairs of strings with the same msgid but different msgctxt values, sometimes the order of the two entries changes in the PO file, even though those strings weren't modified in the source.
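To illustrate the behavior described above, here is a minimal sketch, not the library's actual code. It assumes a hypothetical comparator, compareMsgidOnly, that looks only at msgid. Because two entries with the same msgid compare as equal, and Array.prototype.sort is stable (as of ES2019), their output order simply follows whatever order the extraction script produced.

```javascript
// Hypothetical comparator: compares msgid only, so entries that share a
// msgid are treated as equal regardless of their msgctxt.
function compareMsgidOnly(left, right) {
  if (left.msgid < right.msgid) return -1;
  if (left.msgid > right.msgid) return 1;
  return 0;
}

// Two illustrative entries with the same msgid but different contexts.
const verb = { msgctxt: 'verb', msgid: 'Open' };
const adjective = { msgctxt: 'adjective', msgid: 'Open' };

// The same entries, collected in a different order on two runs.
const runA = [verb, adjective].sort(compareMsgidOnly).map((e) => e.msgctxt);
const runB = [adjective, verb].sort(compareMsgidOnly).map((e) => e.msgctxt);

console.log(runA); // [ 'verb', 'adjective' ]
console.log(runB); // [ 'adjective', 'verb' ]
```

The two runs contain identical entries yet produce different orderings, which is the kind of spurious PO diff reported here.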

This PR provides a solution to this issue (and accompanying test) by modifying the sorting algorithm to first sort by msgid, and then, if the msgids are equal, to sort by msgctxt. In addition to the test I added in this PR, I have tested this change with the project I mentioned, and it fixes the issue of "moving msgctxt values".
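The approach described above (sort by msgid, then break ties by msgctxt) can be sketched as follows. This is an illustrative comparator under stated assumptions, not the code from the PR: the function name compareMsgidThenMsgctxt and the sample entries are hypothetical, and a missing msgctxt is assumed to behave like an empty string.

```javascript
// Hypothetical comparator: order by msgid first, then by msgctxt when
// the msgids are equal. A missing value is treated as an empty string.
function compareMsgidThenMsgctxt(left, right) {
  const leftId = left.msgid || '';
  const rightId = right.msgid || '';
  if (leftId < rightId) return -1;
  if (leftId > rightId) return 1;

  const leftCtx = left.msgctxt || '';
  const rightCtx = right.msgctxt || '';
  if (leftCtx < rightCtx) return -1;
  if (leftCtx > rightCtx) return 1;
  return 0;
}

// Illustrative entries; two of them share the msgid 'Open'.
const entries = [
  { msgctxt: 'verb', msgid: 'Open' },
  { msgctxt: 'adjective', msgid: 'Open' },
  { msgid: 'Close' },
];

// Copy before sorting so the input array is left untouched.
const sorted = entries.slice().sort(compareMsgidThenMsgctxt);

console.log(sorted.map((e) => e.msgctxt || '(none)'));
// [ '(none)', 'adjective', 'verb' ]
```

With the tie-breaker in place, the result no longer depends on the order in which the entries were collected.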

probertson commented 6 years ago

After I submitted this, it occurred to me that this is potentially a breaking change. I can't think of a reason someone would rely on the existing behavior, but I can't rule it out, obviously.

As an alternative, I could implement this in a way that is backwards compatible by adding a new option instead of reusing the sortByMsgid option. I've thought of a couple of ways this could be done:

Let me know if this is a concern, and if you have a preference for any of these alternatives.

Thanks!

smhg commented 6 years ago

This looks really useful. First of all: we'll release this as a major version so you can break things.

My preference is the sort function parameter. 2 concerns:

It'd be great if you could have a look at this. Otherwise I'll investigate in a few days.

probertson commented 6 years ago

Thanks for your feedback. This sounds great -- I believe I'll have time to take a look in the next few days.

probertson commented 6 years ago

I looked into the sorting behavior of the GNU xgettext command (and actually it's the same for all the GNU gettext commands).

So as it exists now, I'm guessing that sortByMsgid is similar to --sort-output.

probertson commented 6 years ago

@smhg: I pushed up an update to the code based on your suggestions. Let me know if I misunderstood, or if you prefer something else.

The changes are:

I also updated the README and tests to match.
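For readers following along, a "sort function parameter" of the kind preferred earlier in this thread might behave roughly like the sketch below. Everything here is an assumption for illustration: sortEntries, defaultCompare, and the parameter name sort are hypothetical and are not taken from the library's source. The idea is that a truthy non-function value selects a default msgid comparison, while a function is used as the comparator directly.

```javascript
// Hypothetical default comparison: by msgid only.
function defaultCompare(left, right) {
  if (left.msgid < right.msgid) return -1;
  if (left.msgid > right.msgid) return 1;
  return 0;
}

// Hypothetical helper: `sort` may be falsy (keep input order), true
// (use the default comparison), or a caller-supplied comparator.
function sortEntries(entries, sort) {
  const copy = entries.slice(); // never mutate the caller's array
  if (!sort) return copy;
  const comparator = typeof sort === 'function' ? sort : defaultCompare;
  return copy.sort(comparator);
}

// Illustrative data.
const sample = [
  { msgid: 'b', msgctxt: 'x' },
  { msgid: 'a', msgctxt: 'y' },
];

// Example of a caller-supplied comparator that orders by msgctxt.
const byContext = (left, right) =>
  left.msgctxt < right.msgctxt ? -1 : left.msgctxt > right.msgctxt ? 1 : 0;

const unsorted = sortEntries(sample, false).map((e) => e.msgid);
const byDefault = sortEntries(sample, true).map((e) => e.msgid);
const byCustom = sortEntries(sample, byContext).map((e) => e.msgid);

console.log(unsorted, byDefault, byCustom);
```

Accepting a comparator leaves the ordering policy to the caller, so a project that needs msgctxt tie-breaking (or any other order) can supply it without a dedicated option for each case.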

smhg commented 6 years ago

This is an awesome contribution in every aspect. Thank you so much!

smhg commented 6 years ago

Released as 2.0.0.

probertson commented 6 years ago

Thanks @smhg!