mezis / git-whistles

A set of bells and whistles for your Git.

git-merge-po hard to understand #42

Open rmoehn opened 9 years ago

rmoehn commented 9 years ago

I'm trying to modify git-merge-po so that it accepts an option like -Xours. However, I have a hard time understanding how it works, not least because the documentation of the gettext tools is rather thin and I don't know what really happens when I run a command. So it would be very nice if you could clarify a bit how and why git-merge-po works.

In particular, I encountered one problem: according to the comment, ${TEMP}.local-only should only contain messages that were changed only on local. However, look at the last message:

#, fuzzy
msgid ""
msgstr ""
"#-#-#-#-#  temp.local-changes (Local header)  #-#-#-#-#\n"
"Project-Id-Version: Local header\n"
"Content-Type: text/plain; charset=UTF-8\n"
"#-#-#-#-#  temp.conflicts (Remote header)  #-#-#-#-#\n"
"#-#-#-#-#  temp.remote-changes (Remote header)  #-#-#-#-#\n"
"Project-Id-Version: Remote header\n"
"Content-Type: text/plain; charset=UTF-8\n"
"#-#-#-#-#  temp.local-changes (Local header)  #-#-#-#-#\n"
"Project-Id-Version: Local header\n"
"Content-Type: text/plain; charset=UTF-8\n"

msgid "This little piggie is removed from remote, changed on local"
msgstr "3.local"

# msgid "This little piggie is removed from local, unchanged on remote"
# msgstr "4"
# msgid "This little piggie is removed from local, changed on remote"
# msgstr "5"
msgid "This little piggie is changed on local, unchanged on remote"
msgstr "6.local"

msgid "This little piggie is changed on remote and local"
msgstr "8.local"

mezis commented 9 years ago

Hi @rmoehn! The Gettext executables are fairly thoroughly documented by their manpages, for instance msgcat(1). GNU also has a description of the PO file format which might help.

> So it would be very nice if you could clarify a bit how and why git-merge-po works

The idea behind git-merge-po is to treat PO files as "binaries" and to manipulate them only with Gettext utilities. This is because the order of entries in those files, or the comments for instance, can change dramatically between branches even when the actual (semantic) contents haven't.

This is possible because (admittedly with some juggling!) the various utilities (msgcat, msgmerge) can be used to separate, split and reassemble the messages changed on each branch.

> However, look at the last message

I'm assuming this is the temporary file output by the test suite; it is strange indeed, and not what I'd expect. I'd have to refresh my memory (just back from holidays). Can you confirm the test suite passes for you?

> I'm trying to modify git-merge-po so that it accepts an option like -Xours

An option could be to add conditionals around "the big merge" step, as -Xours (or -Xtheirs) would only need that step changed.

Something like this should work, but I'll let you get familiar with the tools and amend the test suite accordingly:

# when -Xours (in case of conflict, use the local branch messages)
m_msgcat --use-first -o ${TEMP}.merge1 ${TEMP}.unchanged ${TEMP}.local-changes ${TEMP}.remote-only

rmoehn commented 9 years ago

The test suite passes for me. I just checked again that the strange thing still occurs by resetting to master, inserting cat ${TEMP}.local-only >> /tmp/lo.po and running rake2.1 spec again.

For -Xours I tried something similar to what you're suggesting, but was a bit impatient with the tools. I mean, I did read the manpages, but already the first sentence of the description is messed up: »The msgcat program concatenates and merges the specified PO files.« Concatenating and merging are two quite different activities, I thought. If the manpage explained afterwards how this seemingly contradictory statement makes sense in the context of PO files, it would be okay. But the next sentence makes it worse. So msgcat concatenates and merges PO files as well as finds messages in them? Maybe it even rewrites every occurrence of Bielefeld with Nowhere. Erm, but sorry for ranting in this place. I guess you're not involved in msgcat development.

mezis commented 9 years ago

Indeed I'm not, and the Gettext authors seem to follow a practice of cryptically terse manpages ;)

My attempt at an explanation: msgcat merges PO files and concatenates translations for identical message IDs. When concatenating (i.e. when there's a non-empty message string for a given ID in more than one input file), it also inserts "conflict markers" in the concatenated message string.

When passing --use-first, it will pick the first (non-empty, I think) translation for each message ID amongst all the inputs.
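
As a rough illustration of the behaviour described above, here is a toy model in Python using plain dicts. It is not how msgcat is implemented: `toy_msgcat`, its arguments and the exact layout of the concatenated string are assumptions made for this sketch, and the marker text only mimics the `#-#-#-#-#` lines visible in the dump earlier in this thread. It also assumes that identical translations do not count as a conflict.

```python
# Toy model of the msgcat behaviour described above (NOT the real tool).
# A "catalog" here is just a dict mapping msgid -> msgstr.

MARKER = "#-#-#-#-#  {name}  #-#-#-#-#\n"  # mimics the markers seen in the dump


def toy_msgcat(inputs, use_first=False):
    """inputs: list of (file_name, catalog) pairs, in command-line order."""
    seen = {}  # msgid -> list of (file_name, non-empty msgstr)
    for name, catalog in inputs:
        for msgid, msgstr in catalog.items():
            entries = seen.setdefault(msgid, [])
            if msgstr:  # empty translations never take part in a conflict
                entries.append((name, msgstr))

    merged = {}
    for msgid, entries in seen.items():
        distinct = {msgstr for _, msgstr in entries}
        if not entries:
            merged[msgid] = ""  # untranslated everywhere
        elif use_first or len(distinct) == 1:
            # --use-first: take the first non-empty translation.
            # Assumption: identical translations are not a conflict.
            merged[msgid] = entries[0][1]
        else:
            # Conflict: concatenate every translation behind a marker.
            merged[msgid] = "".join(
                MARKER.format(name=name) + msgstr + "\n" for name, msgstr in entries
            )
    return merged


local = {"a": "1", "b": "2.local", "c": ""}
remote = {"a": "1", "b": "2.remote", "c": "3"}

plain = toy_msgcat([("local", local), ("remote", remote)])
first = toy_msgcat([("local", local), ("remote", remote)], use_first=True)
print(plain["b"])  # both translations, each behind its marker
print(first["b"])  # the local translation only
```

With `use_first=True` the order of the inputs decides which side wins, which is why listing the local files first gives an "ours"-like result.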

rmoehn commented 9 years ago

Thanks for the explanation. I've experimented a bit and perhaps found out why ${TEMP}.local-only contains a message that was also changed on remote. You determine the local-only things with this line:

m_msgcat -o ${TEMP}.local-only  --unique ${TEMP}.local-changes  ${TEMP}.conflicts

However, --unique appears not to recognize definitions that are marked fuzzy. If I remove #, fuzzy, it works.
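
The observation can be pictured with a small Python model. This is only a sketch of the hypothesis stated above, namely that fuzzy definitions are not counted when --unique decides whether a message occurs more than once; it has not been checked against the msgcat source. `toy_unique` and the `count_fuzzy` switch are made up for this illustration, and the msgids are shortened.

```python
# Toy model of "keep only messages defined once" (NOT the real msgcat).
# A catalog maps msgid -> (msgstr, is_fuzzy).


def toy_unique(catalogs, count_fuzzy=True):
    counts = {}  # msgid -> number of definitions that were counted
    chosen = {}  # msgid -> msgstr to output
    for catalog in catalogs:
        for msgid, (msgstr, is_fuzzy) in catalog.items():
            counted = count_fuzzy or not is_fuzzy
            counts.setdefault(msgid, 0)
            if counted:
                counts[msgid] += 1
            # Prefer the first counted definition; otherwise keep the first seen.
            if msgid not in chosen or (counted and counts[msgid] == 1):
                chosen[msgid] = msgstr
    # Keep messages with fewer than two counted definitions.
    return {msgid: chosen[msgid] for msgid in chosen if counts[msgid] < 2}


local_changes = {
    "changed on local, unchanged on remote": ("6.local", False),
    "changed on remote and local": ("8.local", False),
}
conflicts = {
    # In the conflicts file this entry carries conflict markers and is fuzzy.
    "changed on remote and local": ("<conflict markers>", True),
}

expected = toy_unique([local_changes, conflicts], count_fuzzy=True)
observed = toy_unique([local_changes, conflicts], count_fuzzy=False)
print(sorted(expected))  # only the local-only message
print(sorted(observed))  # the conflicting message slips through as well
```

Under that hypothesis the conflicting message is seen only once, so it lands in the "local-only" output with its local translation, which matches the `8.local` entry in the dump above.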

rmoehn commented 9 years ago

I also tried your suggestion for the -Xours option. It doesn't quite work, because none of ${TEMP}.unchanged, ${TEMP}.local-changes and ${TEMP}.remote-changes contains a definition for "This little piggie is added on local and remote, with different values". This again is because extract_changes really only extracts changes and not additions. The definition for "This little piggie is added on local and remote, with different values" only comes in during the second merge using this msgmerge template thing. I'm not sure whether you intended it like that or not.

rmoehn commented 9 years ago

Messing around with msgcat and friends is all too confusing to me. I have now implemented the merge-favouring-ours in Python with polib and that was much easier.
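
The polib-based script itself is not shown in this thread. As a minimal sketch of what a three-way merge favouring ours can look like, the following uses plain dicts (msgid to msgstr) instead of polib, so it runs without any dependency. `merge_favouring_ours` and the policy for removed messages (a removal on our side also wins) are assumptions made for this illustration.

```python
# Minimal three-way merge of message catalogs, favouring "ours" on conflict.
# Catalogs are plain dicts msgid -> msgstr; a missing key means "not present".


def merge_favouring_ours(base, ours, theirs):
    merged = {}
    # Keep our order, then append messages that exist only on their side.
    msgids = list(ours) + [m for m in theirs if m not in ours]
    for msgid in msgids:
        b, o, t = base.get(msgid), ours.get(msgid), theirs.get(msgid)
        if o == t:
            result = o  # both sides agree
        elif o == b:
            result = t  # only theirs changed (None means removed)
        else:
            result = o  # ours changed, or both changed: ours wins
        if result is not None:
            merged[msgid] = result
    return merged


base = {"3": "3", "4": "4", "6": "6", "7": "7", "8": "8"}
ours = {"3": "3.local", "6": "6.local", "7": "7", "8": "8.local", "new": "n.local"}
theirs = {"4": "4", "6": "6", "7": "7.remote", "8": "8.remote", "new": "n.remote"}

result = merge_favouring_ours(base, ours, theirs)
for msgid, msgstr in result.items():
    print(msgid, "->", msgstr)
```

Messages added on both sides with different values need no special case here, because the comparison against a missing base entry already treats them as "both changed".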

Not closing this issue, because your shell script, although it might yield the correct results, still doesn't work as one would expect from reading it.